8000 Make benchmark compatible with dictionary compression by rkoradi · Pull Request #808 · lz4/lz4 · GitHub

Make benchmark compatible with dictionary compression #808


Merged — 1 commit merged into lz4:dev on Nov 6, 2019

Conversation

rkoradi (Contributor) commented Nov 4, 2019

Support the -D command line option for running benchmarks. The
benchmark code was slightly restructured to factor out the calls
that need to be different for each benchmark scenario. Since there
are now 4 scenarios (all combinations of fast/HC and with/without
dictionary), the logic was getting somewhat convoluted otherwise.

This was done by extending the compressionParameters struct that
previously contained just a single function pointer. It now
contains 4 function pointers for init/reset/compress/cleanup,
with the related state. The functions get a pointer to the
structure as their first argument (inspired by C++), so that they
can access the state values in the struct.

Cyan4973 (Member) commented Nov 4, 2019

Quick question: is there a difference in measured performance for "normal" compression scenarios before and after this patch?

rkoradi (Contributor, Author) commented Nov 4, 2019

> Is there a difference in measured performance for "normal" compression scenarios before and after this patch?

Yes, I do see differences. Even when using LZ4_attach_dictionary, compression is slower, particularly for smaller file sizes. Not surprisingly, decompression is quite a bit slower (also more pronounced for smaller files and larger dictionaries), since the dictionary is read for each iteration.

I can provide some numbers if you let me know what sizes/options should be tested. I tested with a 64K dictionary and a 450K file, and everything was roughly 5% slower.

Cyan4973 (Member) commented Nov 4, 2019

The question was more about the comparison of measured performance when not using a dictionary.

Cyan4973 (Member) commented Nov 4, 2019

The general concept of providing function pointers in a struct payload looks good to me.

Now, we mostly need to care about performance.

Also, the code should be C90 compatible; I guess the majority of compilation warnings come from this requirement.

rkoradi (Contributor, Author) commented Nov 4, 2019

Sorry, I misunderstood the question. The same functions as before are used for the non-dictionary case. So I wouldn't expect any differences there. Comparing the runs (for a 450K file again), there are minor differences that do appear repeatable. For level 3, compression is reported as typically about 1% slower than before the change. For level 1 it tends to be a tiny bit higher, though.

@rkoradi rkoradi force-pushed the benchmarkWithDictionary branch 2 times, most recently from 3d40233 to ee6d788 on November 4, 2019 23:11
rkoradi (Contributor, Author) commented Nov 4, 2019

Finally got all the CI builds running cleanly. Sorry about the multiple attempts.

Cyan4973 (Member) commented Nov 5, 2019

I can confirm a very small compression speed loss when comparing dev branch with your PR.
On my test system, it's < 1% at level 1, though well reproducible.

It's probably not a big blocker. But let's use a bit of time to understand why this happens.

@rkoradi rkoradi force-pushed the benchmarkWithDictionary branch from ee6d788 to 8d1a59e on November 6, 2019 07:33
@rkoradi rkoradi force-pushed the benchmarkWithDictionary branch from 8d1a59e to cc91777 on November 6, 2019 07:38
rkoradi (Contributor, Author) commented Nov 6, 2019

The updated version I just pushed runs at least as fast as the original version for me. It actually ran slightly faster in my test runs.

Based on experimentation, I found that calling LZ4_freeStream/LZ4_freeStreamHC unnecessarily, in the cases where no streams are used, correlated with the slowdown. Frankly, I have no idea why, because these functions do nothing when a NULL stream is passed in. With such a small difference, though, it's difficult to root-cause.

Cyan4973 (Member) commented Nov 6, 2019

Thanks @rkoradi ! Great work and great investigation !

@Cyan4973 Cyan4973 merged commit 6e0d2be into lz4:dev Nov 6, 2019
Cyan4973 added a commit that referenced this pull request Sep 15, 2022
benchmark dictionary mode, implemented in #808, is incorrect.
It would desynchronize compression and decompression as soon as 2+ files are dictionary compressed.

Also: slightly improved the traces system, to also include the __LINE__ number