Make benchmark compatible with dictionary compression #808
Conversation
Quick question:
Yes, I do see differences. Even with using LZ4_attach_dictionary, compression is slower, particularly for smaller file sizes. Not surprisingly, decompression is quite a bit slower (also more pronounced for smaller files and larger dictionaries), since the dictionary is read for each iteration. I can provide some numbers if you let me know what sizes/options should be tested. I tested with a 64K dictionary and a 450K file, and everything was roughly 5% slower.
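For context, the dictionary-enabled path in the benchmark loop looks roughly like this (a minimal sketch with illustrative names, not the PR's exact benchmark code): the dictionary is digested once into its own stream, and each iteration pays only for a cheap reset plus attach before compressing.

```c
#include "lz4.h"

/* Sketch only: compress `src` against a preloaded dictionary using
 * LZ4_attach_dictionary. Function and variable names are illustrative. */
int bench_compress_with_dict(const char* dict, int dictSize,
                             const char* src, char* dst,
                             int srcSize, int dstCapacity)
{
    LZ4_stream_t* const dictStream = LZ4_createStream();
    LZ4_stream_t* const workStream = LZ4_createStream();
    int cSize;

    /* One-time setup: digest the dictionary into its own stream. */
    LZ4_loadDict(dictStream, dict, dictSize);

    /* Per-iteration work, as the benchmark would repeat it:
     * cheap reset, attach the prepared dictionary, compress. */
    LZ4_resetStream_fast(workStream);
    LZ4_attach_dictionary(workStream, dictStream);
    cSize = LZ4_compress_fast_continue(workStream, src, dst,
                                       srcSize, dstCapacity, 1);

    LZ4_freeStream(workStream);
    LZ4_freeStream(dictStream);
    return cSize;
}
```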
The question was more about comparing measured performance before and after this change.
The general concept of providing function pointers in a struct seems fine. Now, we mostly need to care for performance. Also, the code should be ...
Sorry, I misunderstood the question. The same functions as before are used for the non-dictionary case, so I wouldn't expect any differences there. Comparing the runs (for a 450K file again), there are minor differences that do appear repeatable. For level 3, compression is reported as typically about 1% slower than before the change. For level 1, the difference tends to be a tiny bit larger, though.
Force-pushed from 3d40233 to ee6d788
Finally got all the CI builds running cleanly. Sorry about the multiple attempts.
I can confirm a very small compression speed loss in my own comparison. It's probably not a big blocker. But let's use a bit of time to understand why this happens.
Force-pushed from ee6d788 to 8d1a59e
Support the -D command line option for running benchmarks. The benchmark code was slightly restructured to factor out the calls that need to be different for each benchmark scenario. Since there are now 4 scenarios (all combinations of fast/HC and with/without dictionary), the logic was getting somewhat convoluted otherwise. This was done by extending the compressionParameters struct that previously contained just a single function pointer. It now contains 4 function pointers for init/reset/compress/cleanup, with the related state. The functions get a pointer to the structure as their first argument (inspired by C++), so that they can access the state values in the struct.
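The described layout can be pictured like this (a hypothetical sketch with illustrative field and function names; the actual definitions in bench.c may differ):

```c
/* Sketch of the extended compressionParameters struct. Each benchmark
 * scenario installs its own four function pointers at setup time; the
 * benchmark loop then calls through them without branching. */
struct compressionParameters
{
    int cLevel;                      /* compression level          */
    const char* dictBuf;             /* dictionary buffer, or NULL */
    int dictSize;

    LZ4_stream_t* LZ4_stream;        /* state for fast mode, if any */
    LZ4_streamHC_t* LZ4_streamHC;    /* state for HC mode, if any   */

    /* The struct pointer is passed as the first argument, so each
     * function can reach the state it needs (C++-style "this"). */
    void (*initFunction)(struct compressionParameters* pThis);
    void (*resetFunction)(const struct compressionParameters* pThis);
    int (*blockFunction)(const struct compressionParameters* pThis,
                         const char* src, char* dst,
                         int srcSize, int dstSize);
    void (*cleanupFunction)(const struct compressionParameters* pThis);
};
```

With this shape, the benchmark loop stays identical for all four scenarios; only the pointers installed at setup time differ.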
Force-pushed from 8d1a59e to cc91777
The updated version I just pushed runs at least as fast as the original version for me; it actually ran slightly faster in my test runs. Based on experimentation, I found that unnecessarily calling LZ4_freeStream/LZ4_freeStreamHC in the cases where no streams are used correlated with the slowdown. Frankly, I have no idea why, because these functions do nothing when a NULL stream is passed in. But with such a small difference, it's hard to root-cause.
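The fix can be pictured as giving each scenario its own cleanup function, so the non-streaming paths never touch the stream API at all (a sketch under the struct layout assumed above, not the exact patch):

```c
/* Sketch only: non-streaming scenarios get a no-op cleanup, instead of
 * unconditionally calling LZ4_freeStream(NULL)/LZ4_freeStreamHC(NULL). */
static void LZ4_cleanupNoStream(const struct compressionParameters* pThis)
{
    (void)pThis;  /* nothing was allocated, nothing to free */
}

static void LZ4_cleanupStream(const struct compressionParameters* pThis)
{
    LZ4_freeStream(pThis->LZ4_stream);  /* fast-streaming case only */
}
```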
Thanks @rkoradi! Great work and great investigation!
The benchmark dictionary mode implemented in #808 is incorrect: it would desynchronize compression and decompression as soon as 2+ files are dictionary-compressed. Also: slightly improved the trace system, to also include the __LINE__ number.