8000 Make benchmark compatible with dictionary compression by rkoradi · Pull Request #808 · lz4/lz4 · GitHub

Make benchmark compatible with dictionary compression #808


Merged — 1 commit merged into lz4:dev on Nov 6, 2019

Conversation

rkoradi (Contributor) commented Nov 4, 2019

Support the -D command line option for running benchmarks. The
benchmark code was slightly restructured to factor out the calls
that need to be different for each benchmark scenario. Since there
are now 4 scenarios (all combinations of fast/HC and with/without
dictionary), the logic was getting somewhat convoluted otherwise.

This was done by extending the compressionParameters struct that
previously contained just a single function pointer. It now
contains 4 function pointers for init/reset/compress/cleanup,
with the related state. The functions get a pointer to the
structure as their first argument (inspired by C++), so that they
can access the state values in the struct.

Cyan4973 (Member) commented Nov 4, 2019

Quick question: is there a difference in measured performance for "normal" compression scenarios before and after this patch?

rkoradi (Contributor, Author) commented Nov 4, 2019

> Is there a difference in measured performance for "normal" compression scenarios before and after this patch?

Yes, I do see differences. Even when using LZ4_attach_dictionary, compression is slower, particularly for smaller file sizes. Not surprisingly, decompression is quite a bit slower (also more pronounced for smaller files and larger dictionaries), since the dictionary is read for each iteration.

I can provide some numbers if you let me know what sizes/options should be tested. I tested with a 64K dictionary and a 450K file, and everything was roughly 5% slower.

Cyan4973 (Member) commented Nov 4, 2019

The question was more about the comparison of measured performance when not using a dictionary.

Cyan4973 (Member) commented Nov 4, 2019

The general concept of providing function pointers in a struct payload looks good to me.

Now, we mostly need to care about performance.

Also, the code should be C90 compatible; I guess the majority of compilation warnings come from this requirement.

rkoradi (Contributor, Author) commented Nov 4, 2019

Sorry, I misunderstood the question. The same functions as before are used for the non-dictionary case. So I wouldn't expect any differences there. Comparing the runs (for a 450K file again), there are minor differences that do appear repeatable. For level 3, compression is reported as typically about 1% slower than before the change. For level 1 it tends to be a tiny bit higher, though.

@rkoradi rkoradi force-pushed the benchmarkWithDictionary branch 2 times, most recently from 3d40233 to ee6d788 on November 4, 2019 23:11
rkoradi (Contributor, Author) commented Nov 4, 2019

Finally got all the CI builds running cleanly. Sorry about the multiple attempts.

Cyan4973 (Member) commented Nov 5, 2019

I can confirm a very small compression speed loss when comparing dev branch with your PR.
On my test system, it's < 1% at level 1, though well reproducible.

It's probably not a big blocker. But let's use a bit of time to understand why this happens.

@rkoradi rkoradi force-pushed the benchmarkWithDictionary branch from ee6d788 to 8d1a59e on November 6, 2019 07:33
@rkoradi rkoradi force-pushed the benchmarkWithDictionary branch from 8d1a59e to cc91777 on November 6, 2019 07:38
rkoradi (Contributor, Author) commented Nov 6, 2019

The updated version I just pushed runs at least as fast as the original version for me. It actually ran slightly faster in my test runs.

Based on experimentation, I found that calling LZ4_freeStream/LZ4_freeStreamHC unnecessarily, in the cases where no streams are used, correlated with the slowdown. Frankly, I have no idea why, because these functions do nothing when a NULL stream is passed in. With such a small difference, though, it's difficult to root-cause.

Cyan4973 (Member) commented Nov 6, 2019

Thanks @rkoradi ! Great work and great investigation !

@Cyan4973 Cyan4973 merged commit 6e0d2be into lz4:dev Nov 6, 2019
Cyan4973 added a commit that referenced this pull request Sep 15, 2022
benchmark dictionary mode, implemented in #808, is incorrect.
It would desynchronize compression and decompression as soon as 2+ files are dictionary compressed.

Also: slightly improved the traces system, to also include the __LINE__ number