8000 enhancement(deps): Use zlib-rs for much faster zlib decoding/encoding by JakubOnderka · Pull Request #22533 · vectordotdev/vector · GitHub
[go: up one dir, main page]
More Web Proxy on the site http://driver.im/
Skip to content

enhancement(deps): Use zlib-rs for much faster zlib decoding/encoding #22533

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 11 commits into from
Apr 8, 2025

Conversation

JakubOnderka
Copy link
Contributor
@JakubOnderka JakubOnderka commented Feb 27, 2025

Summary

zlib-rs is much faster for library for zlib decoding/encoding than default miniz_oxide and it is even faster than zlib-ng (https://trifectatech.org/blog/zlib-rs-is-faster-than-c/)

Change Type

  • Bug fix
  • New feature
  • Non-functional (chore, refactoring, docs)
  • Performance

Is this a breaking change?

  • Yes
  • No

How did you test this PR?

Standard tests

Does this PR include user facing changes?

  • Yes. Please add a changelog fragment based on our guidelines.
  • No. A maintainer will apply the "no-changelog" label to this PR.

Checklist

  • Please read our Vector contributor resources.
    • make check-all is a good command to run locally. This check is
      defined here. Some of these
      checks might not be relevant to your PR. For Rust changes, at the very least you should run:
      • cargo fmt --all
      • cargo clippy --workspace --all-targets -- -D warnings
      • cargo nextest run --workspace (alternatively, you can run cargo test --all)
  • If this PR introduces changes Vector dependencies (modifies Cargo.lock), please
    run dd-rust-license-tool write to regenerate the license inventory and commit the changes (if any). More details here.

References

vectordotdev/vrl#1301

@JakubOnderka JakubOnderka requested a review from a team as a code owner February 27, 2025 15:21
@JakubOnderka
Copy link
Contributor Author

dd-rust-license-tool write is failing:

dd-rust-license-tool write
Error: Could not read "/Users/xxx/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/allocator-api2-0.2.16/license"

Caused by:
    Is a directory (os error 21)

Copy link
Member
@pront pront left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/ci-run-all

@jorgehermo9
Copy link
Contributor

I think adding a changelog to promote this would be cool! These days there is a lot of hype around this https://www.reddit.com/r/rust/comments/1ixt1ei/zlibrs_is_faster_than_c_trifecta_tech_foundation/

@JakubOnderka JakubOnderka force-pushed the zlib-rs branch 2 times, most recently from 69c0ca7 to dbc7c24 Compare March 5, 2025 13:00
@JakubOnderka
Copy link
Contributor Author

Rebased to master.

@pront pront changed the title chore: Use zlib-rs for much faster zlib decoding/encoding performance(deps): Use zlib-rs for much faster zlib decoding/encoding Mar 5, 2025
@pront pront changed the title performance(deps): Use zlib-rs for much faster zlib decoding/encoding enhancement(deps): Use zlib-rs for much faster zlib decoding/encoding Mar 5, 2025
@pront
Copy link
Member
pront commented Mar 13, 2025

Test failures: https://github.com/vectordotdev/vector/actions/runs/13676581269/job/38244029801?pr=22533

Does this indicate that this change breaks existing behavior?

@JakubOnderka
Copy link
Contributor Author
JakubOnderka commented Mar 13, 2025

flate2 offers 4 different backends. Vector now uses default miniz_oxide. So I checked how other backends decode testing bitstream [0x78, 0x9c, 0x12, 0x34, 0x56], that should be invalid:

  • miniz_oxide: error Custom { kind: Other, error: DecompressError(General { msg: None }) }
  • zlib: returns decompressed bytes[0x11, 0x33]
  • zlib-ng: returns decompressed bytes[0x11, 0x33]
  • zlib-rs: returns decompressed bytes[0x11, 0x33]
  • cloudflare_zlib: returns decompressed bytes[0x11, 0x33]

So only miniz_oxide returns error for that bitstream. For me it looks like that miniz_oxide behavior is non standard.

@jorgehermo9
Copy link
Contributor
jorgehermo9 commented Mar 13, 2025

I used 0x12 0x34 in that test because I thought it wasn't a valid compressed payload. If it is valid, maybe we should change it to an invalid payload (following the zlib rfc, maybe?)

@JakubOnderka
Copy link
Contributor Author

I used 0x12 0x34 in that test because I thought it wasn't a valid compressed payload. If it is valid, maybe we should change it to a invalid payload (following the zlib rfc, maybe?)

I changed testing sequence, so now it fails for all flate2 backends.

@jorgehermo9
Copy link
Contributor
jorgehermo9 commented Mar 16, 2025

Thank you @JakubOnderka, I opened Frommi/miniz_oxide#169 (comment) based on your findings.

@jorgehermo9
Copy link
Contributor

I think it would be interesting to trigger benchmarks for this PR @pront. Although I don't know if current benchmarks do use zlib decompression

@pront
Copy link
Member
pront commented Mar 17, 2025

Hello, we will need to fix the failure here: https://github.com/vectordotdev/vector/actions/runs/13676581269/job/38244029801?pr=22533

Also, let's add a changelog to call out this change.

@jorgehermo9
Copy link
Contributor
jorgehermo9 commented Mar 17, 2025

Hello, we will need to fix the failure here: https://github.com/vectordotdev/vector/actions/runs/13676581269/job/38244029801?pr=22533

I think @JakubOnderka fixed it in cecb9c2. I seems that miniz_oxide is decompressing the previous payload differently from other zlib backends, but seems to to be a bug in either miniz_oxide or in the others flate2 backends, I'm not sure Frommi/miniz_oxide#169 (comment) depicts more details about this

@pront
Copy link
Member
pront commented Mar 17, 2025

Hello, we will need to fix the failure here: https://github.com/vectordotdev/vector/actions/runs/13676581269/job/38244029801?pr=22533

I think @JakubOnderka fixed it in cecb9c2. I seems that miniz_oxide is decompressing the previous payload differently from other zlib backends, but seems to to be a bug in either miniz_oxide or in the others flate2 backends, I'm not sure Frommi/miniz_oxide#169 (comment) depicts more details about this

thank you for following up with them @jorgehermo9!

@jorgehermo9
Copy link
Contributor

Update: Frommi/miniz_oxide#169 (comment)

It was a bug in miniz_oxide, glad we discovered it while doing this!

Copy link
Member
@pront pront left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 LGTM. There is a conflcit with origin/master though.

@JakubOnderka
Copy link
Contributor Author

👍 LGTM. There is a conflcit with origin/master though.

Rebased to current master branch.

@pront
Copy link
Member
pront commented Mar 17, 2025

Let's also add a changelog fragment since this is not a small change 👍

@JakubOnderka
Copy link
Contributor Author

Let's also add a changelog fragment since this is not a small change 👍

Changelog added.

@pront pront enabled auto-merge March 18, 2025 14:42
@JakubOnderka
Copy link
Contributor Author

Tests are still failing, I will need to check deeper where is the problem.

@JakubOnderka
Copy link
Contributor Author
JakubOnderka commented Apr 1, 2025

It looks like that this bug will be fixed with zlib-rs 0.5 that was released today, but it was still not included in flate2. So I think the best aproach will be to revert this and wait until flate2 is updated and new version is released.

@pront
Copy link
Member
pront commented Apr 1, 2025

It looks like that this bug will be fixed with zlib-rs 0.5 that was released today, but it was still not included in flate2. So I think the best aproach will be to revert this and wait until flate2 is updated and new version is released.

Agreed. Reverting: vectordotdev/vrl#1366

@JakubOnderka
Copy link
Contributor Author

@pront New flate2 was released today, so I also created new pull request for vrl: vectordotdev/vrl#1368

@pront
Copy link
Member
pront commented Apr 2, 2025

@pront New flate2 was released today, so I also created new pull request for vrl: vectordotdev/vrl#1368

Can you point the VRL dependency (in this PR) to your VRL branch? I want to run the windows tests to see if the failure went away.

@JakubOnderka
Copy link
Contributor Author

Can you point the VRL dependency (in this PR) to your VRL branch? I want to run the windows tests to see if the failure went away.

Done, just do not forget to remote last commit before merging :)

Copy link
Member
@pront pront left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/ci-run-unit-windows

@pront pront mentioned this pull request Apr 7, 2025
@pront
Copy link
Member
pront commented Apr 7, 2025

/ci-run-unit-windows

Doesn't work on forks for security reasons. Check is running here:
https://github.com/vectordotdev/vector/actions/runs/14312253986

Also, this PR needs to update the licenses

@pront pront enabled auto-merge April 8, 2025 17:32
@pront
Copy link
Member
pront commented Apr 8, 2025

Attempting to merge 🤞

@pront pront added this pull request to the merge queue Apr 8, 2025
Merged via the queue into vectordotdev:master with commit cf9185f Apr 8, 2025
42 checks passed
pront added a commit to gllb/vector that referenced this pull request Apr 22, 2025
…vectordotdev#22533)

* chore: Use zlib-rs for much faster zlib decoding/encoding

* fix spellchecking

* rename

* dd-rust-license-tool write

* chore: Bump flate2 to 1.1.1 to fix linking vector on windows

* test: Change vrl to zlib-rs branch to check building on windows

* Revert "test: Change vrl to zlib-rs branch to check building on windows"

This reverts commit 446dbc3.

* cargo update -p vrl

* updated licenses

* fix bad rebase

---------

Co-authored-by: Pavlos Rontidis <pavlos.rontidis@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants
0