Support AVX2 dynamic dispatch · Issue #3335 · facebook/zstd · GitHub

Support AVX2 dynamic dispatch #3335


Open
embg opened this issue Dec 8, 2022 · 4 comments

Contributor
embg commented Dec 8, 2022

We currently detect BMI2 instructions at runtime, but users can only benefit from AVX2 if they compile with -march=haswell. It would be nice to provide AVX2 support to users who are compiling with default options.

This issue is motivated specifically by this loop in ZSTD_copyCDictTableIntoCCtx() which was added as part of my short cache PR. Overall extDict compression speed at level 1 is 2-3% slower if that loop is compiled to SSE2 instructions vs AVX2 instructions.

There may be other functions which can be tagged for AVX2 dispatch in the future. I expect this issue would be closed after tagging ZSTD_copyCDictTableIntoCCtx(), and we can tag additional functions gradually.
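As a rough illustration of what tagging a function for AVX2 dispatch could look like, here is a minimal sketch assuming GCC or Clang on x86. Every `sketch_*` name is hypothetical, the shift-and-copy loop is only a stand-in and not the actual body of `ZSTD_copyCDictTableIntoCCtx()`, and `__builtin_cpu_supports("avx2")` stands in for whatever runtime check zstd would actually use.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: per-function target attributes need GCC or Clang on x86. */
#if (defined(__x86_64__) || defined(__i386__)) && (defined(__GNUC__) || defined(__clang__))
#  define SKETCH_X86_DISPATCH 1
#  define SKETCH_FORCE_INLINE static inline __attribute__((always_inline))
#else
#  define SKETCH_X86_DISPATCH 0
#  define SKETCH_FORCE_INLINE static inline
#endif

/* Shared loop body. It is force-inlined into each wrapper below, so the
 * compiler can vectorize it with whatever instruction set that wrapper
 * is allowed to use. `shift` must be less than 32. */
SKETCH_FORCE_INLINE
void sketch_copy_body(uint32_t* dst, const uint32_t* src, size_t n, unsigned shift)
{
    size_t i;
    for (i = 0; i < n; i++) {
        dst[i] = src[i] >> shift;
    }
}

/* Baseline variant: compiled with the default flags (SSE2 on x86-64). */
static void sketch_copy_default(uint32_t* dst, const uint32_t* src, size_t n, unsigned shift)
{
    sketch_copy_body(dst, src, n, shift);
}

#if SKETCH_X86_DISPATCH
/* AVX2 variant: same body, but this one function may use AVX2
 * even when the rest of the file is built without -mavx2. */
__attribute__((target("avx2")))
static void sketch_copy_avx2(uint32_t* dst, const uint32_t* src, size_t n, unsigned shift)
{
    sketch_copy_body(dst, src, n, shift);
}
#endif

/* Dispatcher: picks a variant at runtime. */
void sketch_copy_table(uint32_t* dst, const uint32_t* src, size_t n, unsigned shift)
{
#if SKETCH_X86_DISPATCH
    if (__builtin_cpu_supports("avx2")) {
        sketch_copy_avx2(dst, src, n, shift);
        return;
    }
#endif
    sketch_copy_default(dst, src, n, shift);
}
```

Both variants produce identical output; only the instructions the compiler may emit differ. In a real implementation the detection result would presumably be computed once and cached rather than queried on every call.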

embg self-assigned this Dec 8, 2022
Contributor Author
embg commented Dec 8, 2022

I already researched how we can safely detect AVX2 at runtime: https://stackoverflow.com/questions/72522885/are-the-xgetbv-and-cpuid-checks-sufficient-to-guarantee-avx2-support
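For readers unfamiliar with the checks named in that link, the commonly documented sequence is: confirm via CPUID that the OS uses XSAVE and that the CPU has AVX, confirm via XGETBV that the OS actually saves the YMM registers, then read the AVX2 bit from CPUID leaf 7. Below is a minimal sketch of that sequence, assuming GCC or Clang with `<cpuid.h>`. The `sketch_*` names are hypothetical and this is not zstd's implementation.

```c
#include <stdint.h>

#if (defined(__x86_64__) || defined(__i386__)) && (defined(__GNUC__) || defined(__clang__))
#include <cpuid.h>

/* Read extended control register 0 (XCR0) with XGETBV.
 * Only valid to execute once CPUID has reported OSXSAVE. */
static uint64_t sketch_read_xcr0(void)
{
    uint32_t lo, hi;
    __asm__ volatile ("xgetbv" : "=a"(lo), "=d"(hi) : "c"(0));
    return ((uint64_t)hi << 32) | lo;
}

/* Returns 1 if AVX2 is usable (CPU support and OS support), else 0. */
int sketch_has_avx2(void)
{
    unsigned eax, ebx, ecx, edx;

    /* Leaf 1: ECX bit 27 = OSXSAVE, ECX bit 28 = AVX. */
    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx)) return 0;
    if (!(ecx & (1u << 27))) return 0;
    if (!(ecx & (1u << 28))) return 0;

    /* XCR0 bit 1 = SSE (XMM) state, bit 2 = AVX (YMM) state.
     * Both must be enabled by the OS. */
    if ((sketch_read_xcr0() & 0x6) != 0x6) return 0;

    /* Leaf 7, subleaf 0: EBX bit 5 = AVX2. */
    if (!__get_cpuid_count(7, 0, &eax, &ebx, &ecx, &edx)) return 0;
    return (ebx & (1u << 5)) != 0;
}

#else

/* Non-x86 or unsupported compiler: report no AVX2. */
int sketch_has_avx2(void) { return 0; }

#endif
```

The order matters: XGETBV faults on CPUs or operating systems that do not report OSXSAVE, so the CPUID leaf 1 check has to come first.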

ms178 commented Mar 14, 2023

@ValZapod At least on Linux, x86-64 feature levels have become a thing. Some distributions, such as CachyOS, already offer repositories compiled for x86-64-v3, which is very close to -march=haswell.
