BLAS options: OpenBLAS vs Accelerate #71712
Comments
Also, BLIS might be an option (flame/blis#492), although BLIS does not seem to have seen much use with PyTorch so far.
OpenBLAS and Accelerate should expose the same API, but I'm not aware of any good benchmark of one against the other. One should, however, be able to recompile PyTorch against different BLAS frameworks.
Yes, I can confirm this. For what it's worth, the default mechanism (i.e., no user preference in the BLAS flag) in CMake seems to go something like this: it looks for MKL first, then BLIS, then Accelerate, and, I think, finally OpenBLAS. I understand the importance of putting MKL first (it seems to outperform everything else in this context). However, I am slightly confused about OpenBLAS vis-à-vis Accelerate (and maybe BLIS) based on a few comments I gathered around here, mainly this one: #68812 (comment). Perhaps @IvanYashchuk can weigh in?
To expand: because PyTorch necessarily bundles the BLAS routines together with LAPACK, and the LAPACK routines included in Accelerate have been reported to be buggy/unreliable, the choice could make a real difference. We could potentially unbundle BLAS and LAPACK here, i.e., select BLAS from Accelerate while taking LAPACK from another provider where available. This would obviously be more work for a very niche optimization; hence I would first like to see a firm benchmark establishing that Accelerate is indeed worth it here, and then we can work on unbundling BLAS and LAPACK.
Yes, thanks @vadimkantorov. The default mechanism appears to favor BLIS over Accelerate, as shown above.
Here are the FindBLAS.cmake and FindLAPACK.cmake files that PyTorch uses. There's no way to specify the BLAS variant; CMake tries to find one in a specific order (specified in FindBLAS.cmake). It is possible, however, to compile PyTorch with Accelerate if CMake doesn't find anything with higher priority.
cc @robieta
Actually, there is a way (which is good, so thank you for the flexibility!): you can specify the BLAS variant at build time. See more here: conda-forge/pytorch-cpu-feedstock#84 (comment)
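For concreteness, here is a minimal sketch of driving such a build from Python. The `BLAS` values shown ("OpenBLAS", and "vecLib" for Accelerate) reflect my reading of PyTorch's CMake files and the feedstock discussion linked above, so verify them against your checkout:

```python
# Sketch only: selecting the BLAS backend when building PyTorch from source.
# Assumes you are inside a PyTorch checkout; the accepted BLAS values are an
# assumption to double-check against PyTorch's CMake files.
import os
import subprocess

env = dict(os.environ, BLAS="OpenBLAS")  # or BLAS="vecLib" for Accelerate
subprocess.run(["python", "setup.py", "develop"], env=env, check=True)
```

The same thing from a shell would simply be `BLAS=OpenBLAS python setup.py develop`.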
Good to know the reason, thanks!
Also, I am happy to run some benchmarks and tests if you can point me to meaningful ones for this particular case; something like the sketch below would already be a start. I already have both OpenBLAS-based and Accelerate-based PyTorch builds ready (and reproducible; I can also add BLIS and/or other BLAS libraries to the test matrix), and I am happy to help if there is interest in clarifying this further :) Note: I believe this whole question is moot outside of Apple Silicon Macs at the moment. MKL BLAS/LAPACK should still be used whenever available, imo, but as far as I can tell MKL is not available on Apple Silicon.
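For example (a minimal sketch; the matrix sizes and dtype are arbitrary picks of mine, not something prescribed in this thread):

```python
# Minimal GEMM benchmark sketch: square float64 matmuls at a few sizes.
# Run the identical script under each BLAS build and compare the results;
# keeping everything but the BLAS constant is the whole point.
import torch
from torch.utils.benchmark import Timer

for n in (256, 1024, 4096):
    a = torch.randn(n, n, dtype=torch.float64)
    b = torch.randn(n, n, dtype=torch.float64)
    timer = Timer(stmt="a @ b", globals={"a": a, "b": b})
    print(f"n={n}: {timer.blocked_autorange()}")
```

A real comparison would also want LAPACK-heavy ops (e.g. torch.linalg.svd, torch.linalg.solve), since that is where the Accelerate concerns above come from.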
Great that it was fixed. If I recall correctly, it previously affected only Caffe2 and not ATen (for example #60328).
Reference LAPACK requires a BLAS, and if LAPACK is built from source against the BLAS from Accelerate, then I'd guess there shouldn't be any problems; see the sketch below.
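As a hedged sketch of that build (USE_OPTIMIZED_BLAS / BLAS_LIBRARIES are my understanding of reference LAPACK's CMake options, and the Accelerate linker flag is an assumption to verify on your toolchain):

```python
# Sketch: configure reference LAPACK to link against Accelerate's BLAS.
# Assumes a checkout of Reference-LAPACK/lapack in ./lapack.
import subprocess

subprocess.run(
    ["cmake", "-S", "lapack", "-B", "build",
     "-DUSE_OPTIMIZED_BLAS=ON",                  # use an external BLAS
     "-DBLAS_LIBRARIES=-framework Accelerate"],  # assumed linker flag
    check=True,
)
subprocess.run(["cmake", "--build", "build", "--parallel"], check=True)
```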
@robieta any thoughts?
From my limited testing, there is little value in choosing one over the other; they end up performing rather similarly. Closing this. Thanks everyone for engaging, and good luck :)
🚀 The feature, motivation and pitch
Are there any benchmarks, or a preference among the developers here? Assuming that Intel macOS users should use MKL, and given that MKL isn't available for Apple Silicon: is there any benefit to using OpenBLAS over Accelerate? Any documentation or benchmarks? Thanks!
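For reference, one way to check which BLAS/LAPACK a given build actually linked (the exact contents of the config string vary by build, so treat this as a sketch):

```python
# Print compile-time configuration; the string normally names the BLAS backend.
import torch

print(torch.__config__.show())            # build options, including BLAS info
print(torch.backends.mkl.is_available())  # whether MKL is usable at runtime
```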
Alternatives
No response
Additional context
No response
cc @VitalyFedyunin @ngimel