Benchmarks are compiled with -O2 and debug options #407
Comments
Others should be able to say more than I will, but one point is that -O3 is not always beneficial (it can actually decrease performance). If SIMD and loop unrolling are already reasonably well exploited in the code, it is not clear to me what gain to expect from -O3 (did you have anything specific in mind?).
I noticed low performance of benchmark-fgemm on unbalanced dimensions for the matrix product, which raises a few questions for me. I can easily answer the first two and will provide information tomorrow. EDIT: A few runs of OpenBLAS confirmed that the problem described above comes from OpenBLAS performance and not from FFLAS performance.
On my side, using AMD-BLIS (https://github.com/amd/blis), there is also some slowdown when going towards unbalanced dimensions, but nothing that looks too surprising to me. For example:
This issue simply asks for a clarification (I cannot add a label like "documentation issue" myself).
In the benchmarks' Makefile, one can find:
Is there a reason to compile with -O2 and not with -O3? Are the -DNDEBUG and -UDEBUG flags necessary? Could they hinder performance?
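For context on the two macro flags, here is a minimal C++ sketch of what they control at the language level. It is not taken from the project: the helper names (asserts_disabled, debug_macro_defined, count_assert_evaluations) are hypothetical, and whether the library itself tests a DEBUG macro is an assumption made only for illustration. What is standard behaviour is that -DNDEBUG turns assert() into a no-op, so the asserted expression is never evaluated, and that -UDEBUG only guarantees the DEBUG macro is not defined on the command line.

```cpp
#include <cassert>

// True when assert() is compiled out, i.e. NDEBUG was defined
// (for example with -DNDEBUG) before <cassert> was included.
constexpr bool asserts_disabled() {
#ifdef NDEBUG
    return true;
#else
    return false;
#endif
}

// True when a DEBUG macro is defined (for example with -DDEBUG);
// false when it is absent or was removed with -UDEBUG.
// That the benchmarked code checks this macro is an assumption.
constexpr bool debug_macro_defined() {
#ifdef DEBUG
    return true;
#else
    return false;
#endif
}

// Counts how many times the expression inside assert() is evaluated:
// 1 in a build without NDEBUG, 0 when NDEBUG is defined.
int count_assert_evaluations() {
    int evaluations = 0;
    assert((++evaluations, true));
    return evaluations;
}
```

Compiling a small driver that prints these three values once with -DNDEBUG -UDEBUG and once without those flags shows the difference. Under that reading, the two flags remove checks rather than add them, so by themselves they would be expected to help performance rather than hinder it.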