`fp_test`s do not pass with `gcc 11.4.1` · Issue #10 · microsoft/FourQlib · GitHub
Open
sthomas025 opened this issue Feb 14, 2024 · 0 comments

Am I missing something here? Below are the versions of gcc and clang, followed by the output of building with each compiler and running `./fp_test`.

[ec2-user@ip-172-31-28-24 FourQ_64bit_and_portable]$ gcc --version
gcc (GCC) 11.4.1 20230605 (Red Hat 11.4.1-2)
Copyright (C) 2021 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

[ec2-user@ip-172-31-28-24 FourQ_64bit_and_portable]$ clang --version
clang version 15.0.7 (Amazon Linux 15.0.7-3.amzn2023.0.1)
Target: x86_64-amazon-linux-gnu
Thread model: posix
InstalledDir: /usr/bin

This seems to be related to the version of gcc (see above for the problematic version) coupled with the optimization level: -O2 and -O3 trigger the failure, but lower levels do not. (I am able to get the tests below to pass with gcc 7.3.1, but I did not investigate which is the oldest version of gcc that fails.)

[ec2-user@ip-172-31-28-24 ~]$ git clone https://github.com/microsoft/FourQlib.git
Cloning into 'FourQlib'...
remote: Enumerating objects: 405, done.
remote: Counting objects: 100% (49/49), done.
remote: Compressing objects: 100% (45/45), done.
remote: Total 405 (delta 11), reused 19 (delta 4), pack-reused 356
Receiving objects: 100% (405/405), 1.27 MiB | 18.50 MiB/s, done.
Resolving deltas: 100% (216/216), done.
[ec2-user@ip-172-31-28-24 ~]$ cd FourQlib/FourQ_64bit_and_portable/
[ec2-user@ip-172-31-28-24 FourQ_64bit_and_portable]$ make ARCH=x64 CC=gcc
gcc -c -O3      -fwrapv -fomit-frame-pointer -march=native -mavx2 -D _AMD64_ -D __LINUX__ -D _AVX_ -D _AVX2_ -D _ASM_  -D USE_ENDO   tests/crypto_tests.c
gcc -c -O3      -fwrapv -fomit-frame-pointer -march=native -mavx2 -D _AMD64_ -D __LINUX__ -D _AVX_ -D _AVX2_ -D _ASM_  -D USE_ENDO   eccp2.c
gcc -c -O3      -fwrapv -fomit-frame-pointer -march=native -mavx2 -D _AMD64_ -D __LINUX__ -D _AVX_ -D _AVX2_ -D _ASM_  -D USE_ENDO   eccp2_no_endo.c
gcc -c -O3      -fwrapv -fomit-frame-pointer -march=native -mavx2 -D _AMD64_ -D __LINUX__ -D _AVX_ -D _AVX2_ -D _ASM_  -D USE_ENDO   eccp2_core.c
gcc -c -O3      -fwrapv -fomit-frame-pointer -march=native -mavx2 -D _AMD64_ -D __LINUX__ -D _AVX_ -D _AVX2_ -D _ASM_  -D USE_ENDO   -S -o AMD64/consts.s AMD64/consts.c
sed '/.globl/d' -i AMD64/consts.s
gcc -c -O3      -fwrapv -fomit-frame-pointer -march=native -mavx2 -D _AMD64_ -D __LINUX__ -D _AVX_ -D _AVX2_ -D _ASM_  -D USE_ENDO   -o fp2_1271_AVX2.o AMD64/fp2_1271_AVX2.S
gcc -c -O3      -fwrapv -fomit-frame-pointer -march=native -mavx2 -D _AMD64_ -D __LINUX__ -D _AVX_ -D _AVX2_ -D _ASM_  -D USE_ENDO   crypto_util.c
crypto_util.c: In function ‘decode’:
crypto_util.c:114:9: warning: ‘fp2neg1271’ accessing 32 bytes in a region of size 16 [-Wstringop-overflow=]
  114 |         fp2neg1271(P->x);
      |         ^~~~~~~~~~~~~~~~
crypto_util.c:114:9: note: referencing argument 1 of type ‘digit_t (*)[2]’ {aka ‘long unsigned int (*)[2]’}
In file included from crypto_util.c:9:
FourQ_internal.h:287:6: note: in a call to function ‘fp2neg1271’
  287 | void fp2neg1271(f2elm_t a);
      |      ^~~~~~~~~~
gcc -c -O3      -fwrapv -fomit-frame-pointer -march=native -mavx2 -D _AMD64_ -D __LINUX__ -D _AVX_ -D _AVX2_ -D _ASM_  -D USE_ENDO   schnorrq.c
gcc -c -O3      -fwrapv -fomit-frame-pointer -march=native -mavx2 -D _AMD64_ -D __LINUX__ -D _AVX_ -D _AVX2_ -D _ASM_  -D USE_ENDO   hash_to_curve.c
gcc -c -O3      -fwrapv -fomit-frame-pointer -march=native -mavx2 -D _AMD64_ -D __LINUX__ -D _AVX_ -D _AVX2_ -D _ASM_  -D USE_ENDO   kex.c
gcc -c -O3      -fwrapv -fomit-frame-pointer -march=native -mavx2 -D _AMD64_ -D __LINUX__ -D _AVX_ -D _AVX2_ -D _ASM_  -D USE_ENDO   ../sha512/sha512.c
gcc -c -O3      -fwrapv -fomit-frame-pointer -march=native -mavx2 -D _AMD64_ -D __LINUX__ -D _AVX_ -D _AVX2_ -D _ASM_  -D USE_ENDO   ../random/random.c
gcc -c -O3      -fwrapv -fomit-frame-pointer -march=native -mavx2 -D _AMD64_ -D __LINUX__ -D _AVX_ -D _AVX2_ -D _ASM_  -D USE_ENDO   tests/test_extras.c
gcc -o crypto_test crypto_tests.o eccp2.o eccp2_no_endo.o eccp2_core.o fp2_1271_AVX2.o crypto_util.o schnorrq.o hash_to_curve.o kex.o sha512.o random.o  test_extras.o  
gcc -c -O3      -fwrapv -fomit-frame-pointer -march=native -mavx2 -D _AMD64_ -D __LINUX__ -D _AVX_ -D _AVX2_ -D _ASM_  -D USE_ENDO   tests/ecc_tests.c
gcc -o ecc_test ecc_tests.o eccp2.o eccp2_no_endo.o eccp2_core.o fp2_1271_AVX2.o crypto_util.o schnorrq.o hash_to_curve.o kex.o sha512.o random.o  test_extras.o  
gcc -c -O3      -fwrapv -fomit-frame-pointer -march=native -mavx2 -D _AMD64_ -D __LINUX__ -D _AVX_ -D _AVX2_ -D _ASM_  -D USE_ENDO   tests/fp_tests.c
gcc -o fp_test fp_tests.o eccp2.o eccp2_no_endo.o eccp2_core.o fp2_1271_AVX2.o crypto_util.o schnorrq.o hash_to_curve.o kex.o sha512.o random.o  test_extras.o  
[ec2-user@ip-172-31-28-24 FourQ_64bit_and_portable]$ ./fp_test 

--------------------------------------------------------------------------------------------------------

Testing quadratic extension field arithmetic over GF((2^127-1)^2): 

  GF(p^2) multiplication tests .................................................................... PASSED
  GF(p^2) squaring tests........................................................................... PASSED
  GF(p^2) inversion tests... FAILED
[ec2-user@ip-172-31-28-24 FourQ_64bit_and_portable]$ make clean
rm -rf libFourQ.so crypto_test ecc_test fp_test *.o AMD64/consts.s
[ec2-user@ip-172-31-28-24 FourQ_64bit_and_portable]$ make ARCH=x64 CC=clang
clang -c -O3      -fwrapv -fomit-frame-pointer -march=native -mavx2 -D _AMD64_ -D __LINUX__ -D _AVX_ -D _AVX2_ -D _ASM_  -D USE_ENDO   tests/crypto_tests.c
clang -c -O3      -fwrapv -fomit-frame-pointer -march=native -mavx2 -D _AMD64_ -D __LINUX__ -D _AVX_ -D _AVX2_ -D _ASM_  -D USE_ENDO   eccp2.c
clang -c -O3      -fwrapv -fomit-frame-pointer -march=native -mavx2 -D _AMD64_ -D __LINUX__ -D _AVX_ -D _AVX2_ -D _ASM_  -D USE_ENDO   eccp2_no_endo.c
clang -c -O3      -fwrapv -fomit-frame-pointer -march=native -mavx2 -D _AMD64_ -D __LINUX__ -D _AVX_ -D _AVX2_ -D _ASM_  -D USE_ENDO   eccp2_core.c
clang -c -O3      -fwrapv -fomit-frame-pointer -march=native -mavx2 -D _AMD64_ -D __LINUX__ -D _AVX_ -D _AVX2_ -D _ASM_  -D USE_ENDO   -S -o AMD64/consts.s AMD64/consts.c
sed '/.globl/d' -i AMD64/consts.s
clang -c -O3      -fwrapv -fomit-frame-pointer -march=native -mavx2 -D _AMD64_ -D __LINUX__ -D _AVX_ -D _AVX2_ -D _ASM_  -D USE_ENDO   -o fp2_1271_AVX2.o AMD64/fp2_1271_AVX2.S
clang -c -O3      -fwrapv -fomit-frame-pointer -march=native -mavx2 -D _AMD64_ -D __LINUX__ -D _AVX_ -D _AVX2_ -D _ASM_  -D USE_ENDO   crypto_util.c
clang -c -O3      -fwrapv -fomit-frame-pointer -march=native -mavx2 -D _AMD64_ -D __LINUX__ -D _AVX_ -D _AVX2_ -D _ASM_  -D USE_ENDO   schnorrq.c
clang -c -O3      -fwrapv -fomit-frame-pointer -march=native -mavx2 -D _AMD64_ -D __LINUX__ -D _AVX_ -D _AVX2_ -D _ASM_  -D USE_ENDO   hash_to_curve.c
clang -c -O3      -fwrapv -fomit-frame-pointer -march=native -mavx2 -D _AMD64_ -D __LINUX__ -D _AVX_ -D _AVX2_ -D _ASM_  -D USE_ENDO   kex.c
clang -c -O3      -fwrapv -fomit-frame-pointer -march=native -mavx2 -D _AMD64_ -D __LINUX__ -D _AVX_ -D _AVX2_ -D _ASM_  -D USE_ENDO   ../sha512/sha512.c
clang -c -O3      -fwrapv -fomit-frame-pointer -march=native -mavx2 -D _AMD64_ -D __LINUX__ -D _AVX_ -D _AVX2_ -D _ASM_  -D USE_ENDO   ../random/random.c
clang -c -O3      -fwrapv -fomit-frame-pointer -march=native -mavx2 -D _AMD64_ -D __LINUX__ -D _AVX_ -D _AVX2_ -D _ASM_  -D USE_ENDO   tests/test_extras.c
clang -o crypto_test crypto_tests.o eccp2.o eccp2_no_endo.o eccp2_core.o fp2_1271_AVX2.o crypto_util.o schnorrq.o hash_to_curve.o kex.o sha512.o random.o  test_extras.o  
clang -c -O3      -fwrapv -fomit-frame-pointer -march=native -mavx2 -D _AMD64_ -D __LINUX__ -D _AVX_ -D _AVX2_ -D _ASM_  -D USE_ENDO   tests/ecc_tests.c
clang -o ecc_test ecc_tests.o eccp2.o eccp2_no_endo.o eccp2_core.o fp2_1271_AVX2.o crypto_util.o schnorrq.o hash_to_curve.o kex.o sha512.o random.o  test_extras.o  
clang -c -O3      -fwrapv -fomit-frame-pointer -march=native -mavx2 -D _AMD64_ -D __LINUX__ -D _AVX_ -D _AVX2_ -D _ASM_  -D USE_ENDO   tests/fp_tests.c
clang -o fp_test fp_tests.o eccp2.o eccp2_no_endo.o eccp2_core.o fp2_1271_AVX2.o crypto_util.o schnorrq.o hash_to_curve.o kex.o sha512.o random.o  test_extras.o  
[ec2-user@ip-172-31-28-24 FourQ_64bit_and_portable]$ ./fp_test 

--------------------------------------------------------------------------------------------------------

Testing quadratic extension field arithmetic over GF((2^127-1)^2): 

  GF(p^2) multiplication tests .................................................................... PASSED
  GF(p^2) squaring tests........................................................................... PASSED
  GF(p^2) inversion tests.......................................................................... PASSED
  Modular addition tests .......................................................................... PASSED
  Montgomery multiplication and conversion tests .................................................. PASSED
  Montgomery inversion tests....................................................................... PASSED

--------------------------------------------------------------------------------------------------------

Benchmarking quadratic extension field arithmetic over GF((2^127-1)^2): 

  GF(p^2) addition runs in ...............        7 cycles
  GF(p^2) subtraction runs in ............        7 cycles
  GF(p^2) squaring runs in ...............       20 cycles
  GF(p^2) multiplication runs in .........       29 cycles
  GF(p^2) inversion runs in ..............     2026 cycles
  Addition modulo the order runs in ......       19 cycles
  Subtraction modulo the order runs in ...       18 cycles
  Montgomery multiply mod order runs in ..      152 cycles
  Montgomery inversion mod order runs in .    50298 cycles
[ec2-user@ip-172-31-28-24 FourQ_64bit_and_portable]$ 
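
The failing step is the GF(p^2) inversion test. A compiler-independent reference can help when cross-checking values dumped from a failing build. The sketch below is not taken from FourQlib; it assumes the usual representation of GF((2^127-1)^2) as pairs `a + b*i` with `i^2 = -1`, and the helper names (`fp2_mul`, `fp2_inv`, `check_inversion`) are hypothetical. It only verifies the identity `x * x^-1 = 1` on random elements.

```python
# Reference arithmetic for GF(p^2) = GF(p)[i] / (i^2 + 1), p = 2^127 - 1.
# Assumption: elements are (a, b) tuples meaning a + b*i. Names are illustrative only.
import secrets

P = 2**127 - 1  # Mersenne prime used as the base field modulus


def fp2_mul(x, y):
    """Multiply (a + b*i) * (c + d*i) using i^2 = -1."""
    a, b = x
    c, d = y
    return ((a * c - b * d) % P, (a * d + b * c) % P)


def fp2_inv(x):
    """Invert a + b*i as (a - b*i) / (a^2 + b^2)."""
    a, b = x
    norm = (a * a + b * b) % P
    if norm == 0:
        # Since p = 3 (mod 4), the norm is zero only for the zero element.
        raise ZeroDivisionError("zero has no inverse in GF(p^2)")
    norm_inv = pow(norm, P - 2, P)  # Fermat's little theorem, P is prime
    return ((a * norm_inv) % P, (-b * norm_inv) % P)


def check_inversion(trials=1000):
    """Check x * x^-1 == 1 for random nonzero x; returns True if all pass."""
    one = (1, 0)
    for _ in range(trials):
        x = (secrets.randbelow(P), secrets.randbelow(P))
        if x == (0, 0):
            continue
        if fp2_mul(x, fp2_inv(x)) != one:
            return False
    return True


if __name__ == "__main__":
    print("inversion identity holds:", check_inversion())
```

To compare against the library, the operands and result of a failing inversion would have to be printed from the C test and converted to Python integers; that conversion depends on the library's internal limb layout and is not covered here.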