8000 Add Valgrind-based detection of variable-latency instructions by mkannwischer · Pull Request #693 · pq-code-package/mlkem-native · GitHub
[go: up one dir, main page]
More Web Proxy on the site http://driver.im/
Skip to content

Add Valgrind-based detection of variable-latency instructions #693

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Jan 24, 2025

Conversation

mkannwischer
Copy link
Contributor
@mkannwischer mkannwischer commented Jan 24, 2025

This commit builds on top of
#687
which added Valgrind-based constant-time tests detecting
secret-dependent memory accesses and conditional branches.

This commit extends this to also detect secret-dependent
divisions such as those that lead to the KyberSlash
vulnerability.
See https://kyberslash.cr.yp.to/.
We test for that by using the patch to Valgrind proposed
in the KyberSlash paper (see same link).
It adds an option --variable-latency-errors=no|yes to valgrind
which defaults to no. If you set it to yes, it will complain if
you use undefined in division instructions.

This patch has also been sent to the valgrind developers, but was
so far not merged. Luckily our nix setup makes it very easy to
add a patched valgrind and cache it in the Github cache.

Initially this detected some divisions in the scalar_decompress_x
functions (divisions by powers of two) when compiling with -O0.
Luckily these functions only operate on public ciphertexts
(the ct that comes out of encapsulation, not the ct that that
comes out of re-encryption).
I verified this is the case by explicitly declassifying
the ciphertext that comes out of encapsulation which confirmed
that there are no further divisions.

While this is fine (this ciphertext is public!), it's rather
unnecessary as we really don't need divisions there.
I instead opted to replace those / divisions in the code
by shifts (which is what the compilers do with anything
from -O1 anyway).

The nix setup may be of independent interest as it makes it very easy
to use the patched valgrind (and also a number of different compilers).
You can simply run

nix develop .#ci_valgrind-varlat_gcc14

and you'll magically end up in a shell with a patched valgrind
and gcc14.
Currently we have

ci_valgrind-varlat_clang14
ci_valgrind-varlat_clang15
ci_valgrind-varlat_clang16
ci_valgrind-varlat_clang17
ci_valgrind-varlat_clang18
ci_valgrind-varlat_clang19
ci_valgrind-varlat_gcc48
ci_valgrind-varlat_gcc49
ci_valgrind-varlat_gcc7
ci_valgrind-varlat_gcc11
ci_valgrind-varlat_gcc12
ci_valgrind-varlat_gcc13
ci_valgrind-varlat_gcc14

This commit also adds new workflows to the CI using those shells
and testing our library with the following flags:
{-Oz,-Os,-O3,-Ofast,-O3 -ffastmath,-O2,-O1,-O0}.

@mkannwischer mkannwischer changed the title Add Valgrind-based detection of variable-latency instruction Add Valgrind-based detection of variable-latency instructions Jan 24, 2025
@mkannwischer mkannwischer force-pushed the kyberslash branch 7 times, most recently from b325192 to bfd3e2a Compare January 24, 2025 08:17
@mkannwischer
Copy link
Contributor Author

I temporarily added one of the KyberSlash2 vulnerabilities in bfd3e2a to test this this is actually working as intended.
The test run can be seen here: https://github.com/pq-code-package/mlkem-native/actions/runs/12945822957/job/36109126856?pr=693

It looks something like this:

ERROR > Functional Test    ML-KEM-1024 (native opt):     ==24307== Memcheck, a memory error detector
  ==24307== Copyright (C) 2002-2022, and GNU GPL'd, by Julian Seward et al.
  ==24307== Using Valgrind-3.22.0 and LibVEX; rerun with -h for copyright info
  ==24307== Command: test/build/mlkem1024/bin/test_mlkem1024
  ==24307== 
  ==24307== Variable-latency instruction operand of size 8 is secret/uninitialised
  ==24307==    at 0x40AD2C: PQCP_MLKEM_NATIVE_MLKEM1024_poly_compress_d11 (in /home/runner/work/mlkem-native/mlkem-native/test/build/mlkem1024/bin/test_mlkem1024)
  ==24307==    by 0x402194: PQCP_MLKEM_NATIVE_MLKEM1024_polyvec_compress_du (in /home/runner/work/mlkem-native/mlkem-native/test/build/mlkem1024/bin/test_mlkem1024)
  ==24307==    by 0x40B659: PQCP_MLKEM_NATIVE_MLKEM1024_indcpa_enc (in /home/runner/work/mlkem-native/mlkem-native/test/build/mlkem1024/bin/test_mlkem1024)
  ==24307==    by 0x401EC9: PQCP_MLKEM_NATIVE_MLKEM1024_enc_derand (in /home/runner/work/mlkem-native/mlkem-native/test/build/mlkem1024/bin/test_mlkem1024)
  ==24307==    by 0x401F2F: PQCP_MLKEM_NATIVE_MLKEM1024_enc (in /home/runner/work/mlkem-native/mlkem-native/test/build/mlkem1024/bin/test_mlkem1024)
  ==24307==    by 0x401284: test_keys (in /home/runner/work/mlkem-native/mlkem-native/test/build/mlkem1024/bin/test_mlkem1024)
  ==24307==    by 0x40106D: main (in /home/runner/work/mlkem-native/mlkem-native/test/build/mlkem1024/bin/test_mlkem1024)
  ==24307==  Uninitialised value was created by a client request
  ==24307==    at 0x401D3D: randombytes (in /home/runner/work/mlkem-native/mlkem-native/test/build/mlkem1024/bin/test_mlkem1024)
  ==24307==    by 0x401F1C: PQCP_MLKEM_NATIVE_MLKEM1024_enc (in /home/runner/work/mlkem-native/mlkem-native/test/build/mlkem1024/bin/test_mlkem1024)
  ==24307==    by 0x401284: test_keys (in /home/runner/work/mlkem-native/mlkem-native/test/build/mlkem1024/bin/test_mlkem1024)
  ==24307==    by 0x40106D: main (in /home/runner/work/mlkem-native/mlkem-native/test/build/mlkem1024/bin/test_mlkem1024)
  ==24307== 
  ==24307== Variable-latency instruction operand of size 8 is secret/uninitialised
  ==24307==    at 0x40AD2C: PQCP_MLKEM_NATIVE_MLKEM1024_poly_compress_d11 (in /home/runner/work/mlkem-native/mlkem-native/test/build/mlkem1024/bin/test_mlkem1024)
  ==24307==    by 0x402194: PQCP_MLKEM_NATIVE_MLKEM1024_polyvec_compress_du (in /home/runner/work/mlkem-native/mlkem-native/test/build/mlkem1024/bin/test_mlkem1024)
  ==24307==    by 0x40B659: PQCP_MLKEM_NATIVE_MLKEM1024_indcpa_enc (in /home/runner/work/mlkem-native/mlkem-native/test/build/mlkem1024/bin/test_mlkem1024)
  ==24307==    by 0x4020B1: PQCP_MLKEM_NATIVE_MLKEM1024_dec (in /home/runner/work/mlkem-native/mlkem-native/test/build/mlkem1024/bin/test_mlkem1024)
  ==24307==    by 0x4012[92](https://github.com/pq-code-package/mlkem-native/actions/runs/12945822957/job/36109126856?pr=693#step:5:97): test_keys (in /home/runner/work/mlkem-native/mlkem-native/test/build/mlkem1024/bin/test_mlkem1024)
  ==24307==    by 0x40106D: main (in /home/runner/work/mlkem-native/mlkem-native/test/build/mlkem1024/bin/test_mlkem1024)
  ==24307==  Uninitialised value was created by a client request
  ==24307==    at 0x401D3D: randombytes (in /home/runner/work/mlkem-native/mlkem-native/test/build/mlkem1024/bin/test_mlkem1024)
  ==24307==    by 0x401F1C: PQCP_MLKEM_NATIVE_MLKEM1024_enc (in /home/runner/work/mlkem-native/mlkem-native/test/build/mlkem1024/bin/test_mlkem1024)
  ==24307==    by 0x401284: test_keys (in /home/runner/work/mlkem-native/mlkem-native/test/build/mlkem1024/bin/test_mlkem1024)
  ==24307==    by 0x40106D: main (in /home/runner/work/mlkem-native/mlkem-native/test/build/mlkem1024/bin/test_mlkem1024)
  ==24307== 
  ==24307== 
  ==24307== HEAP SUMMARY:
  ==24307==     in use at exit: 0 bytes in 0 blocks
  ==24307==   total heap usage: 1 allocs, 1 frees, 4,0[96](https://github.com/pq-code-package/mlkem-native/actions/runs/12945822957/job/36109126856?pr=693#step:5:101) bytes allocated
  ==24307== 
  ==24307== All heap blocks were freed -- no leaks are possible
  ==24307== 
  ==24307== For lists of detected and suppressed errors, rerun with: -s
  ==24307== ERROR SUMMARY: 409600 errors from 2 contexts (suppressed: 0 from 0)
  make: *** [Makefile:56: run_func_1024] Error 1

Interestingly this triggers on all versions of gcc (-Os) and clang (-Oz). The former was known and is what we used in the KyberSlash paper, the latter was not known to me and we didn't mention it in the KyberSlash paper.

hanno-becker and others added 2 commits January 24, 2025 16:47
The scalar decompression routines so far used a division
by {16,32,1024,2048}. This is not a security concern since
even when compiled into variable-time `div` instructions
as these functions are only operating on public ciphertexts
(the one that comes out of encapsulation, not the one that comes
out of re-encryption).
Yet, it appears good style -- and it avoids having to declassify
the ciphertext in constant-time tests -- to avoid the division
altogether and use a shift instead. Note that since the operands
are unsigned, there is no difference between shift and division.

Signed-off-by: Hanno Becker <beckphan@amazon.co.uk>
Signed-off-by: Matthias J. Kannwischer <matthias@kannwischer.eu>
This commit builds on top of
#687
which added Valgrind-based constant-time tests detecting
secret-dependent memory accesses and conditional branches.

This commit extends this to also detect secret-dependent
divisions such as those that lead to the KyberSlash
vulnerability.
See https://kyberslash.cr.yp.to/.
We test for that by using the patch to Valgrind proposed
in the KyberSlash paper (see same link).
It adds an option --variable-latency-errors=no|yes to valgrind
which defaults to no. If you set it to yes, it will complain if
you use undefined values in division instructions.

This patch has also been sent to the valgrind developers, but was
so far not merged. Luckily our nix setup makes it very easy to
add a patched valgrind and cache it in the Github cache.

Initially this detected some divisions in the scalar_decompress_x
functions (divisions by powers of two) when compiling with -O0.
Luckily these functions only operate on public ciphertexts
(the ct that comes out of encapsulation, not the ct that that
comes out of re-encryption).
I verified this is the case by explicitly declassifying
the ciphertext that comes out of encapsulation which confirmed
that there are no further divisions.

While this is fine (this ciphertext is public!), it's rather
unnecessary as we really don't need divisions there.
I instead opted to replace those / divisions in the code
by shifts (which is what the compilers do with anything
from -O1 anyway).

The nix setup may be of independent interest as it makes it very easy
to use the patched valgrind (and also a number of different compilers).
You can simply run

nix develop .#ci_valgrind-varlat_gcc14

and you'll magically end up in a shell with a patched valgrind
and gcc14.
Currently we have

ci_valgrind-varlat_clang14
ci_valgrind-varlat_clang15
ci_valgrind-varlat_clang16
ci_valgrind-varlat_clang17
ci_valgrind-varlat_clang18
ci_valgrind-varlat_clang19
ci_valgrind-varlat_gcc48
ci_valgrind-varlat_gcc49
ci_valgrind-varlat_gcc7
ci_valgrind-varlat_gcc11
ci_valgrind-varlat_gcc12
ci_valgrind-varlat_gcc13
ci_valgrind-varlat_gcc14

This commit also adds new workflows to the CI using those shells
and testing our library with the following flags:
{-Oz,-Os,-O3,-Ofast,-O3 -ffastmath,-O2,-O1,-O0}.

Signed-off-by: Matthias J. Kannwischer <matthias@kannwischer.eu>
@mkannwischer mkannwischer marked this pull request as ready for review January 24, 2025 08:55
@mkannwischer mkannwischer requested a review from a team January 24, 2025 08:55
Copy link
Contributor
@hanno-becker hanno-becker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is great, thanks a lot @mkannwischer.

Do we need new CI tests or can we just amend the existing ones?

Also, do you want to remark the existence of those CT tests in the README? Might be worth it.

Copy link
Contributor
@hanno-becker hanno-becker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's get this in. nix cleanup and/or README additions can come separately

@hanno-becker hanno-becker merged commit ae2cfa1 into main Jan 24, 2025
187 checks passed
@hanno-becker hanno-becker deleted the kyberslash branch January 24, 2025 11:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment