-
Notifications
You must be signed in to change notification settings - Fork 74
Fix teeserver context reset issue & add container signature cache #397
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@@ -4,7 +4,7 @@ substitutions: | |||
'_CLEANUP': 'true' | |||
'_VM_NAME_PREFIX': 'discover-signatures' | |||
'_ZONE': 'us-west1-a' | |||
'_WORKLOAD_IMAGE': 'us-west1-docker.pkg.dev/confidential-space-images-dev/cs-integ-test-images/basic-test:latest' | |||
'_WORKLOAD_IMAGE': 'us-west1-docker.pkg.dev/confidential-space-images-dev/cs-integ-test-images/ipc/happypath:latest' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should also have a test for the default token to ensure signed container works with it as well
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. I created a separate test for ODA and signed container, and keep this test for the default token.
73390b7
to
ec9ef03
Compare
5b07759
to
99ff543
Compare
- name: 'gcr.io/cloud-builders/gcloud' | ||
id: BasicDiscoverSignaturesTest | ||
entrypoint: 'bash' | ||
args: ['scripts/test_launcher_workload_discover_signatures.sh', '${_VM_NAME_PREFIX}-${BUILD_ID}', '${_ZONE}', '2'] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we expect 2 signatures? I only see one cosign call above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, this is to check how many times Found container image signatures
is being logged. Fetching signatures will occur on refresh the default token, and on teeserver receiving ODA requests, so the expected number is 2.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In that case better comments in the test and/or in launcher/image/test/scripts/test_launcher_workload_discover_signatures.sh
would be good. When reading it, I assumed it was trying to find the number of signatures.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should compare the signatures fetched are the same somehow
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
bf31121
to
8e4f201
Compare
launcher/agent/agent_test.go
Outdated
tc := tc | ||
t.Run(tc.name, func(t *testing.T) { | ||
// Read cache from a separate goroutine. | ||
t.Parallel() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
shouldn't we also read the cache before setting?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See comments below.
launcher/agent/agent_test.go
Outdated
@@ -255,3 +253,58 @@ func TestFetchContainerImageSignatures(t *testing.T) { | |||
}) | |||
} | |||
} | |||
|
|||
func TestCache(t *testing.T) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure I follow why we need this test. I think I would ideally want to a few things for this change: synchronization on the mutex (throwing goroutines and trying to catch a race condition), making sure Refresh
updates appropriately when the backing store changes (e.g., add sig, delete sig), and finally refreshToken
does indeed update the default token and the cache (unit test and perhaps a manual test).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
- Added unit tests
TestCacheConcurrentSet
andTestCacheConcurrentGet
that spawn goroutines for synchronization checks. - Added a unit test
TestRefreshTokenWithSignedContainerCacheEnabled
incontainer_runner_test.go
to checkrefreshToken
updates the default token and cache appropriately when the backing store changes (signatures get updated.)
- name: 'gcr.io/cloud-builders/gcloud' | ||
id: BasicDiscoverSignaturesTest | ||
entrypoint: 'bash' | ||
args: ['scripts/test_launcher_workload_discover_signatures.sh', '${_VM_NAME_PREFIX}-${BUILD_ID}', '${_ZONE}', '2'] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should compare the signatures fetched are the same somehow
8027f4e
to
d442c4c
Compare
launcher/agent/agent_test.go
Outdated
func TestCacheConcurrentGet(t *testing.T) { | ||
ctx := context.Background() | ||
|
||
cache := &sigsCache{} | ||
if sigs := cache.get(); len(sigs) != 0 { | ||
t.Errorf("signature cache should be empty, but got: %v", sigs) | ||
} | ||
|
||
sdClient := signaturediscovery.NewFakeClient() | ||
targetRepos := []string{signaturediscovery.FakeRepoWithSignatures} | ||
original := fetchContainerImageSignatures(ctx, sdClient, targetRepos, log.Default()) | ||
cache.set(original) | ||
|
||
wantBase64Sigs := convertOCISignatureToBase64(t, original) | ||
|
||
group, _ := errgroup.WithContext(ctx) | ||
for i := 0; i < runtime.NumCPU(); i++ { | ||
group.Go(func() error { | ||
cached := cache.get() | ||
gotBase64Sigs := convertOCISignatureToBase64(t, cached) | ||
if !cmp.Equal(wantBase64Sigs, gotBase64Sigs) { | ||
return fmt.Errorf("cached base64 sigs not equal to original base64 sigs, got %v, but want %v", gotBase64Sigs, wantBase64Sigs) | ||
} | ||
return nil | ||
}) | ||
} | ||
|
||
if err := group.Wait(); err != nil { | ||
t.Errorf("cache concurrent get failed: %v", err) | ||
} | ||
} | ||
|
||
func TestCacheConcurrentSet(t *testing.T) { | ||
ctx := context.Background() | ||
|
||
cache := &sigsCache{} | ||
if sigs := cache.get(); len(sigs) != 0 { | ||
t.Errorf("signature cache should be empty, but got: %v", sigs) | ||
} | ||
|
||
sdClient := signaturediscovery.NewFakeClient() | ||
targetRepos := []string{signaturediscovery.FakeRepoWithSignatures} | ||
|
||
var wg sync.WaitGroup | ||
for i := 0; i < runtime.NumCPU(); i++ { | ||
wg.Add(1) | ||
go func() { | ||
defer wg.Done() | ||
sigs := fetchContainerImageSignatures(ctx, sdClient, targetRepos, log.Default()) | ||
cache.set(sigs) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since we aren't intermingling sets and gets in a concurrent manner, we aren't actually testing the sync primitive. Consider intermingling data accesses and using the -race
flag: https://go.dev/doc/articles/race_detector
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
logger: log.Default(), | ||
} | ||
|
||
if err := os.MkdirAll(launcherfile.HostTmpPath, 0744); err != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should use https://pkg.go.dev/testing#B.TempDir
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since refreshToken
will always write OIDC token to a fixed location, I think we'll have to create a fixed directory here instead of a random temp test directory.
launcher/container_runner_test.go
Outdated
t.Fatalf("refreshToken returned with error: %v", err) | ||
} | ||
|
||
// simulate adding signatures. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: here and below capitalize comments
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
'_CLEANUP': 'true' | ||
'_VM_NAME_PREFIX': 'oda-signedcontainer' | ||
'_ZONE': 'us-east1-b' | ||
'_WORKLOAD_IMAGE': 'us-west1-docker.pkg.dev/confidential-space-images-dev/cs-integ-test-images/ipc/happypath@sha256:999831a7b8f8afd323e2359f3c1192206be2aa1d4f3b19f0739eff5f01f83b9e' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add a comment here that if the workload image changes, the commit author should change the cosign signature as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
0688008
to
fe1d199
Compare
fe1d199
to
235f223
Compare
/gcbrun |
New Features: [launcher] Add TEE server IPC implementation google#367 [launcher] Enable memory monitoring in CS google#391 Use TDX quote provider to attest and verify google#405 Integrate nonce verification as part of the TDX quote validation procedure. google#395 Add RISC V support google#407 [launcher] Use resizable integrity-fs with in-memory tags google#412 Bug Fixes: [launcher] Fix launcher exit code google#384 [launcher] Handle exit code checking during deferral evaluation google#392 [cmd] Skip tests that call setGCEAKTemplate google#402 [launcher] Fix teeserver context reset issue & add container signature cache google#397 Set all unused parameters as _ to fix CI lint failure google#411 [launcher] Make customtoken test sleep to mitigate clock skew google#413 Other Changes: Add eventlog parse logics for memory monitoring google#404 [launcher]: Add memory monitor measurement logics google#408 Update go-tdx-guest version to v0.3.1 google#414 New Contributors: @KeithMoyer in google#392 @vbalain in google#405 @aimixsaka in google#407
New Features: [launcher] Add TEE server IPC implementation #367 [launcher] Enable memory monitoring in CS #391 Use TDX quote provider to attest and verify #405 Integrate nonce verification as part of the TDX quote validation procedure. #395 Add RISC V support #407 [launcher] Use resizable integrity-fs with in-memory tags #412 Bug Fixes: [launcher] Fix launcher exit code #384 [launcher] Handle exit code checking during deferral evaluation #392 [cmd] Skip tests that call setGCEAKTemplate #402 [launcher] Fix teeserver context reset issue & add container signature cache #397 Set all unused parameters as _ to fix CI lint failure #411 [launcher] Make customtoken test sleep to mitigate clock skew #413 Other Changes: Add eventlog parse logics for memory monitoring #404 [launcher]: Add memory monitor measurement logics #408 Update go-tdx-guest version to v0.3.1 #414 New Contributors: @KeithMoyer in #392 @vbalain in #405 @aimixsaka in #407
New Features: [launcher] Add TEE server IPC implementation google#367 [launcher] Enable memory monitoring in CS google#391 Use TDX quote provider to attest and verify google#405 Integrate nonce verification as part of the TDX quote validation procedure. google#395 Add RISC V support google#407 [launcher] Use resizable integrity-fs with in-memory tags google#412 Bug Fixes: [launcher] Fix launcher exit code google#384 [launcher] Handle exit code checking during deferral evaluation google#392 [cmd] Skip tests that call setGCEAKTemplate google#402 [launcher] Fix teeserver context reset issue & add container signature cache google#397 Set all unused parameters as _ to fix CI lint failure google#411 [launcher] Make customtoken test sleep to mitigate clock skew google#413 Other Changes: Add eventlog parse logics for memory monitoring google#404 [launcher]: Add memory monitor measurement logics google#408 Update go-tdx-guest version to v0.3.1 google#414 New Contributors: @KeithMoyer in google#392 @vbalain in google#405 @aimixsaka in google#407
Discussed offline, we decide to introduce container signatures cache to attestation agent.
So we should refresh the signatures cache every time the token refresher goroutine runs (~1/hr), and ODA will only read from the cache.
This PR also includes the following changes:
EnableSignedContainerImage
, add a new flagEnableSignedContainerCache