Pre-compile each layer individually #492

jsturtevant · 2024-02-21T01:46:46Z

After some feedback, we found that some runtimes require the ability to pre-compile layers individually. This PR attempts to add that ability.

This also fixes a bug, introduced when pre-compilation was added, where the media types were being overwritten.

kate-goldenring · 2024-02-23T18:06:18Z

@jsturtevant I have tested these changes out with a branch of the Spin containerd shim with pre-compilation support and it works great. I've tested Spin apps with multiple components and static assets, as well. To me, this is looking ready to take out of draft. Thank you for expanding the implementation!

kate-goldenring · 2024-02-23T21:36:54Z

@jsturtevant through more testing, I did find some odd behavior. After pulling an app, I can rerun the shim over and over and get the expected behavior (it reuses the precompiled components); however, if I re-pull the image (ctr image pull) and try to run it (ctr run), it fails to execute. It looks like after the image is re-pulled, runwasi erroneously asks the Spin shim to precompile again even though the image was already precompiled. The shim then fails, potentially because it is trying to precompile cwasm at that point:

Feb 23 13:17:51 kagold-ThinkPad-X1-Carbon-6th containerd[496568]: time="2024-02-23T21:17:51.312229693Z" level=info msg="Shim successfully started, waiting for exit signal..."
Feb 23 13:17:51 kagold-ThinkPad-X1-Carbon-6th containerd[496568]: time="2024-02-23T21:17:51.317710774Z" level=info msg="found manifest with WASM OCI image format"
Feb 23 13:17:51 kagold-ThinkPad-X1-Carbon-6th containerd[496568]: time="2024-02-23T21:17:51.318314715Z" level=info msg="layer sha256:35dd7338635af15592327986c45e75efc2b258f1d92e4282ced7668d827e5741 has pre-compiled content: sha256:8d7374edd04e514d5c7b35a1b4a2a334716e1bcae596dfc0944e23b921bf6cc6 "
Feb 23 13:17:51 kagold-ThinkPad-X1-Carbon-6th containerd[496568]: time="2024-02-23T21:17:51.323980565Z" level=info msg="layer sha256:65bc286f8315746d1beecd2430e178f539fa487ebf6520099daae09a35dbce1d has pre-compiled content: sha256:dabc42e0d769727d23126b7908e0ea9dd4a93a7cd8867cb36615cbc990776ba8 "
Feb 23 13:17:51 kagold-ThinkPad-X1-Carbon-6th containerd[496568]: time="2024-02-23T21:17:51.335170453Z" level=info msg="precompiling layers for image: ghcr.io/kate-goldenring/spin-hello-and-kv-explorer:latest"
Feb 23 13:17:51 kagold-ThinkPad-X1-Carbon-6th containerd[496568]: time="2024-02-23T21:17:51.335195337Z" level=info msg="Precompiling layer with Spin Engine: "sha256:35dd7338635af15592327986c45e75efc2b258f1d92e4282ced7668d827e5741""
Feb 23 13:17:51 kagold-ThinkPad-X1-Carbon-6th containerd[496568]: time="2024-02-23T21:17:51.335225245Z" level=info msg="Precompiling layer "sha256:35dd7338635af15592327986c45e75efc2b258f1d92e4282ced7668d827e5741""
Feb 23 13:17:51 kagold-ThinkPad-X1-Carbon-6th containerd[496568]: time="2024-02-23T13:17:51.502392136-08:00" level=info msg="shim disconnected" id=two namespace=default
Feb 23 13:17:51 kagold-ThinkPad-X1-Carbon-6th containerd[496568]: time="2024-02-23T13:17:51.502458979-08:00" level=warning msg="cleaning up after shim disconnected" id=two namespace=default
Feb 23 13:17:51 kagold-ThinkPad-X1-Carbon-6th containerd[496568]: time="2024-02-23T13:17:51.502475509-08:00" level=info msg="cleaning up dead shim" namespace=default
Feb 23 13:17:51 kagold-ThinkPad-X1-Carbon-6th containerd[496568]: time="2024-02-23T13:17:51.507305203-08:00" level=error msg="copy shim log" error="read /proc/self/fd/13: file already closed" namespace=default

jsturtevant · 2024-02-23T22:01:00Z

@kate-goldenring did anything in the image change that caused the image sha to change?

kate-goldenring · 2024-02-23T22:25:46Z

@jsturtevant the layers of the image are the same to ctr, but for some reason runwasi seems to be getting different layers. Checking in the shim whether the layer is already precompiled is an optional workaround that gets it working again:

    fn precompile(&self, layer: &WasmLayer) -> Option<Result<Vec<u8>>> {
        log::info!(
            "Precompiling layer with Spin Engine: {:?}",
            layer.config.digest()
        );

        match layer.config.media_type() {
            MediaType::Other(name) => {
                log::info!("Precompiling layer {:?}", layer.config.digest());
                if name == "application/vnd.wasm.content.layer.v1+wasm" {
                    // Defensive check: skip layers that are already precompiled.
                    if self.wasmtime_engine.detect_precompiled(&layer.layer).is_some() {
                        log::info!("Already precompiled");
                        return None;
                    }
                    let component =
                        spin_componentize::componentize_if_necessary(&layer.layer).unwrap();
                    Some(self.wasmtime_engine.precompile_component(&component))
                } else {
                    None
                }
            }
            _ => None,
        }
    }

jsturtevant · 2024-02-23T22:27:49Z

Ah, I see: we are passing the precompiled bits to the engine to recompile, so it makes sense that it crashes. Let me see if we can fix it in the shim library code, but being defensive here might not be a bad option.
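
For readers without the wasmtime crate at hand, the defensive idea can be sketched with a plain magic-number check. This is only a sketch, not what the shim does: `looks_like_wasm` and `precompile_if_raw` are hypothetical names, the `compile` closure stands in for the real engine call, and the guard assumes that precompiled artifacts do not begin with the WebAssembly magic bytes `\0asm`.

```rust
/// The four-byte magic that opens every WebAssembly binary ("\0asm").
const WASM_MAGIC: &[u8] = b"\0asm";

/// Returns true when `bytes` begins with the WebAssembly magic number.
fn looks_like_wasm(bytes: &[u8]) -> bool {
    bytes.starts_with(WASM_MAGIC)
}

/// Hypothetical guard: hand the layer to the compiler only if it is raw Wasm.
/// `compile` is a stand-in for the engine's real precompile call.
fn precompile_if_raw<F>(bytes: &[u8], compile: F) -> Option<Vec<u8>>
where
    F: FnOnce(&[u8]) -> Vec<u8>,
{
    if looks_like_wasm(bytes) {
        Some(compile(bytes))
    } else {
        // Assumption: anything else is already processed or not Wasm at all.
        None
    }
}

fn main() {
    let raw_module = b"\0asm\x01\0\0\0"; // header of a core Wasm module
    let other = b"not wasm";
    // Placeholder "compiler" that just reverses the input bytes.
    let compile = |bytes: &[u8]| bytes.iter().rev().copied().collect::<Vec<u8>>();

    assert!(precompile_if_raw(raw_module, compile).is_some());
    assert!(precompile_if_raw(other, compile).is_none());
}
```

The engine's own detection, as used in the snippet above, remains the more reliable check; the magic-number test only illustrates the shape of the guard.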

kate-goldenring · 2024-02-23T22:30:24Z

@jsturtevant to give more context: when evaluating whether to recompile (let mut needs_precompile = can_precompile && !image.labels.contains_key(&precompile_id);), logging makes it clear that image.labels.contains_key(&precompile_id) evaluates to false, even though the labels are correctly displayed on the image during a ctr content ls.
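
The condition quoted above can be exercised in isolation. This is a minimal sketch, assuming the labels are a plain string map; wrapping the expression in a `needs_precompile` function and the label key used here are illustrative choices, not runwasi's actual code or values.

```rust
use std::collections::HashMap;

/// Mirrors the quoted expression: precompile only when the engine supports it
/// and the label is absent. `labels` is a hypothetical stand-in for the label
/// map read from containerd.
fn needs_precompile(
    can_precompile: bool,
    labels: &HashMap<String, String>,
    precompile_id: &str,
) -> bool {
    can_precompile && !labels.contains_key(precompile_id)
}

fn main() {
    // Illustrative label key, not necessarily the one runwasi generates.
    let precompile_id = "example.io/precompiled/engine/1234";

    let mut labels = HashMap::new();
    labels.insert(precompile_id.to_string(), "sha256:aaaa".to_string());
    // Label present: nothing to do.
    assert!(!needs_precompile(true, &labels, precompile_id));

    // Label missing (as observed after a re-pull): precompilation is requested again.
    labels.clear();
    assert!(needs_precompile(true, &labels, precompile_id));
}
```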

jsturtevant · 2024-02-23T22:52:25Z

@kate-goldenring As I've been playing with and thinking about the API, I think a better API for this would be:

    fn precompile(&self, _layers: &[WasmLayer]) -> Option<Result<Vec<WasmLayer>>> {
        // do compilation as needed for each layer
    }

with the intent to eventually evolve it to make it more flexible:

    fn process_layers(&self, _layers: &[WasmLayer]) -> Option<Result<Vec<WasmLayer>>> {
        // do compilation and/or composing as needed
        // return layers needed
    }

This would allow runtimes like Spin to compile and return the layers they care about, and runtimes that support components to compose and compile a single layer.

Thoughts? I recognize this is a change for the tests you've been running. The change is pretty minimal on the shim implementor side.

The change to process_layers would come later and requires some other changes to make it happen. I will open an issue to track it and gather feedback.

jsturtevant · 2024-02-24T01:13:38Z

The change is pretty minimal on the shim implementor side.

6be7908

radu-matei · 2024-02-24T09:30:31Z

To @kate-goldenring's point:

@jsturtevant to give more context, when evaluating whether to recompile(let mut needs_precompile = can_precompile && !image.labels.contains_key(&precompile_id);), after logging, it is clear that image.labels.contains_key(&precompile_id); is evaluating to false despite the fact that the labels are correctly displayed on the image during a ctr content ls.

Any thoughts on why this would be happening? Nothing about the image changes in between pulls.

jsturtevant · 2024-02-26T16:40:27Z

To @kate-goldenring's point:

@jsturtevant to give more context, when evaluating whether to recompile(let mut needs_precompile = can_precompile && !image.labels.contains_key(&precompile_id);), after logging, it is clear that image.labels.contains_key(&precompile_id); is evaluating to false despite the fact that the labels are correctly displayed on the image during a ctr content ls.

Any thoughts on why this would be happening? Nothing about the image changes in between pulls.

looking into this today

kate-goldenring · 2024-02-26T17:21:55Z

The change to process_layers would come later and requires some other changes to make it happen. I will open an issue to track it and gather feedback.

@jsturtevant I like this idea of generalizing precompile into a function named something like prepare_layers, process_layers, or initialize_layers: something that denotes this only needs to happen once for an image. I also like the change to process/precompile all layers in one call. Otherwise we may have N calls, several of which may be for media types that are not Wasm. This allows the engine implementer to filter out layers that don't need processing.
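
To make the "one call, engine filters" idea concrete, here is a minimal sketch of the engine side. The `Layer` struct, `fake_compile`, and `precompile_all` are hypothetical stand-ins, and returning one optional slot per input layer is just one possible shape for the result.

```rust
/// Hypothetical stand-in for the layer type handed to the engine.
struct Layer {
    media_type: String,
    bytes: Vec<u8>,
}

/// Media type that the Spin shim snippet earlier in this thread checks for.
const WASM_MEDIA_TYPE: &str = "application/vnd.wasm.content.layer.v1+wasm";

/// Placeholder for a real compiler: it only tags the bytes so the output differs.
fn fake_compile(bytes: &[u8]) -> Vec<u8> {
    let mut out = b"compiled:".to_vec();
    out.extend_from_slice(bytes);
    out
}

/// Process every layer in a single call. The result has one slot per input
/// layer: `Some(bytes)` for Wasm layers, `None` for layers the engine skips.
fn precompile_all(layers: &[Layer]) -> Vec<Option<Vec<u8>>> {
    layers
        .iter()
        .map(|layer| (layer.media_type == WASM_MEDIA_TYPE).then(|| fake_compile(&layer.bytes)))
        .collect()
}

fn main() {
    let layers = vec![
        Layer { media_type: WASM_MEDIA_TYPE.to_string(), bytes: vec![1, 2, 3] },
        Layer { media_type: "text/plain".to_string(), bytes: b"static asset".to_vec() },
    ];
    let result = precompile_all(&layers);
    assert_eq!(result.len(), layers.len());
    assert!(result[0].is_some());
    assert!(result[1].is_none());
}
```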

jsturtevant · 2024-02-26T19:18:49Z

@jsturtevant to give more context, when evaluating whether to recompile(let mut needs_precompile = can_precompile && !image.labels.contains_key(&precompile_id);), after logging, it is clear that image.labels.contains_key(&precompile_id); is evaluating to false despite the fact that the labels are correctly displayed on the image during a ctr content ls

I am seeing the label on the image disappear when using ctr i pull. This doesn't seem to be the case when we use ctr import which is used by the tests. Working on a fix and a test to cover the scenario.

jsturtevant · 2024-02-26T23:58:56Z

I've stored the "pre-compiled" flag on the content instead of the image, since the image is mutable (I think @cpuguy83 pointed this out previously but I didn't fully understand the implications at the time).

Note that if runwasi detects that one or more of the pre-compiled components have been removed from the cache, it will re-request a compilation, so it's good to keep the check for compilation in the shim. I don't really know of a valid scenario where this would happen besides a user going in and deleting it manually. I will see if we can come up with a way to handle this when we improve the API towards process_layers.

cpuguy83 · 2024-02-27T00:07:29Z

The image reference (which we read from the container object) is mutable, is what I was referring to.
But yes, I think it's still best to have it on the content that is being compiled rather than the image itself, which is often referring to an image index and not a specific image manifest.
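
A minimal sketch of keeping the flag on the content rather than the image, using an in-memory map in place of the containerd content store. `ContentLabels`, `digest_to_load` as a free function, and the label key are hypothetical; the only thing taken from the PR is the idea that the original layer content carries a label pointing at its precompiled digest.

```rust
use std::collections::HashMap;

/// Hypothetical in-memory stand-in for content-store metadata:
/// content digest -> labels attached to that content.
type ContentLabels = HashMap<String, HashMap<String, String>>;

/// Return the digest to load for a layer: the precompiled digest if the
/// original content carries the label, otherwise the original digest.
fn digest_to_load(store: &ContentLabels, original: &str, precompile_id: &str) -> String {
    store
        .get(original)
        .and_then(|labels| labels.get(precompile_id))
        .cloned()
        .unwrap_or_else(|| original.to_string())
}

fn main() {
    let precompile_id = "example.io/precompiled/engine/1234"; // illustrative key
    let mut store = ContentLabels::new();
    store.insert(
        "sha256:original".to_string(),
        HashMap::from([(precompile_id.to_string(), "sha256:precompiled".to_string())]),
    );

    // Labeled content resolves to its precompiled counterpart.
    assert_eq!(digest_to_load(&store, "sha256:original", precompile_id), "sha256:precompiled");
    // Unlabeled or unknown content falls back to the original digest.
    assert_eq!(digest_to_load(&store, "sha256:other", precompile_id), "sha256:other");
}
```

Because the label lives on content addressed by digest, re-tagging or re-pulling the image reference does not by itself discard it.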

jsturtevant · 2024-02-28T01:35:00Z

I was looking into why the test failed here. I was able to reproduce it locally, although it is flaky, and identified that it is due to the way the test imports images. In the failed test the image gets imported twice. I expected this to be a no-op, but the annotations in the image are stored in a hashmap, which results in a different image digest and causes the flake (the flag isn't found on the content, even though the individual layers are the same).

This comes back to my comment in #492 (comment), which is to say that today we are not passing the shim any information about whether a layer is already compiled, but we can possibly address that with a better API.
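
The hashmap ordering effect can be illustrated with a simplified serializer. This is a sketch under stated assumptions: `canonical_annotations` is a hypothetical helper, and the `key=value` line format stands in for the real JSON manifest. It shows that sorting keys before serializing gives identical bytes, and therefore an identical digest, regardless of insertion or iteration order.

```rust
use std::collections::{BTreeMap, HashMap};

/// Serialize annotations as `key=value` lines in sorted key order, so the
/// resulting bytes do not depend on HashMap iteration order.
fn canonical_annotations(annotations: &HashMap<String, String>) -> String {
    let sorted: BTreeMap<&String, &String> = annotations.iter().collect();
    sorted.iter().map(|(k, v)| format!("{k}={v}\n")).collect()
}

fn main() {
    let mut first = HashMap::new();
    first.insert("b".to_string(), "2".to_string());
    first.insert("a".to_string(), "1".to_string());

    let mut second = HashMap::new();
    second.insert("a".to_string(), "1".to_string());
    second.insert("b".to_string(), "2".to_string());

    // Same content, different insertion order, identical serialization.
    assert_eq!(canonical_annotations(&first), canonical_annotations(&second));
    assert_eq!(canonical_annotations(&first), "a=1\nb=2\n");
}
```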

kate-goldenring

@jsturtevant I just have some preliminary questions on the precompile API


kate-goldenring · 2024-02-28T20:39:53Z

crates/containerd-shim-wasm/src/sandbox/containerd/client.rs

+                    let compiled_layers = compiled_layer_result?;
+
+                    for (i, precompiled_layer) in compiled_layers.iter().enumerate() {
+                        let original_layer = &layers[i];


This line assumes that all N layers are returned even if only M were precompiled (say a few layers were not Wasm). We should find a way to support precompiling multiple layers while also supporting engine implementations that handle non-Wasm layers (such as the Spin one).

For now, I want to avoid handling the situation where multiple layers are compiled to one. I think this is going to need some bigger changes, and I would like the design to evolve into a different API like process_layers or initialize_layers, with more input from folks. If I use Result<Vec<Option<WasmLayer>>> as you suggested, I can avoid looking up the layer here and just skip if None.

#504 for tracking updates
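
A minimal sketch of the consuming side with an optional slot per layer: check that the engine returned one entry per input layer, then skip the `None` entries. `pair_precompiled` is a hypothetical helper, layers are represented only by their digests, and a plain `String` error stands in for the shim's error type.

```rust
/// Pair each original layer digest with the bytes the engine produced for it.
/// Layers for which the engine returned `None` are skipped.
fn pair_precompiled(
    digests: &[String],
    compiled: Vec<Option<Vec<u8>>>,
) -> Result<Vec<(String, Vec<u8>)>, String> {
    // The positional pairing below is only valid if the lengths agree.
    if compiled.len() != digests.len() {
        return Err("precompile returned wrong number of layers".to_string());
    }
    Ok(digests
        .iter()
        .zip(compiled)
        .filter_map(|(digest, bytes)| bytes.map(|b| (digest.clone(), b)))
        .collect())
}

fn main() {
    let digests = vec!["sha256:aaa".to_string(), "sha256:bbb".to_string()];

    // Only the first layer was precompiled; the second is skipped.
    let paired = pair_precompiled(&digests, vec![Some(vec![1]), None]).unwrap();
    assert_eq!(paired, vec![("sha256:aaa".to_string(), vec![1u8])]);

    // A length mismatch is rejected instead of silently mis-pairing layers.
    assert!(pair_precompiled(&digests, vec![None]).is_err());
}
```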


kate-goldenring

Thank you @jsturtevant for all the iterations of this. This really speeds up the performance of the containerd shim being able to precompile all layers in one call and preserve those precompiled layers.

jsturtevant · 2024-03-05T23:33:46Z

Thank you @jsturtevant for all the iterations of this. This really speeds up the performance of the containerd shim being able to precompile all layers in one call and preserve those precompiled layers.

I pushed one last change based on your feedback. Thanks for trying out all the changes, for the feedback, and for being patient as I adjusted the API based on the usage.


kate-goldenring

Two doc nits


radu-matei · 2024-03-06T13:30:10Z

I may have bumped into an issue that will turn out to be a blocker for the current approach: while running a Wasm component built with SpiderMonkey, the resulting precompiled component is larger than the maximum gRPC message size, which means saving it to the content store fails:

ERRO[2024-03-06T12:52:14.701165146+01:00] (*service).Write failed                       error="rpc error: code = ResourceExhausted desc = grpc: received message larger than max (43404572 vs. 16777216)" expected="sha256:617aa5799c43c511336f0c40708e8612dd804011eb9f713a9dbcb7340e8ff9c7" ref=precompile-runwasi.io/precompiled/spin/17767192358106208183 total=43404352
time="2024-03-06T11:52:14.703413103Z" level=warn msg="Error obtaining wasm layers for container sl.  Will attempt to use files inside container image. Error: response stream error: status: ResourceExhausted, message: "grpc: received message larger than max (43404572 vs. 16777216)", details: [], metadata: MetadataMap { headers: {} }"

Is there a way where runwasi could write the file in the store directly, then use the containerd API to add the correct annotations?

devigned · 2024-03-06T13:53:40Z

Is the message not getting chunked? If not, perhaps, that is the path we should pursue.

jsturtevant · 2024-03-06T16:19:57Z

It looks like the write failed when saving the new content. We will need to implement streaming for larger content when doing the write. Right now it writes and commits in one action:

runwasi/crates/containerd-shim-wasm/src/sandbox/containerd/client.rs

Lines 198 to 201 in 2171b3d

    
        // Write and commit at same time
        let mut labels = HashMap::new();
        labels.insert(label.to_string(), original_digest.clone());
        let commit_request = WriteContentRequest {

I don't think this is a blocker on this PR and I would be inclined to merge this and make those changes separately as this changeset already has quite a bit going on.
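
For the follow-up, the chunking step could look roughly like this. It is a minimal sketch, not the eventual fix: `chunk_with_offsets` and the 4 MiB chunk size are assumptions, and the actual write requests to containerd are left out. The 16 MiB limit and the 43,404,352-byte total come from the error message above.

```rust
/// gRPC message limit reported in the error above (16 MiB).
const MAX_MESSAGE_SIZE: usize = 16_777_216;

/// Split `data` into (offset, chunk) pairs of at most `chunk_size` bytes.
/// Panics if `chunk_size` is zero, as `slice::chunks` does.
fn chunk_with_offsets(data: &[u8], chunk_size: usize) -> Vec<(usize, &[u8])> {
    data.chunks(chunk_size)
        .enumerate()
        .map(|(i, chunk)| (i * chunk_size, chunk))
        .collect()
}

fn main() {
    // Same total size as the failing write in the log above.
    let data = vec![0u8; 43_404_352];
    // Hypothetical chunk size, chosen to sit well under the limit.
    let chunk_size = 4 * 1024 * 1024;

    let chunks = chunk_with_offsets(&data, chunk_size);
    assert_eq!(chunks.len(), 11);
    assert!(chunks.iter().all(|(_, c)| c.len() < MAX_MESSAGE_SIZE));

    // Offsets plus lengths cover the buffer exactly once.
    let (last_offset, last_chunk) = chunks[chunks.len() - 1];
    assert_eq!(last_offset + last_chunk.len(), data.len());
}
```

Each chunk would be sent as its own write at the given offset, with the commit (and labels) issued only after the last chunk.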

radu-matei · 2024-03-06T18:23:50Z

Chunking sounds great, and I agree that we can do this in a follow-up.

Thanks for the quick update!

Mossaka

Great PR. Most of my comments are non-blocking refactoring tips.

Mossaka · 2024-03-06T18:02:24Z

crates/containerd-shim-wasm/src/container/engine.rs

+    /// The cached, precompiled layers will be reloaded on subsequent runs.
+    /// The runtime is expected to return the same number of layers passed in, if the layer cannot be precompiled it should return `None` for that layer.
+    /// In some edge cases it is possible that the layers may already be precompiled and None should be returned in this case.
+    fn precompile(&self, _layers: &[WasmLayer]) -> Result<Vec<Option<Vec<u8>>>> {


nit: type alias for Vec<Option<Vec<u8>>>, perhaps PrecompiledWasmLayers?

One nit to counter the nit with: not everything returned is a precompiled layer, which is why there is an option wrapping, so it feels like a confusing name.

Or maybe Vec<Option<WasmLayer>>?

WasmLayer is already a structure. The API was that at one point, but it was changed to return just the layer content rather than the content and config, because the original config is used as the source of truth. I think we should maybe leave it as is, or PrecompiledLayer could work, though I prefer not using types to alias something as small as Vec<u8>.

Mossaka · 2024-03-06T18:12:29Z

crates/containerd-shim-wasm/src/sandbox/containerd/client.rs

@@ -375,6 +347,15 @@ impl Client {
        })
    }

+    fn get_image_manifest(&self, image_name: &str) -> Result<(ImageManifest, String)> {


Suggested change

-    fn get_image_manifest(&self, image_name: &str) -> Result<(ImageManifest, String)> {
+    fn get_image_manifest_and_digest(&self, image_name: &str) -> Result<(ImageManifest, String)> {

Mossaka · 2024-03-06T18:12:55Z

crates/containerd-shim-wasm/src/sandbox/containerd/client.rs

+        let manifest = self.read_content(&image_digest)?;
+        let manifest = manifest.as_slice();
+        let manifest = ImageManifest::from_reader(manifest)?;


Suggested change

-        let manifest = self.read_content(&image_digest)?;
-        let manifest = manifest.as_slice();
-        let manifest = ImageManifest::from_reader(manifest)?;
+        let manifest = ImageManifest::from_reader(self.read_content(&image_digest)?.as_slice())?;

Mossaka · 2024-03-06T18:17:01Z

crates/containerd-shim-wasm/src/sandbox/containerd/client.rs

        let layers = manifest
            .layers()
            .iter()
            .filter(|x| is_wasm_layer(x.media_type(), T::supported_layers_types()))
-            .map(|config| self.read_content(config.digest()))
+            .map(|original_config| {
+                let mut digest_to_load = original_config.digest().clone();


Refactor into its own function?

Mossaka · 2024-03-06T18:21:57Z

crates/containerd-shim-wasm/src/sandbox/containerd/client.rs

+                    let info = self.get_info(&digest_to_load)?;
+                    if info.labels.contains_key(&precompile_id) {
+                        // Safe to unwrap here since we already checked for the label's existence
+                        digest_to_load = info.labels.get(&precompile_id).unwrap().clone();


Suggested change

-                    let info = self.get_info(&digest_to_load)?;
-                    if info.labels.contains_key(&precompile_id) {
-                        // Safe to unwrap here since we already checked for the label's existence
-                        digest_to_load = info.labels.get(&precompile_id).unwrap().clone();
+                    if let Some(precompiled_digest) = info.labels.get(&precompile_id) {
+                        ...
+                    }

Mossaka · 2024-03-06T18:26:22Z

crates/containerd-shim-wasm/src/sandbox/containerd/client.rs

+            match engine.precompile(&layers) {
+                Ok(compiled_layers) => {
+                    if compiled_layers.len() != layers.len() {
+                        return Err(ShimError::FailedPrecondition(
+                            "precompile returned wrong number of layers".to_string(),
+                        ));
+                    }


nit: to improve readability, consider

    let compiled_layers = match engine.precompile(&layers) {
        Ok(compiled_layers) => {
            if compiled_layers.len() != layers.len() {
                return Err(ShimError::FailedPrecondition(
                    "precompile returned wrong number of layers".to_string(),
                ));
            }
            compiled_layers
        }
        Err(e) => {
            log::error!("precompilation failed: {}", e);
            return Err(ShimError::FailedPrecondition("precompilation failed".to_string()));
        }
    };

Mossaka · 2024-03-06T18:31:04Z

crates/containerd-shim-wasm/src/sandbox/containerd/client.rs

-                }],
-                platform,
-            ));
+        if layers.is_empty() {


nit: move this condition check to right after layers is declared.

Mossaka · 2024-03-06T18:32:28Z

crates/containerd-shim-wasm/src/sandbox/containerd/client.rs

@@ -542,4 +580,370 @@ mod tests {
            .read_content(expected)
            .expect_err("content should not exist");
    }
+
+    #[test]
+    fn test_layers_when_precompile_not_supported() {


Thanks for adding so many test cases. Love it 😍

Mossaka · 2024-03-06T18:34:58Z

crates/containerd-shim-wasm/src/sandbox/containerd/client.rs

+    }
+
+    #[test]
+    fn test_layers_are_precompiled_for_multiple_layers() {


Maybe add a case where layers are empty, or all layers contain no Wasm type?

Mossaka · 2024-03-06T18:56:06Z

Okay, going to merge this and figure out the comments later.

this is to resolve my comments made in PR containerd#492 in an effort to make the code a bit more idiomatic and readable Signed-off-by: jiaxiao zhou <jiazho@microsoft.com>

jsturtevant mentioned this pull request Feb 22, 2024

feat(spin/precompilation): add support for precompiling wasm components spinkube/containerd-shim-spin#16

Closed

jsturtevant force-pushed the runwasi-precompile-take-2 branch from 64ebc9b to 8b051f4 Compare February 26, 2024 23:51

jsturtevant force-pushed the runwasi-precompile-take-2 branch from 8b051f4 to 18f1e99 Compare February 27, 2024 00:15

jsturtevant force-pushed the runwasi-precompile-take-2 branch from 73c80cc to f654ba8 Compare February 28, 2024 19:36

jsturtevant marked this pull request as ready for review February 28, 2024 19:48

kate-goldenring reviewed Feb 28, 2024

kate-goldenring reviewed Feb 28, 2024

jsturtevant added 8 commits March 1, 2024 21:12

Pre-compile each layer individually

d8003bd

Signed-off-by: James Sturtevant <jstur@microsoft.com>

Fix tests

b1f4042

Signed-off-by: James Sturtevant <jstur@microsoft.com>

Process all layers at once

63e5192

Signed-off-by: James Sturtevant <jstur@microsoft.com>

Move compile flag to the content instead of image since it is imutabl…

3d84809

…e can change Signed-off-by: James Sturtevant <jstur@microsoft.com>

fix e2e tests

547e836

Signed-off-by: James Sturtevant <jstur@microsoft.com>

re-use image to avoid digest differences

cbd1812

Signed-off-by: James Sturtevant <jstur@microsoft.com>

Clean up clones and add docs

507cf8a

Signed-off-by: James Sturtevant <jstur@microsoft.com>

Add tests to cover scenarios

a94f3a2

Signed-off-by: James Sturtevant <jstur@microsoft.com>

Change the API

215f503

Signed-off-by: James Sturtevant <jstur@microsoft.com>

jsturtevant force-pushed the runwasi-precompile-take-2 branch from abf1f3a to 215f503 Compare March 1, 2024 21:17

kate-goldenring reviewed Mar 2, 2024

kate-goldenring reviewed Mar 2, 2024

jsturtevant and others added 2 commits March 4, 2024 23:24

Add gc flag to layer

19a1024

Signed-off-by: James Sturtevant <jstur@microsoft.com>

Update crates/containerd-shim-wasm/src/container/engine.rs

5f127ed

Co-authored-by: Kate Goldenring <kate.goldenring@gmail.com> Signed-off-by: James Sturtevant <jsturtevant@gmail.com>

kate-goldenring approved these changes Mar 5, 2024

jsturtevant requested review from cpuguy83 and Mossaka March 5, 2024 23:35

use Result<Vec<Option<Vec<u8>>>> and make labels clear

d2788c6

Signed-off-by: James Sturtevant <jstur@microsoft.com>

jsturtevant force-pushed the runwasi-precompile-take-2 branch from 03bc3d2 to d2788c6 Compare March 6, 2024 00:33

kate-goldenring reviewed Mar 6, 2024

jsturtevant mentioned this pull request Mar 6, 2024

Create an API for Pre-processing layers #504

Open

Apply suggestions from code review

3c1894c

Co-authored-by: Kate Goldenring <kate.goldenring@gmail.com> Signed-off-by: James Sturtevant <jsturtevant@gmail.com>

Mossaka approved these changes Mar 6, 2024

Mossaka merged commit 978ba8c into containerd:main Mar 6, 2024
43 checks passed

jsturtevant mentioned this pull request Mar 6, 2024

Fix Writing Large files to containerd content store #505

Closed

Mossaka mentioned this pull request Mar 6, 2024

refactor(precompile): make code a bit more readable #506

Merged
