Switch to distroless Base image #1154
Conversation
Coveralls: Pull Request Test Coverage Report for Build 15753213707
```dockerfile
FROM nvcr.io/nvidia/cuda:12.9.0-base-ubi9

# The application stage contains the application used as a GPU Operator
# operand.
FROM nvcr.io/nvidia/distroless/go:v3.1.9-dev AS application
```
Note, the shell included in the -dev tags is located at /busybox/sh. I would recommend creating a symlink at /bin/sh so that in the operator we can use #! /bin/sh as the shebang for the entrypoint script. By using /bin/sh we remain backwards compatible with older toolkit images that are not built on distroless. We have tested this with other operands, e.g. https://github.com/NVIDIA/k8s-kata-manager/blob/f58e4dad0695043a545b17e3e159e24828816a62/deployments/container/Dockerfile#L50-L51
Suggested change:

```dockerfile
FROM nvcr.io/nvidia/distroless/go:v3.1.9-dev AS application
SHELL ["/busybox/sh", "-c"]
RUN ln -s /busybox/sh /bin/sh
```
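(The SHELL instruction matters here: RUN would otherwise invoke /bin/sh -c, which does not exist in this image until the symlink has been created.)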
Also, should we explicitly set USER 0:0 in the Dockerfile, as the default user in distroless is uid 1000? I assume the toolkit requires running as root (for restarting containerd).
Thanks for the shell tip. Will update.

I'm not sure on the user preference. Does the GPU Operator not set the user in general? Would using the current user (USER 1000:1000) not be more "compliant"?
I've updated the user to 0:0 below.
> I'm not sure on the user preference. Does the GPU Operator not set the user in general? Would using the current user (USER 1000:1000) not be more "compliant"?

The GPU Operator does not explicitly set the runAsUser / runAsGroup fields when deploying DaemonSets, so we currently depend on the user / group defined in the image itself.
I have performed a quick sanity check of the toolkit image built in this PR. Looks good.

Besides having to change the entrypoint script to be a POSIX shell script (and not bash), I also had to change https://github.com/NVIDIA/gpu-operator/blob/6324d2aca562edf46d93cbf9d2a0837ab5c12e59/assets/state-container-toolkit/0400_configmap.yaml#L34 from exec nvidia-toolkit to exec nvidia-ctk-installer.

I see the name of the executable has changed. This is a breaking change that will need to be made when we bump the version of the toolkit to 1.18.0 in the GPU Operator.
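For illustration, the relevant line of that operator entrypoint changes roughly as follows (a sketch; the real script in 0400_configmap.yaml contains additional setup):

```sh
#!/bin/sh
# Previously the operand entrypoint ended with:
#   exec nvidia-toolkit
# With the binary renamed in this PR it becomes:
exec nvidia-ctk-installer
```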
I have raised NVIDIA/gpu-operator#1496 which updates our entrypoint scripts in the operator to use sh instead of bash.
I think we can include an nvidia-toolkit symlink so that we maintain backward compatibility.

Also on:

> I've updated the user to 0:0 below.

I had to set the user before we create the /bin/sh symlink since the default user doesn't have permissions to write to /bin.
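That ordering looks roughly like this (a minimal sketch combining the suggestions above; the actual stage in the PR may differ):

```dockerfile
FROM nvcr.io/nvidia/distroless/go:v3.1.9-dev AS application
SHELL ["/busybox/sh", "-c"]
# Switch to root first: the distroless default user (uid 1000) cannot
# write to /bin, so the symlink must be created as USER 0:0.
USER 0:0
RUN ln -s /busybox/sh /bin/sh
```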
I have updated the image to include a /work/nvidia-toolkit -> /work/nvidia-ctk-installer symlink. This preserves compatibility with GPU Operator versions that still invoke nvidia-toolkit.
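In Dockerfile terms the compatibility shim is just (a sketch matching the paths described above, assuming the binaries already live in /work):

```dockerfile
# Keep the old entrypoint name resolvable for GPU Operator versions
# that still exec /work/nvidia-toolkit.
RUN ln -s /work/nvidia-ctk-installer /work/nvidia-toolkit
```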
Thanks!
This change removes the NGC-DL-CONTAINER-LICENSE (since this is not available in the distroless images) and includes the repo's Apache LICENSE file in the image.
Signed-off-by: Evan Lezar <elezar@nvidia.com>

This change ensures that a symlink from /work/nvidia-toolkit to /work/nvidia-ctk-installer exists, to support GPU Operator versions that override the entrypoint and assume nvidia-toolkit as the original entrypoint.
Signed-off-by: Evan Lezar <elezar@nvidia.com>
This change switches to the nvcr.io/nvidia/distroless/go:v3.1.9-dev distroless go image for both the application image and the packaging image.
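Put together, the resulting layout looks roughly like this (a condensed sketch; the build and packaging stage names and the elided steps are assumptions, not the literal Dockerfile from this PR):

```dockerfile
# Build against the CUDA base image.
FROM nvcr.io/nvidia/cuda:12.9.0-base-ubi9 AS build
# ... build nvidia-ctk-installer and the toolkit packages ...

# Packaging stage: distroless image carrying the built packages.
FROM nvcr.io/nvidia/distroless/go:v3.1.9-dev AS packaging
# ... COPY the packages from the build stage ...

# Application stage: the GPU Operator operand.
FROM nvcr.io/nvidia/distroless/go:v3.1.9-dev AS application
SHELL ["/busybox/sh", "-c"]
USER 0:0
RUN ln -s /busybox/sh /bin/sh
# ... COPY nvidia-ctk-installer into /work ...
RUN ln -s /work/nvidia-ctk-installer /work/nvidia-toolkit
# Ship the repo's Apache LICENSE in place of NGC-DL-CONTAINER-LICENSE.
COPY LICENSE /licenses/
```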