
Multi stage build leaves "<none>" images behind #34151


Closed
lukastosic opened this issue Jul 18, 2017 · 41 comments
@lukastosic

Description

When performing multi-stage builds, the final image is created properly, but the intermediate images are also left behind as untagged images.

We have a project with a Java/Maven module (image) and a standard PHP+JS website as a separate image. Both of these leave intermediate images behind when built with a multi-stage build.

Reproduce with Java/Maven

Take a Java/Maven application and follow the same steps as described in the multi-stage build blog post:

This is the Dockerfile - one build stage and one final image:

FROM maven:latest AS buildstep
WORKDIR /usr/src/rcla-backend
COPY pom.xml .
RUN mvn -B -f pom.xml dependency:resolve
COPY . .
RUN mvn -B package -DskipTests

FROM java:8-jre-alpine
WORKDIR /rcla-backend
COPY --from=buildstep /usr/src/rcla-backend/target/*.jar rcla-backend.jar
ENTRYPOINT ["java", "-jar", "/rcla-backend/rcla-backend.jar"]
CMD ["--spring.profiles.active=dev"]

Execute command: docker build -t testjava .

Result from Java/Maven

After the build is complete you will see the final image testjava, but also a dangling <none> image:

(screenshot: docker images output showing the dangling <none> image)

Reproduce with a simple website image

In the example below, we have the source code of a website where:

  • PHP files are in the root
  • JS files are in the subfolder ./js
  • PNG files are in the subfolder ./img

The Dockerfile is just a "mockup" of a multi-stage build -> it has three "useless" build stages and a final stage that is actually meaningful

Dockerfile

FROM eboraas/apache-php AS buildstep1
RUN apt-get update && apt-get -y install php5-curl
ADD ./js/*.js /var/www/html/

FROM eboraas/apache-php AS buildstep2
RUN apt-get update && apt-get -y install php5-curl
ADD *.php /var/www/html/

FROM eboraas/apache-php AS buildstep3
RUN apt-get update && apt-get -y install php5-curl
ADD ./img/*.png /var/www/html/

FROM eboraas/apache-php
RUN apt-get update && apt-get -y install php5-curl
ADD * /var/www/html/

Result of the website image

The result is one final image and three dangling <none> images:

(screenshot: docker images output showing the three dangling <none> images)

Expected result:

Only the final image is created

Additional information you deem important (e.g. issue happens only occasionally):

After docker system prune those images are gone (as expected)

Output of docker version:

Client:
 Version:      17.06.0-ce
 API version:  1.30
 Go version:   go1.8.3
 Git commit:   02c1d87
 Built:        Fri Jun 23 21:20:36 2017
 OS/Arch:      linux/amd64

Server:
 Version:      17.06.0-ce
 API version:  1.30 (minimum version 1.12)
 Go version:   go1.8.3
 Git commit:   02c1d87
 Built:        Fri Jun 23 21:21:56 2017
 OS/Arch:      linux/amd64
 Experimental: false

Output of docker info:

Containers: 0
 Running: 0
 Paused: 0
 Stopped: 0
Images: 15
Server Version: 17.06.0-ce
Storage Driver: overlay
 Backing Filesystem: xfs
 Supports d_type: false
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
 Volume: local
 Network: bridge host macvlan null overlay
 Log: awslogs fluentd gcplogs gelf journald json-file logentries splunk syslog
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: cfb82a876ecc11b5ca0977d1733adbe58599088a
runc version: 2d41c047c83e09a6d61d464906feb2a2f3c52aa4
init version: 949e6fa
Security Options:
 seccomp
  Profile: default
Kernel Version: 3.10.0-514.26.1.el7.x86_64
Operating System: CentOS Linux 7 (Core)
OSType: linux
Architecture: x86_64
CPUs: 2
Total Memory: 3.702GiB
Name: NLLD1-Luka.infodation.local
ID: LFGK:MH6I:6KWR:I3CF:XRIA:K6IC:3ANM:T6G7:EUJM:6S5T:XR2L:B3YA
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://index.docker.io/v1/
Experimental: false
Insecure Registries:
 10.4.1.218:5000
 10.4.1.188:8082
 10.4.1.188:8083
 127.0.0.0/8
Live Restore Enabled: false

WARNING: overlay: the backing xfs filesystem is formatted without d_type support, which leads to incorrect behavior.
         Reformat the filesystem with ftype=1 to enable d_type support.
         Running without d_type support will not be supported in future releases.

Additional environment details (AWS, VirtualBox, physical, etc.):

VMWare virtual machine (internal environment) running CentOS 7

@thaJeztah
Member

This is not a bug; when using multi-stage builds, each stage produces an image. That image is stored in the local image cache, and will be used on subsequent builds (as part of the caching mechanism). You can run each build-stage (and/or tag the stage, if desired).

In addition, if you want to tag an intermediate stage as final image, set the --target option when running docker build (see Specifying target build stage (–target))
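
For example (a quick sketch; the image names are just placeholders, the buildstep stage name comes from the Dockerfile above):

# build and tag only the buildstep stage
docker build --target buildstep -t myapp:build .
# build the whole Dockerfile and tag the final image
docker build -t myapp:latest .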

If these intermediate images would be purged/pruned automatically, the build cache would be gone with each build, therefore forcing you to rebuild the entire image each time.

I'm closing this issue because this is not a bug, but feel free to continue the conversation or if more information is needed.

@wrfly
Contributor
wrfly commented Aug 9, 2017

Do you think it would be a good idea to give us a switch so that we can decide whether to keep these intermediate images?
On some builders that only run multi-stage builds, these intermediate images are never run, but they have to be discarded for security reasons.

@thaJeztah

@lukastosic
Author

@thaJeztah
Thanks for the reply ... as @wrfly says, that would be an interesting feature

@davivcgarcia

@lukastosic @wrfly
I agree that this behavior should be controlled by the user. For now, I'm using docker image prune -f after my docker build -t app . command to clean up those intermediate images.

@albertorm95

So there is no way to use prune without deleting those intermediate images after a docker build of a multi-stage Dockerfile?

@thaJeztah
Member

@albertorm95 can you elaborate? Do you mean: run docker image prune afterwards, and have it remove all untagged, unused images unless those were the result of a multi-stage build?

@albertorm95
albertorm95 commented May 11, 2018

@thaJeztah Yes, that prune:

remove all untagged, unused images unless those were the result of a multi-stage build

Because I have a script that runs every hour or so, and that script does docker image prune, but this removes the images that were needed to create the multi-stage build, leaving only the final image that has a name:tag

@satchm0h
satchm0h commented May 22, 2018

I'd love to see the switch added to auto-clean the intermediate build artifacts. We've got a bunch of automated builds that occur (always from source for unimportant reasons). We use multi-stage to build from source and then create a much smaller runtime image. The build image is huge in comparison to the runtime. So my top level build scripts always run this command after my docker build commands:

docker rmi $(docker images -q -f dangling=true)

Cheers!

@lukastosic
Author

@satchm0h @thaJeztah @albertorm95

After working on several Node (Angular) projects, I fully appreciate keeping the intermediate images :)

For example, we do a multi-stage build for Angular like this:

# Just do NPM INSTALL
FROM node:6-alpine AS dependencies
WORKDIR /code
COPY package.json .
RUN npm install 
 
# Compile Angular
FROM dependencies AS build
COPY . .
RUN yarn webpack:prod
 
# Prepare final image
FROM nginx:alpine AS release
COPY --from=build /code/target/www /usr/share/nginx/html
COPY ./nginx/site.conf /etc/nginx/conf.d/default.conf

The resulting final nginx image with the website is ~38MB, but the intermediate image is ~800MB

Now the intermediate image "saves the day" because usually for long stretches (2-3 sprints) we don't update package.json, so all node modules stay the same. Here the intermediate image kicks in and npm install isn't repeated every time. (Even when we use Nexus as a repository manager, it still takes a long time to finish.)

This intermediate image is rebuilt only when we change package.json, and as said, that usually happens after 2-3 sprints; the many builds in between finish quicker.

So, very useful indeed, but very annoying because we can't control when we want it and when we don't. Also, after several sprints we will have several "old" intermediate images that will never be used again, and then we either remove them manually or run docker system prune -f (or docker image prune; we are fine with system prune because we don't use those Docker hosts for anything except builds).

@eugene-bright

@thaJeztah, thanks for hinting at the --target + --tag combination.
It's a useful workaround, but it makes the user run docker build several times with different values of those parameters to give a name to every stage image.
Subsequent runs are fast due to layer caching, but this is still a suboptimal scenario.
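
For example, with the Angular Dockerfile above it ends up looking roughly like this (the tags are arbitrary):

docker build --target dependencies --tag myapp:dependencies .
docker build --target build --tag myapp:build .
docker build --tag myapp:latest .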

I'm thinking about a new docker build flag that enables automatic image naming based on the aliases provided with the FROM .. AS .. instruction.
Would you mind?

@hoto
hoto commented Oct 17, 2018

Hi.

The problem with having intermediate build images as dangling images is that whenever someone runs e.g. docker system prune to clean up dangling images (our CI runs that command every hour), the cached layers are gone.
And because the cached layers are gone, the next build takes more time.
Also, seeing multiple dangling images appear seemingly out of nowhere is very strange; there is no hint as to where those images come from.

It would be nice to be able to tell docker to either REMOVE the intermediate images or TAG them.
As @eugene-bright suggested, the tag names for the intermediate images could come from the aliases used in the Dockerfile.
IMHO the default behaviour could be to tag all intermediate images.
Additionally, a flag could be passed to docker to remove intermediate images automatically after the build is done OR to leave them as dangling images.
Having dangling images can still be useful (for saving build time) in specific cases, e.g. on CI servers where someone wants to have caching for a limited time until a cleaning job comes along and removes dangling images.

Let's say we have the following Dockerfile:

FROM docker.io/node:8.12.0 AS build
WORKDIR /usr/local/src
COPY package.json .
COPY yarn.lock .
RUN yarn install --production --frozen-lockfile
COPY . .

FROM docker.io/node:8.12.0-alpine
WORKDIR /usr/local/src
COPY --from=build /usr/local/src .
CMD node src/app.js

We could use different flags provided to docker with the following results:

  1. By default: tag the intermediate image with an alias from Dockerfile (in this case the build alias):
$ docker build --tag user/app:1.0.0 .
Successfully built d069a7d27039 (intermediate)
Successfully tagged user/app:build
Successfully built 5e5c3d0246ae
Successfully tagged user/app:1.0.0

$ docker images
REPOSITORY  TAG    IMAGE ID     CREATED        SIZE
user/app    1.0.0  5e5c3d0246ae 8 seconds ago  70MB
user/app    build  d069a7d27039 10 seconds ago 725MB
  2. Provide a --intermediate-images=dangling flag to leave intermediate images as dangling images:
$ docker build --tag user/app:1.0.0 --intermediate-images=dangling .
Successfully built d069a7d27039 (intermediate)
Successfully built 5e5c3d0246ae
Successfully tagged user/app:1.0.0
$ docker images
REPOSITORY  TAG    IMAGE ID     CREATED        SIZE
user/app    1.0.0  5e5c3d0246ae 8 seconds ago  70MB
<none>     <none>  d069a7d27039 10 seconds ago 725MB
  3. Provide a --intermediate-images=remove flag to remove intermediate images:
$ docker build --tag user/app:1.0.0 --intermediate-images=remove .
$ docker images
REPOSITORY  TAG    IMAGE ID     CREATED        SIZE
user/app    1.0.0  5e5c3d0246ae 8 seconds ago  70MB

@thaJeztah
Member

When using the new BuildKit-based builder (currently opt-in, but it can be enabled either through the daemon configuration or by using the DOCKER_BUILDKIT=1 environment variable), the builder no longer uses images for the build-cache, which means that intermediate stages no longer show up as untagged images.

BuildKit also features (configurable) garbage collection for the build-cache, which allows for much more flexible handling of the cache (see #37846).

I don't think intermediate stages should automatically be tagged, as in many cases those intermediate stages are not used as actual images. However, having a way to build multiple targets in a single build may be something to think about. Something like

docker build --target stage1=image:tag1, stage3=image:tag3 ...

but perhaps alternative syntaxes could be thought of, also in light of BuildKit supporting different output formats than just docker images

@dluc
dluc commented Apr 2, 2019

The following workaround can help remove just the temporary stages:

  1. Add something like LABEL autodelete="true" to stages to delete
  2. After the build execute docker rmi $(docker images -q -f "dangling=true" -f "label=autodelete=true")

e.g.

FROM mcr.microsoft.com/dotnet/core/sdk:2.2 AS build
LABEL autodelete="true"
...

FROM mcr.microsoft.com/dotnet/core/aspnet:2.2-alpine3.9 AS runtime
LABEL description="My app"
COPY --from=build ...
...

and

list=$(docker images -q -f "dangling=true" -f "label=autodelete=true")
if [ -n "$list" ]; then
     docker rmi $list
fi

@ddsharpe

@dluc Thanks for the workaround. It is very disappointing that there is not something built into the build command. Multi-stage builds are great and help with quite a few use cases. One of those use cases is security, where files need to be omitted from the final image. Another is the Angular case discussed above, where intermediates need to stay around. Both of these cases seem quite valid, and I happen to use both. I don't understand the reluctance to add a CLI option that lets the user control the lifecycle of the intermediate images. From Stack Overflow and several other threads, it's obvious that people are having to create workarounds like the one @dluc posted. Why not solve the problem properly?

@arvenil
arvenil commented Jul 23, 2020

What annoys me is docker images not showing where those images come from. If I use something like AS builder I would expect them to be tagged with builder#1 or something similar. Also, if I make a change to a layer, the old image stays and a new one is generated, so in the end I have tons of dangling images and I don't even know which project they come from.

@thaJeztah
Member

This won't be fixed for the classic builder; I'd recommend trying the next generation builder (BuildKit), which is still opt-in (tracking issue for making it the default is #40379)

The easiest way to enable buildkit is to set the DOCKER_BUILDKIT=1 environment variable in the shell where you run your build #34151 (comment)

BuildKit uses a build-cache that's separate from the image store (and can be cleaned up separately using docker builder prune)

@liuzeng01
liuzeng01 commented Aug 3, 2020

This is not a bug; when using multi-stage builds, each stage produces an image. That image is stored in the local image cache, and will be used on subsequent builds (as part of the caching mechanism). You can run each build-stage (and/or tag the stage, if desired).

In addition, if you want to tag an intermediate stage as final image, set the --target option when running docker build (see Specifying target build stage (–target))

If these intermediate images would be purged/pruned automatically, the build cache would be gone with each build, therefore forcing you to rebuild the entire image each time.

I'm closing this issue because this is not a bug, but feel free to continue the conversation or if more information is needed.

However, there is a new question. If I change the process of building the intermediate images, or some dependent files have changed, many new "<none>" images will be generated. These redundant <none> images need to be cleaned up.
Maybe there could be a command flag to decide whether to delete the intermediate images?

@thaJeztah
Member

docker system prune or docker image prune will remove those images. Or enable buildkit #34151 (comment), then those images are not created (at least not as part of docker build)
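
Roughly (a sketch; the image name is a placeholder):

# with BuildKit the intermediate stages live in the build-cache, not as <none> images
DOCKER_BUILDKIT=1 docker build -t myapp .
# remove dangling images left over from earlier classic-builder runs
docker image prune -f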

@HeCorr
HeCorr commented Nov 30, 2020

the problem I'm having is that every time I run docker build after changing a source file, it generates a <none> image, but there's no way that I'm aware of to delete only the old images (while keeping the latest one for caching purposes).

am I missing something?

edit: what I want is to be able to delete all old and unused intermediate images from previous builds, while keeping the most recent one for caching.

@thaJeztah
Member

@HeCorr no, there's no "one step" solution to only keeping the "last" intermediate steps when using the classic builder

if possible, I would recommend building with BuildKit enabled (DOCKER_BUILDKIT=1), which uses a separate store for the build-cache, and would allow you to clean up the build-cache but preserve (e.g.) the cache for the last XX hours (docker builder prune --filter until=24h); I think it also defaults to preserving "active" build-cache (so only removing build-cache for older builds)
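
Something like this (a sketch; the 24h window is just an example):

export DOCKER_BUILDKIT=1
docker build -t myapp .
# remove build-cache entries that haven't been used in the last 24 hours
docker builder prune --force --filter until=24h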

@HeCorr
HeCorr commented Nov 30, 2020

@thaJeztah okay, I'll give that a try. thanks ;)

I actually came up with a dirty trick that works sometimes, which is putting LABEL package=pkgname in both FROM stages, running docker build (...), then docker rmi $(docker images --filter label=package=pkgname -q | sed 1,2d) (which removes all images except the two newest ones).

@HeCorr
HeCorr commented Dec 1, 2020

@thaJeztah DOCKER_BUILDKIT=1 seems to solve my problem. thank you very much :)

@HeCorr
HeCorr commented Dec 1, 2020

Actually... upon further testing I concluded that BuildKit presents the same space inefficiency...

here I used ncdu to analyze usage on /var/lib/docker:

all pruned ----- 2.9 GiB
first build ---- 8.0 GiB
second build --- 9.7 GiB
third build ---- 11.4 GiB
image prune ---- 11.4 GiB

@thaJeztah
Member

@HeCorr did you run docker builder prune to cleanup the build-cache? Note that automatic garbage collection is also possible (but documentation is still missing currently; see docker/cli#2325)
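
If I remember correctly, the daemon configuration looks roughly like this (an untested sketch; see docker/cli#2325 for the exact keys):

# note: this overwrites an existing /etc/docker/daemon.json; merge the keys instead if you already have one
sudo tee /etc/docker/daemon.json >/dev/null <<'EOF'
{
  "builder": {
    "gc": {
      "enabled": true,
      "defaultKeepStorage": "10GB"
    }
  }
}
EOF
sudo systemctl restart docker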

@NinoFoxx
NinoFoxx commented Dec 1, 2020

@HeCorr did you run docker builder prune to cleanup the build-cache?

I did, but it, well... cleans the cache, which is not what I want.

I want it to only keep the necessary stuff, so there is still a cache, but it doesn't grow by ~2GB on each build.

@thaJeztah
Member
thaJeztah commented Dec 1, 2020 via email

@HeCorr
HeCorr commented Dec 2, 2020

oops.. wrong account. anyway, running docker builder prune --keep-storage 8g returns Total reclaimed space: 0B and ncdu still shows 11.4 GiB...

I'm honestly starting to reconsider using Docker on my projects...

@tonistiigi
Member

@HeCorr What's the output of docker system df, and where is the storage being used if you go deeper into the /var/lib/docker directory?

@HeCorr
HeCorr commented Dec 4, 2020

@tonistiigi I updated docker-ce and docker-ce-cli from version 5:19.03.13~3-0~ubuntu-focal to 5:19.03.14~3-0~ubuntu-focal before running the commands below, but I don't think anything changed.

First build

docker system df:

TYPE                TOTAL               ACTIVE              SIZE                RECLAIMABLE
Images              1                   0                   7.948MB             7.948MB (100%)
Containers          0                   0                   0B                  0B
Local Volumes       0                   0                   0B                  0B
Build Cache         19                  0                   860.7MB             860.7MB

ncdu /mnt/500/docker:

(ncdu screenshot)

Second build

docker system df:

TYPE                TOTAL               ACTIVE              SIZE                RECLAIMABLE
Images              2                   0                   15.89MB             15.89MB (100%)
Containers          0                   0                   0B                  0B
Local Volumes       0                   0                   0B                  0B
Build Cache         22                  0                   881.9MB             881.9MB

ncdu /mnt/500/docker:

(ncdu screenshot)

Third build

docker system df:

TYPE                TOTAL               ACTIVE              SIZE                RECLAIMABLE
Images              3                   0                   23.84MB             23.84MB (100%)
Containers          0                   0                   0B                  0B
Local Volumes       0                   0                   0B                  0B
Build Cache         25                  0                   903MB               903MB

ncdu /mnt/500/docker:

(ncdu screenshot)

@thaJeztah
Member

I think I see the confusion here. So there are multiple reasons why a <none>:<none> image can exist; a <none>:<none> image is an image that was not given a name (nor a tag).

You initially mentioned that each docker build caused multiple of those images to be created, which is correct: when using the classic (non-BuildKit) builder, an image is created for each step in the Dockerfile. For example;

FROM alpine
RUN apk add --no-cache make git
WORKDIR /src
COPY ./some/files/. /src
RUN cat one two three > four

If you build the above Dockerfile with the classic builder, then each step is committed to a new image, without giving it a name

DOCKER_BUILDKIT=0 docker build -t myname/myimage:latest .
Sending build context to Docker daemon  4.608kB
Step 1/5 : FROM alpine
 ---> d6e46aa2470d
Step 2/5 : RUN apk add --no-cache make git
 ---> Running in d2e8219c765e
fetch http://dl-cdn.alpinelinux.org/alpine/v3.12/main/x86_64/APKINDEX.tar.gz
fetch http://dl-cdn.alpinelinux.org/alpine/v3.12/community/x86_64/APKINDEX.tar.gz
(1/7) Installing ca-certificates (20191127-r4)
(2/7) Installing nghttp2-libs (1.41.0-r0)
(3/7) Installing libcurl (7.69.1-r1)
(4/7) Installing expat (2.2.9-r1)
(5/7) Installing pcre2 (10.35-r0)
(6/7) Installing git (2.26.2-r0)
(7/7) Installing make (4.3-r0)
Executing busybox-1.31.1-r19.trigger
Executing ca-certificates-20191127-r4.trigger
OK: 22 MiB in 21 packages
Removing intermediate container d2e8219c765e
 ---> 2796b4bea033
Step 3/5 : WORKDIR /src
 ---> Running in 0ed1061614b9
Removing intermediate container 0ed1061614b9
 ---> 2338ec598826
Step 4/5 : COPY ./some/files/. /src
 ---> a156df6331da
Step 5/5 : RUN cat one two three > four
 ---> Running in 57bb16203d19
Removing intermediate container 57bb16203d19
 ---> ebf3ee95c2b6
Successfully built ebf3ee95c2b6
Successfully tagged myname/myimage:latest

In the above output, 5 images are created;

 ---> d6e46aa2470d
 ---> 2796b4bea033
 ---> 2338ec598826
 ---> a156df6331da
 ---> ebf3ee95c2b6

The first one is the base image (alpine:latest), which in my case was already present. The four that follow are the intermediate steps; those are committed to an image but are not "tagged", so they will show up as <none>:<none>, and the last image is tagged as myname/myimage:latest once the build completes;

docker image ls -a

REPOSITORY         TAG      IMAGE ID       CREATED         SIZE
myname/myimage     latest   ebf3ee95c2b6   5 minutes ago   22.3MB
<none>             <none>   a156df6331da   5 minutes ago   22.3MB
<none>             <none>   2338ec598826   5 minutes ago   22.3MB
<none>             <none>   2796b4bea033   5 minutes ago   22.3MB
alpine             latest   d6e46aa2470d   6 weeks ago     5.57MB

For the classic builder, those intermediate images double as "build-cache"; when repeating the build, the builder checks if the cache can be used for each step, and if so, skips the build for that step and uses the existing image instead. When using BuildKit, on the other hand, those intermediate images are not created; BuildKit keeps a build-cache separate from the image store.

Here's the same when building the same Dockerfile with BuildKit (after first removing myname/myimage:latest, and cleaning up the intermediate images with docker image prune):

docker image ls -a

REPOSITORY         TAG      IMAGE ID       CREATED              SIZE
myname/myimage     latest   5c80efd49bd6   About a minute ago   22.3MB
alpine             latest   d6e46aa2470d   6 weeks ago          5.57MB

As you can see, the <none>:<none> images are not created, and only the alpine:latest and the final image are present.

However, there is a second reason the <none>:<none> images can show up here; if you re-build the image, and the image is modified, then a new "final" image will be created and tagged (myname/myimage:latest). Tagging an image is similar to git tag; a tag points to a reference (commit) in the repository; in docker's case, it points to an image. However, tagging a different image does not make the old image "go away"; the image still exists, but if no tags reference it, it shows up as an untagged image (<none>:<none>).

If I change the Dockerfile, and add a new RUN line;

RUN echo "something changed" > foobar

Then rebuild the image:

DOCKER_BUILDKIT=1 docker build -t myname/myimage:latest .
[+] Building 1.5s (11/11) FINISHED
 => [internal] load .dockerignore                                                             0.2s
 => => transferring context: 2B                                                               0.0s
 => [internal] load build definition from Dockerfile                                          0.3s
 => => transferring dockerfile: 194B                                                          0.0s
 => [internal] load metadata for docker.io/library/alpine:latest                              0.0s
 => [internal] load build context                                                             0.4s
 => => transferring context: 151B                                                             0.0s
 => [1/6] FROM docker.io/library/alpine                                                       0.0s
 => CACHED [2/6] RUN apk add --no-cache make git                                              0.0s
 => CACHED [3/6] WORKDIR /src                                                                 0.0s
 => CACHED [4/6] COPY ./some/files/. /src                                                     0.0s
 => CACHED [5/6] RUN cat one two three > four                                                 0.0s
 => [6/6] RUN echo "something changed" > foobar                                               0.5s
 => exporting to image                                                                        0.3s
 => => exporting layers                                                                       0.1s
 => => writing image sha256:116eecf6c6d69a6e7763e7ec85e13ec42cf04064b39e17b50924132c15838c64  0.0s
 => => naming to docker.io/myname/myimage:latest                                              0.0s

You'll see that all steps (except for the last one) are cached. The final image produced has digest sha256:116eecf6c6d69a6e7763e7ec85e13ec42cf04064b39e17b50924132c15838c64, and is tagged as docker.io/myname/myimage:latest

Now, looking at the images present (including "untagged" images);

docker image ls -a

REPOSITORY         TAG      IMAGE ID       CREATED              SIZE
myname/myimage     latest   116eecf6c6d6   2 minutes ago        22.3MB
<none>             <none>   5c80efd49bd6   4 minutes ago        22.3MB
alpine             latest   d6e46aa2470d   6 weeks ago          5.57MB

You can see that myname/myimage:latest references 116eecf6c6d6 (as mentioned above). The second image in the list is untagged, but if you look at the image-ID (5c80efd49bd6) that's the image that was previously tagged as myname/myimage:latest.

BuildKit doesn't need that image for its cache, so you can run docker image prune to remove the untagged image between builds;

docker image prune

WARNING! This will remove all dangling images.
Are you sure you want to continue? [y/N] y
Deleted Images:
deleted: sha256:5c80efd49bd638c2ed78a9799a39e80f1a879b0edaf9b0b56fa18de19f96a113

Total reclaimed space: 0B

(in this case, no space was reclaimed, because the new image added a new RUN line to the Dockerfile, producing an extra layer; all previous layers were shared with the old image, so the image was removed, but none of its layers were).

Note that if you're cleaning up between builds, remove untagged images (docker image prune) before removing unused build-cache (docker builder prune). docker builder prune removes "dangling build cache", which means cache not associated with images that are still present.
For that reason, it's more convenient to use docker system prune, which will clean up resources in the correct order (containers -> images, networks -> build-cache)
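
In other words, for the between-builds cleanup:

docker image prune     # remove untagged images first
docker builder prune   # then remove dangling build-cache
# or, to let docker handle the ordering:
docker system prune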

@HeCorr
HeCorr commented Dec 4, 2020

docker system prune clears the entire build-cache. is that expected behavior?

I ran it right after docker build --rm -t image .... what makes a build-cache 'dangling'?

Same goes for docker builder prune... am I missing something? All I want is to clear only the old cache, so I don't have 11GB+ of used space after only 3 builds, but still have some cache so it doesn't take 3 minutes to build a simple Go package.

@thaJeztah
Member

docker system prune clears the entire build-cache. is that expected behavior?

I ran it right after docker build --rm -t image .... what makes a build-cache 'dangling'?

@tonistiigi @tiborvass ^^ you may be more familiar with how it checks for "dangling"

@HeCorr
HeCorr commented Dec 4, 2020

Would it be possible to reuse the same intermediate image for all builds? That would also serve as a cache, since everything is already downloaded, and it would also speed up the go build command since it wouldn't have to rebuild all the dependencies every time.

I noticed that docker build accepts the --cache-from arg. Could that help me? If so, how would I use it with BuildKit?

@thaJeztah
Member

It should already use the cache, unless your Dockerfile "busts" an early stage, which causes all following steps to miss the cache. This may happen if (e.g.) you use COPY . ., which copies all files into the container, including (possibly) files not needed for the build (temp files, the .git directory, etc.). The order of steps can matter to optimise your cache and to allow layers to be reused. https://www.youtube.com/watch?v=JofsaZ3H1qM

@tonistiigi
Member

what makes a build-cache 'dangling'?

"dangling" is a term used for an unnamed image. Usually, an image that has lost a name because another image with same name has been created. Builder cache is not dangling.

docker system prune clears the entire build-cache. is that expected behavior?

Yes. If you want to clean up dangling images you run docker image prune. For build cache docker builder prune. If you don't want to clean everything but only old and least used you run something like docker builder prune --keep-storage 2GB to keep 2GB of cache.

@HeCorr
HeCorr commented Dec 4, 2020

Builder cache is not dangling.

then why does docker builder prune say WARNING! This will remove all dangling build cache.?

Ok, this is kinda confusing, but I found the magic number that seems to work for me...
The first build leaves 860.7MB in the build cache, so I used docker builder prune --keep-storage 860MB, but it didn't clear anything. I tried 850MB, still nothing, but 840MB seems to do the trick... Running it after the second build takes me back to 860.7MB (from 881.9MB) of build cache, and running the build again (after changing a source file, as always) still uses everything from the cache.

I guess this answers my questions, although ncdu still shows docker using 10GB+ across multiple BTRFS subvolumes... why? I don't know. Should I create an issue named "docker build creates excessive undisclosed BTRFS subvols"? docker system df never showed more than 1GB of build-cache...

Either way, I'll stop bothering you guys with this now.

@tonistiigi
Member

I guess this answers my questions, although ncdu still shows docker using 10GB+ on multiple BTRFS subvolumes... why?

Most likely just because ncdu can't understand that different btrfs subvolumes share the same file blocks. df or btrfs filesystem df show the used storage. ncdu just does statistics based on file metadata it reads afaik.

@HeCorr
HeCorr commented Dec 4, 2020

heh. I had a feeling that was the case but had forgotten the command for checking it. thanks and sorry.

@thaJeztah
Member

WARNING! This will remove all dangling build cache.?

^ @tonistiigi @tiborvass we should fix that message indeed if it does not check for cache related to "current" images (I expect that message was just copied from the other prune commands)

@venomdev
venomdev commented Apr 12, 2021

The solution I'm using to remove the untagged images is to save the final image, remove the image (which removes all the associated untagged images), and then load the final image back in. This leaves only the single tagged image.
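
Roughly, as a sketch of that flow (image and file names are placeholders):

docker build -t myapp:latest .
docker save -o myapp.tar myapp:latest
docker rmi myapp:latest    # per the above, this also drops the associated untagged images
docker load -i myapp.tar   # re-import just the final tagged image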

@x-yuri
x-yuri commented Oct 8, 2024

This is not a bug; when using multi-stage builds, each stage produces an image.

At first that sounded wrong, because with the classic builder each step produces an image. But I see now why you said that: docker images by default shows only the target and stage images, and docker images -a shows those plus the intermediate images.

thaJeztah added the area/builder/classic-builder label Oct 8, 2024