Optimizing Image Sizes
There are several gotchas that people encounter when they begin to experiment
with container images that lead to overly large images. The first thing
to remember is that files that are removed by subsequent layers in the
system are actually still present in the images; they’re just inaccessible.
Consider the following situation:
.
└── layer A: contains a large file named ‘BigFile’
    └── layer B: removes ‘BigFile’
        └── layer C: builds on B by adding a static binary
This hierarchy shows how layering builds a new image on top of an existing one: a large file is added in the first layer, removed in the next, and a static binary is added last.
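You might think that BigFile is no longer present in this image. After
all, when you run the image, it is no longer accessible. But in fact it is
still present in layer A, which means that whenever you push or pull the
image, BigFile is still transmitted through the network, even if you can
no longer access it.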
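The effect can be sketched with a small model. This is an illustration only, not how any container runtime is implemented: the Layer class, the helper functions, the file name server, and the sizes are all hypothetical. It treats each layer as a set of added files plus a set of paths it hides, and compares what a running container sees with what is transferred.

```python
from dataclasses import dataclass, field

MB = 1024 * 1024


@dataclass
class Layer:
    """Hypothetical model of an image layer (not a real runtime structure)."""
    name: str
    added: dict = field(default_factory=dict)   # path -> size in bytes
    removed: set = field(default_factory=set)   # paths this layer hides


def visible_files(layers):
    """Files a running container can see after stacking all layers."""
    files = {}
    for layer in layers:
        for path in layer.removed:
            files.pop(path, None)   # hidden from view, but the lower layer is untouched
        files.update(layer.added)
    return files


def transfer_size(layers):
    """Bytes sent on push or pull: every layer's added content is shipped."""
    return sum(sum(layer.added.values()) for layer in layers)


# Assumed sizes, chosen only for illustration.
image = [
    Layer("A", added={"BigFile": 500 * MB}),
    Layer("B", removed={"BigFile"}),
    Layer("C", added={"server": 10 * MB}),
]

print(sorted(visible_files(image)))   # ['server']
print(transfer_size(image) // MB)     # 510
```

In this model the container sees only the 10 MB binary, yet 510 MB still crosses the network, because layer A is shipped unchanged.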
Another pitfall that people fall into revolves around image caching and
building. Remember that each layer is an independent delta from the
layer below it. Every time you change a layer, it changes every layer
that comes after it. Changing the preceding layers means that they need
to be rebuilt, repushed, and repulled to deploy your image to
development.
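One way to picture this dependency is to give each layer an identifier derived from its own content and from the identifier of the layer below it. The sketch below is a simplified model under that assumption, not the actual algorithm used by any build tool; chain_ids and the layer descriptions are hypothetical.

```python
import hashlib


def chain_ids(layer_contents):
    """Derive one ID per layer from its content plus its parent's ID."""
    ids = []
    parent = ""
    for content in layer_contents:
        # The parent's ID is part of the hash input, so a change in any
        # layer changes the ID of every layer stacked on top of it.
        parent = hashlib.sha256((parent + "\n" + content).encode()).hexdigest()
        ids.append(parent)
    return ids


before = chain_ids(["base OS", "server.js v1", "install node"])
after = chain_ids(["base OS", "server.js v2", "install node"])

# True where a layer can be reused, False where it must be rebuilt.
unchanged = [b == a for b, a in zip(before, after)]
print(unchanged)   # [True, False, False]
```

Only the middle layer's content changed, but the top layer gets a new identifier as well, so in this model both would have to be rebuilt, repushed, and repulled.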
To understand this more fully, consider two images:
.
└── layer A: contains a base OS
    └── layer B: adds source code server.js
        └── layer C: installs the ‘node’ package
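This hierarchy shows how an application is added to a container image: starting from a base OS image, the application source code and the packages it requires are added to produce a new application image.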
versus:
.
└── layer A: contains a base OS
    └── layer B: installs the ‘node’ package
        └── layer C: adds source code server.js
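The resulting image contains all three layers and can be run directly, because it includes the complete application environment and code. When a container starts, its Node.js process runs server.js automatically, because npm start is specified in the CMD instruction.
What is the difference?
The two images contain the same layers in a different order. The first adds the source code and then installs node, while the second installs node and then adds the source code. The order matters because a change to a lower layer invalidates the cached layers above it and slows down the build. The best practice is therefore to put layers that rarely change at the bottom and layers that change often at the top.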
It seems obvious that both of these images will behave identically, and
indeed the first time they are pulled they do. However, consider what
happens when server.js changes. In the second image, only the changed
server.js layer needs to be pulled or pushed, but in the first image, both
server.js and the layer providing the node package need to be pulled and
pushed, since the node layer is dependent on the server.js layer. In general,
you want to order your layers from least likely to change to most likely
to change in order to optimize the image size for pushing and pulling.
This is why, in Example 2-4, we copy the package*.json files and
install dependencies before copying the rest of the program files. A
developer is going to update and change the program files much more
often than the dependencies.
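To make the cost of layer ordering concrete, the following sketch compares the two layouts discussed above. It is a rough model with assumed layer sizes, and bytes_to_repush is a hypothetical helper: it assumes that the changed layer and every layer above it must be transferred again.

```python
MB = 1024 * 1024


def bytes_to_repush(layers, changed):
    """Sum the sizes of the changed layer and every layer above it.

    layers is an ordered list of (name, size) pairs, bottom layer first.
    """
    names = [name for name, _ in layers]
    first_changed = names.index(changed)
    return sum(size for _, size in layers[first_changed:])


# Assumed sizes, chosen only for illustration.
source_first = [("base OS", 80 * MB), ("server.js", 1 * MB), ("node package", 60 * MB)]
node_first = [("base OS", 80 * MB), ("node package", 60 * MB), ("server.js", 1 * MB)]

print(bytes_to_repush(source_first, "server.js") // MB)   # 61
print(bytes_to_repush(node_first, "server.js") // MB)     # 1
```

With these assumed sizes, editing server.js costs 61 MB per push when the source code sits below the node package, and 1 MB when it sits on top.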