Fasten Docker build

Gif for Fasten Docker Build

Context

Recently I started working on a microservices project. As a DevOps engineer, my responsibility was to ensure the smooth build and release of the project. One of the challenges I faced was that the image build process was painfully slow. Following the true Opstree spirit of continuous improvement, I started exploring how to fix this problem and eventually had decent success: I was able to reduce the Docker image build time from 4 minutes to 20 seconds. In this blog, I would like to showcase various ways in which image build time can be reduced drastically.

The complete code is available in this repository.

Problem statement

I’m using a Spring Boot HelloWorld project to walk you through the problem statement and then the various solutions that we will apply.

The root of the problem is that 80-90% of the image build time is spent downloading the dependencies defined in pom.xml. Because the downloaded dependencies live only inside that image build, every build starts the whole download from the beginning. In the Dockerfile below, any change in the source tree invalidates the cached layer of the ADD instruction, so the mvn step that follows reruns against an empty local repository.

FROM maven:3-jdk-8

LABEL maintainer="opensource@opstree.com"

WORKDIR /usr/src/app

ADD . /usr/src/app
RUN mvn clean package -Dmaven.test.skip=true

ENTRYPOINT ["java","-jar","/usr/src/app/target/helloworld-0.0.1-SNAPSHOT.jar"]

If we build a Docker image using the above Dockerfile, it takes close to 3 minutes.

$ make problem-build-package-with-time
real	3m3.356s
user	0m1.221s
sys	0m1.067s

Solution 1 | Avoid downloading dependencies

We know that if we can somehow skip downloading dependencies, our problem will be solved. The classic docker build has no option to mount the host system's local repository (~/.m2) while building the image; if it did, the problem would be resolved right there.

We solved this by moving artifact generation out of the image build process. The artifact is generated by a build container that has the host's local Maven repository mounted, so dependencies are downloaded only if they are not already present.

solution1-build:
	docker run -it -v ~/.m2/repository:/root/.m2/repository \
	 -w /usr/src/mymaven -v ${PWD}:/usr/src/mymaven --rm \
	 maven:3-jdk-8 mvn clean package -Dmaven.test.skip=true

Once the artifact is generated, the only thing left to do in your Dockerfile is to copy the generated artifact into your Docker image.

FROM maven:3-jdk-8

LABEL maintainer="opensource@opstree.com"

WORKDIR /usr/src/app

ADD target/helloworld-0.0.3-SNAPSHOT.jar /usr/src/app/app.jar

ENTRYPOINT ["java","-jar","/usr/src/app/app.jar"]

Now if we build our image with this solution, the image build time drops drastically to ~20 seconds.

solution1-build:
	docker run -it -v ~/.m2/repository:/root/.m2/repository \
	 -w /usr/src/mymaven -v ${PWD}:/usr/src/mymaven --rm \
	 maven:3-jdk-8 mvn clean package -Dmaven.test.skip=true

solution1-package:
	docker build -t opstree/fasten-build -f Dockerfile.solution1 .

solution1-build-package:
	make solution1-build
	make solution1-package

solution1-build-package-with-time:
	time make solution1-build-package >/dev/null 2>&1

$ make solution1-build-package-with-time
time make solution1-build-package >/dev/null 2>&1

real	0m15.212s
user	0m0.288s
sys	0m0.286s
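
As an aside on the wish above for mounting the local repository during the image build: with BuildKit enabled, a cache mount comes close to it. The following is a minimal sketch, not the setup used in this post. It assumes BuildKit is available (for example via DOCKER_BUILDKIT=1) and that Maven in the maven:3-jdk-8 image keeps its local repository under /root/.m2, as the volume mapping in solution1-build suggests. This variant was not timed here.

```dockerfile
# syntax=docker/dockerfile:1
# Sketch only: needs BuildKit; not one of the Dockerfiles measured in this post.
FROM maven:3-jdk-8

LABEL maintainer="opensource@opstree.com"

WORKDIR /usr/src/app

ADD . /usr/src/app

# The cache mount keeps /root/.m2 between builds on the same builder,
# so Maven downloads only the artifacts missing from the cache.
# The cached directory does not become part of the image layer.
RUN --mount=type=cache,target=/root/.m2 \
    mvn clean package -Dmaven.test.skip=true

ENTRYPOINT ["java","-jar","/usr/src/app/target/helloworld-0.0.3-SNAPSHOT.jar"]
```

It would be built like any other Dockerfile, e.g. DOCKER_BUILDKIT=1 docker build -t opstree/fasten-build -f Dockerfile.buildkit . where Dockerfile.buildkit is a hypothetical file name.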

Solution 2 | Leverage Docker layer caching

One of the very interesting concepts of Docker is layer caching: a layer is rebuilt only if its instruction, or the files that instruction depends on, have changed.

FROM maven:3-jdk-8

LABEL maintainer="opensource@opstree.com"

WORKDIR /usr/src/app

ADD pom.xml /usr/src/app
RUN mvn dependency:resolve -Dmaven.test.skip=true

ADD . /usr/src/app
RUN mvn clean install -Dmaven.test.skip=true

ENTRYPOINT ["java","-jar","/usr/src/app/target/helloworld-0.0.3-SNAPSHOT.jar"]

If you notice, lines 7 & 8 are the new additions compared with the original Dockerfile. When the image is built using this Dockerfile, the layers corresponding to lines 7 & 8 are rebuilt if and only if pom.xml changes; otherwise the previously built, cached layers are reused. Hence no time is wasted downloading the project's dependencies again.

solution2-build-package:
	time docker build -t opstree/fasten-build \
	 -f Dockerfile.solution2 .

solution2-build-package-with-time:
	time make solution2-build-package >/dev/null 2>&1

$ make solution2-build-package-with-time
time make solution2-build-package >/dev/null 2>&1

real	1m32.215s
user	0m0.788s
sys	0m0.787s
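
One caveat with Solution 2: dependency:resolve fetches the project's dependencies but not the Maven plugins the build uses, so the second mvn step may still download artifacts. A possible variant is the dependency plugin's go-offline goal, which resolves plugins as well. The following is only a sketch under the same project layout. It was not timed for this post, and some plugins fetch extra artifacts only when they actually run, so a few downloads may remain.

```dockerfile
# Sketch only: Solution 2 with dependency:go-offline instead of dependency:resolve.
FROM maven:3-jdk-8

LABEL maintainer="opensource@opstree.com"

WORKDIR /usr/src/app

# Only pom.xml is added first, so the next layer is reused until pom.xml changes.
ADD pom.xml /usr/src/app
# go-offline resolves the project dependencies and the build plugins.
RUN mvn dependency:go-offline

# Source changes invalidate the cache only from this point onwards.
ADD . /usr/src/app
RUN mvn clean install -Dmaven.test.skip=true

ENTRYPOINT ["java","-jar","/usr/src/app/target/helloworld-0.0.3-SNAPSHOT.jar"]
```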

Solution 3 | Best of both worlds

Solution 2 worked pretty well; the only problem with the approach is that even a single-line change in pom.xml causes the whole dependencies layer to be rebuilt.

FROM maven:3-jdk-8

LABEL maintainer="opensource@opstree.com"

WORKDIR /usr/src/app

ADD . /usr/src/app
RUN mvn clean package -Dmaven.test.skip=true

In Solution 3 we combine Solution 1 and Solution 2: dependency downloading is moved out into an intermediate builder image, as shown above.

The final image build uses this builder image as its base image, which already has all the dependencies inside it. Even if a new dependency is added to pom.xml, the final image build downloads only that delta, as the rest of the dependencies are already present via the builder base image.

FROM opstree/fasten-build-builder

ADD . /usr/src/app

RUN mvn clean package -Dmaven.test.skip=true

ENTRYPOINT ["java","-jar","/usr/src/app/target/helloworld-0.0.3-SNAPSHOT.jar"]

The complete image build process is executed as given below. Note that you don’t have to build the builder image frequently; it can be a scheduled nightly or weekly operation, depending on how often the dependencies in your pom.xml change.

solution3-build-builder:
	docker build -t opstree/fasten-build-builder \
	 -f Dockerfile.solution3.builder .

solution3-build-package:
	docker build -t opstree/fasten-build \
	-f Dockerfile.solution3 .

solution3-build-package-with-time:
	time make solution3-build-package >/dev/null 2>&1

$ make solution3-build-package-with-time
time make solution3-build-package >/dev/null 2>&1

real	0m4.800s
user	0m0.224s
sys	0m0.200s

Conclusion

In conclusion, you can go for Solution 1, 2 or 3 on a case-by-case basis. Solution 2 is apt when your pom.xml has stabilized and rarely changes.

Also, you may have noticed that the Dockerfiles are not written as per best practices, e.g. a non-root user and multi-stage Docker builds, which I intentionally didn’t cover. I would like to cover those in the next blog.

More importantly, I would like to stress the most important part: continuous improvement. Whenever you do work, it’s fine to start with a workable solution, but then you should be in constant pursuit of taking it to the next level.

Feel free to give any feedback or recommendations; I would love to hear your thoughts, and if you have another smart solution, that would be great. Till we meet next time, happy learning.

Gif reference: https://giphy.com/gifs/the-flash-Z2pZfL0YPC9sk

 

5 thoughts on “Fasten Docker build”

  1. Hallo Devesh,

    You start with an initial solution that has a full JDK and the app source code in the Docker image. Probably Maven, too.
    Why is that?
    Would you not want to make the image leaner and safer by not shipping unnecessary stuff? I mean including only a JRE and the app binaries.

    My natural approach would also have Jenkins building the binaries and storing them in a Maven repo. The Docker build would then just download them. Is that not state of the art anymore?

    Regards,
    Christoph

    1. Hi Christoph,
      Thanks for reading the blog in detail and giving your opinion. I totally agree with your point; if you notice, at the end of the blog I’ve explicitly made the point you raised:
      “Also, you would have noticed that the Dockerfile is not written as per best practices i.e non-root user, multistage docker build that I didn’t cover intentionally. That I would like to cover in the next blog.”
      Once again, thanks for pointing this out.
      Regarding Jenkins building the binary, consider a scenario with dynamic agents, where an agent is created at runtime: there you have to download the dependencies again and again, which is why Solution 3 fits.
      As I said earlier, a problem has multiple solutions, and you have to find the right one for the context in which you have the problem.
      Also, just to add, I’m exploring Docker BuildKit, as suggested by many people, as a good alternative for this problem.

    2. Hey Devesh, nice writeup. I have one doubt: suppose we already have a dependency called X in the base image and we change that dependency to a newer version in pom.xml. I believe the final Docker image will then have both versions of X, and that may cause a runtime exception, as the same dependency is available on my classpath in different versions.

      1. Hi Vivek,

        Thanks for reading the blog.

        The behavior is the same as in a non-dockerized build process: in a non-dockerized setup as well, if you change a version in your pom.xml, the local repository will contain both versions, but the Maven build will pick only the one mentioned in pom.xml. The same is the case when you build your code in a Docker container.
