Speed up your Docker builds with dotnet-subset

Ted Spence
tedspence.com
Jan 26, 2024

Your dotnet applications can make better use of Docker build caching

My team chose to use containers to speed up our developer environment process. Now that we use containers, we can get a new developer up and running in hours instead of weeks. But there’s another challenge — how can we make our build times as fast as possible?

Fortunately Docker has a lot of tools available to improve build times. The most important is its ability to detect file changes and avoid rebuilding things that haven’t been modified.

Let’s take a look at how we can use this approach to improve the build speed of our DotNet applications, and specifically at how dotnet-subset can reduce the time spent downloading NuGet packages.

Take a slice of time out of your dotnet builds in Docker with subset (Wikimedia Commons)

Arrange your Dockerfile to make use of caching

The first step in this process is to recognize that Docker automatically detects when a step needs to be re-run. Your Dockerfile is a list of steps, and Docker caches the result of each one, reusing it as long as the instruction and the files it copies haven’t changed since the last build. Once one step’s cache is invalidated, every step after it is rebuilt.

Because of this behavior, if you have two applications whose Dockerfiles share several steps, you can speed up your builds by making the top of each file identical. Docker will then execute those shared steps once and reuse the resulting layers across both builds.
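
For example, two Dockerfiles that open with identical steps will share cached layers. In this sketch the base image and setup step are illustrative, not taken from my actual project:

    # Identical opening steps in Dockerfile.api and Dockerfile.worker:
    # Docker builds these layers once and reuses them for both images.
    FROM mcr.microsoft.com/dotnet/sdk:8.0 AS build
    WORKDIR /app
    RUN apt-get update && apt-get install -y --no-install-recommends curl
    # ...application-specific steps diverge below this point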

Once you’ve done this, you can run a full rebuild using docker compose up -d --build and see where you spend your time. For me, a significant amount of time was spent in the dotnet restore step. What can I do to improve this?

How Docker handles npm install

The build processes for NodeJS and DotNet both include a step where packages are downloaded from a central package repository: for NodeJS it’s the npm registry, and for DotNet it’s NuGet. This download is slow and generates lots of disk activity, but when you build apps locally on your computer, the package manager caches the results so subsequent builds are fast.

Docker would like to cache these downloads too, but a mutable package cache doesn’t fit neatly into Docker’s immutable layer model. Instead, Docker aims for a deterministic build process: it wants to prove whether a restore needs to happen, and trigger it if and only if it does.

In the case of a NodeJS application, it’s possible to improve build times significantly by arranging the Dockerfile steps in a precise order, like this:
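
A minimal sketch of that arrangement follows; the base image, lockfile handling, and build script are assumptions for illustration, and the line numbers in the comments are the ones the bullets below refer to.

    FROM node:20-alpine AS build    # line 1
    WORKDIR /app                    # line 2
                                    # line 3
    COPY package.json ./            # line 4: copy only the dependency manifest
    COPY package-lock.json ./       # line 5
    RUN npm ci                      # line 6: download packages from the npm registry
    RUN npm cache clean --force     # line 7: keep the layer small
                                    # line 8
    # Everything above stays cached # line 9
    COPY . .                        # line 10: now copy the rest of the source
    RUN npm run build               # line 11: build the application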

Why does this help?

  • The results of line 4 can be cached as long as the package.json file has not changed.
  • The results of lines 5, 6, and 7 are deterministic as long as the package.json file has not changed.
  • Therefore, Docker can cache the results of the Dockerfile all the way up to step 10, which means it can avoid most of the work of the build process.
  • When you do make a change to the package.json file, Docker will notice and start at line 4.

So the next question is: can we do the same thing for DotNet applications? By default, dotnet restore won’t work unless every single one of your .csproj and .sln files exists. For NodeJS you can just copy one package.json file, but for a modern DotNet application you’d need to copy in dozens of different .csproj files, each with its own path. This introduces a potential bug: if you add a new project to your solution but forget to add it to your Dockerfile, you’ll get weird errors.
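
The manual workaround looks something like this, with hypothetical project names and paths:

    # Copy the solution and every project file by hand, preserving paths.
    COPY MyApp.sln ./
    COPY src/MyApp.Web/MyApp.Web.csproj src/MyApp.Web/
    COPY src/MyApp.Core/MyApp.Core.csproj src/MyApp.Core/
    # ...one COPY per project; forget one and dotnet restore fails
    RUN dotnet restore MyApp.sln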

Is there a better way to do this?

Using dotnet-subset

I recently came across a blog post on NimbleWays introducing dotnet-subset. This tool aims to deliver exactly the same benefit we get from the NodeJS arrangement above, but for DotNet applications. Here’s how it works.

  • Your DotNet application needs to run dotnet restore followed by dotnet publish.
  • Using the “two-stage” approach, our goal is to make the dotnet restore command cacheable unless we change the .sln or .csproj files.
  • dotnet-subset can analyze your code to produce a “layer” that contains only the .csproj files needed for a solution (see the command sketch below).
  • Once you’ve run dotnet-subset, you can then run dotnet restore on that layer, and the results will be cached properly.
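
Inside a Dockerfile, the invocation looks roughly like this; the install step and flags follow the dotnet-subset documentation, while MyApp.sln and the /src and restore_subset paths are placeholders:

    # Install the tool and compute the subset of files needed for restore
    RUN dotnet tool install --global dotnet-subset
    ENV PATH="$PATH:/root/.dotnet/tools"
    RUN dotnet subset restore MyApp.sln --root-directory /src --output restore_subset/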

Using this approach, I was able to shrink my build times from 400 seconds down to about 120 — and in some cases, as little as 30 seconds!

Here’s roughly what the updated Dockerfile looks like; in this sketch the image tags, project name (MyApp), and paths are illustrative:
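
    # Stage 1 (restore-env): snapshot the source and compute the restore subset
    FROM mcr.microsoft.com/dotnet/sdk:8.0 AS restore-env
    WORKDIR /src
    COPY . .
    RUN dotnet tool install --global dotnet-subset
    ENV PATH="$PATH:/root/.dotnet/tools"
    RUN dotnet subset restore MyApp.sln --root-directory /src --output /subset/

    # Stage 2 (build-env): restore against only the subset, so the restore
    # layer stays cached until a .sln, .csproj, or NuGet config file changes
    FROM mcr.microsoft.com/dotnet/sdk:8.0 AS build-env
    WORKDIR /app
    COPY --from=restore-env /subset .
    RUN dotnet restore MyApp.sln

    # Now copy the full source; edits here do not invalidate the restore cache
    COPY . .
    RUN dotnet publish src/MyApp.Web/MyApp.Web.csproj -c Release -o /app/publish --no-restore

    # Stage 3: lean runtime image
    FROM mcr.microsoft.com/dotnet/aspnet:8.0
    WORKDIR /app
    COPY --from=build-env /app/publish .
    ENTRYPOINT ["dotnet", "MyApp.Web.dll"]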

How does it work?

Let me attempt to explain why this helps the performance of your Docker build process.

  • First, we produce a stage called restore-env containing a full snapshot of your entire source tree.
  • Running dotnet-subset on it produces an output folder that contains only the files needed for restore (the .sln and .csproj files, plus any NuGet configuration).
  • We then copy this output folder into a new stage called build-env and run dotnet restore there. Because that stage copies only the dotnet-subset output, Docker can keep the restore results cached as long as those project files haven’t changed.
  • Now that we have a fully restored environment, we can resume the rest of the Docker build with all our NuGet packages already cached.

What’s even better is that this process detects changes only in the .csproj files that each build actually depends on. If you change one project’s dependencies, only the applications that reference it will be re-restored and rebuilt.

Here’s to happy and fast containers!

Ted Spence heads engineering at ProjectManager.com and teaches at Bellevue College. If you’re interested in software engineering and business analysis, I’d love to hear from you on Mastodon or LinkedIn.
