uBuild: Fast and Safe Building of Thousands of Container Images
October 6, 2022 / Global
Introduction
At Uber, we run thousands of containerized microservices to provide our core services. Most of these use a continuous deployment strategy, where every commit is automatically built and deployed, as shown in the diagram below. Because most services live in a few large monorepos, a single commit that changes a common dependency can trigger thousands of Docker image builds. Combined with a high commit cadence, this amounts to about 100,000 container image builds every week, with high-volume days seeing more than 25,000 builds. This article focuses on how we build container images at Uber’s scale with both speed and safety in mind.
uBuild: Container Image Building as a Service
uBuild was created in 2015 to provide a common platform for all container image builds at Uber. It abstracts away the underlying complexities of the build system and provides a uniform interface that lets other engineers easily build container images that are compliant and configured for our infrastructure. Having a centralized service for building container images enables a single team to control dependencies and base images and to optimize build latencies, so other engineers do not have to worry about infrastructure and can focus on their own domains.
In early 2022, a major modernization of uBuild was completed. In this blog post, we share the end result and some experiences from this process. This will explain how Uber builds container images at scale, with a glimpse into uBuild’s internals. We will describe the build system’s architecture, touching upon topics such as how it:
- Isolates untrusted user code being run on privileged hosts
- Leverages a range of caches to optimize build times
- Manages multi-gigabyte Git repositories efficiently
As we will see, these optimizations improved build times dramatically. All in all, we will reveal how uBuild provides a safe, optimized, and painless image-building experience for Uber’s engineers.
System Architecture
To begin, we will walk through the design of uBuild. This will serve as a high-level introduction to the system, before we dive into some of the more technical details. Figure 2 illustrates the overall architecture.
Control Plane
The uBuild control plane is a microservice that has two key tasks:
- It provides an API for external systems to interact with uBuild (e.g., to trigger builds)
- It maintains a global view of build metadata, recording key information about every build (e.g., build status, associated Git metadata, etc.)
Job Executors
When a build job is created, it will be prepared and eventually scheduled on a job executor. This job executor is a generic component that is capable of running a number of different jobs, one of which is the container image build process. uBuild uses Buildkite as its job executor. A job executor node consists of two colocated components, a Buildkite agent to start the build job, and a cache manager to prepare various build caches, which will be explained later. When the executor receives a build request, it will set up the environment for a uBuild CI job and kick off the build process by spawning a build logic container to orchestrate it.
Build Logic
When a build logic container is spawned, it will generate a build plan based on the request. This plan is dynamic and differs based on the programming language, the build tool, and other configurations for the selected artifact. This plan will, among other things, determine which commands to run during the build, the container to run these commands in, and the set of caches to provide, to achieve the best possible performance.
Most builds use an in-house, YAML-based template for describing the build process. This allows users to specify a minimal file with dependencies and custom build steps, which uBuild translates into an optimized Dockerfile that is compliant with the Uber infrastructure. uBuild can also build a container image from a Bazel rule or a plain Dockerfile, but this usually results in slower build times, as it is up to individual service owners to optimize their image accordingly.
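To make the template-to-Dockerfile translation concrete, here is a minimal sketch. The actual uBuild template schema is not described in this post, so the `BuildTemplate` fields, the `render_dockerfile` function, and the Debian-style `apt-get` install step are all assumptions chosen for illustration only.

```python
import json
from dataclasses import dataclass, field


@dataclass
class BuildTemplate:
    """Hypothetical minimal build template; field names are illustrative."""
    base_image: str
    packages: list = field(default_factory=list)
    build_steps: list = field(default_factory=list)
    entrypoint: list = field(default_factory=list)


def render_dockerfile(template: BuildTemplate) -> str:
    """Translate the template into Dockerfile text, one instruction per line."""
    lines = [f"FROM {template.base_image}"]
    if template.packages:
        # Assumption: a Debian-based image. Install everything in one layer,
        # sorted so the generated instruction is stable across runs.
        pkgs = " ".join(sorted(template.packages))
        lines.append(f"RUN apt-get update && apt-get install -y {pkgs}")
    lines.append("COPY . /src")
    lines.append("WORKDIR /src")
    for step in template.build_steps:
        lines.append(f"RUN {step}")
    if template.entrypoint:
        # Exec-form ENTRYPOINT is a JSON array.
        lines.append("ENTRYPOINT " + json.dumps(template.entrypoint))
    return "\n".join(lines) + "\n"
```

Because the platform generates the Dockerfile, decisions such as layer ordering can be made once for every service rather than by each service owner.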
Eventually, uBuild will trigger the actual container build in a new isolated container. If the target is a Dockerfile, Makisu will be selected for building. Once the build concludes, the image is pushed to Kraken, our peer-to-peer-based image registry.
Reconciler
In the event of an outage, processes exist to bypass parts of the system (e.g., to skip the control plane and trigger builds directly on an executor node). This is necessary because uBuild builds all microservices at Uber, including itself. It is also possible to bootstrap the system on a developer laptop in case of catastrophic failures. A reconciliation microservice continuously monitors the content of our internal image registries, ensuring that the metadata of images built outside the normal path is collected and stored in the build metadata database, allowing us to track the origin of any code running in production.
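The reconciliation idea can be sketched as a comparison between what the registry contains and what the metadata store already knows. This is only an illustration: the record layout, the `out-of-band` origin marker, and the function name `reconcile` are assumptions, and in-memory dictionaries stand in for the real registry and database.

```python
def reconcile(registry_images, metadata_db):
    """Record metadata for images that exist in the registry but are unknown
    to the build metadata store. Returns the digests that were added.

    registry_images: list of dicts with "name", "digest", optional "labels".
    metadata_db: dict keyed by image digest (stand-in for the real database).
    """
    added = []
    for image in registry_images:
        digest = image["digest"]
        if digest in metadata_db:
            continue  # Built through the normal path; already tracked.
        metadata_db[digest] = {
            "image": image["name"],
            "origin": "out-of-band",  # Hypothetical marker for bypass builds.
            "labels": dict(image.get("labels", {})),
        }
        added.append(digest)
    return added
```

Running such a pass repeatedly is idempotent: once an image is recorded, later passes skip it.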
Job Execution via Buildkite
Similar to other systems at Uber, we adopted Buildkite for job execution. Buildkite is a flexible CI platform, allowing a CI job to be defined as a pipeline with a number of steps. Importantly, it allowed us to keep uBuild’s job executor infrastructure on dedicated hosts in our own data centers, while using a managed solution for scheduling jobs and viewing build logs. Our Buildkite pipeline itself is relatively simple, since its primary objective is to spawn the build logic container, which then does all the heavy lifting to orchestrate the build job.
At a high level, this means that a build job proceeds as shown in Figure 3. An interesting observation was that with this setup, running concurrent build jobs on the same host had little impact on overall performance with proper cache management. In fact, we found that jobs building container images tend to multiplex well, with network-, IO-, and CPU-intensive phases of concurrently running jobs mixing satisfactorily. We therefore ended up configuring each host to run up to 10 concurrent builds, reducing the overall size of our build fleet with negligible latency costs.
Isolation of Privileged Tasks
Building microservices is inherently a privileged task, which requires access to a number of secret elements, such as keys for pushing to container registries or secrets specific to the microservice being built.
Such build-time secrets must be protected. If a malicious user obtained these keys, they could build malicious versions of microservices, sign them as being created by uBuild, and push them to an internal container registry for deployment. This could be hard to detect, because the images would appear to originate from uBuild even though their actual origin is unknown. Such a situation could cause security breaches and compliance violations, as it would no longer be possible to accurately trace the code running in production back to its source.
This requirement conflicts with the fact that the build process is highly customizable and can, in general, involve executing arbitrary build scripts provided by microservice owners. If nothing were done, it would be straightforward to modify a build script to extract the secrets and publish them somewhere.
To mitigate this risk, uBuild runs a build job in a series of isolated containers, providing each container only with the privileges it requires to function. For example, the container executing build scripts has limited credentials and does not have access to any internal secrets used elsewhere in the process.
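One way to picture this least-privilege arrangement is a per-stage allowlist of secrets, with a hard check that a stage running user-provided code never receives an internal secret. The sketch below is an assumption about how such a policy could be expressed; the stage names, secret names, and the `secrets_for` function are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    """One isolated container in the build job (hypothetical model)."""
    name: str
    allowed_secrets: frozenset
    runs_user_code: bool = False


def secrets_for(stage, vault, internal_secrets):
    """Return only the secrets this stage is allowed to see.

    Raises PermissionError if a stage that executes user code is configured
    to receive any secret classified as internal.
    """
    if stage.runs_user_code:
        forbidden = stage.allowed_secrets & internal_secrets
        if forbidden:
            raise PermissionError(
                f"stage {stage.name!r} runs user code but requests "
                f"internal secrets: {sorted(forbidden)}"
            )
    return {name: vault[name] for name in stage.allowed_secrets if name in vault}
```

With this shape, a build script that dumps its environment can expose only the limited credentials handed to its own container.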
The colocated cache manager microservice serves to eliminate the overhead of using short-lived containers in the build process, by maintaining build caches and Git checkouts. We will now dive deeper into the impact we saw from introducing such a service.
Optimization of Build Latency
Low build latencies are important for two reasons. First, they are central to a good developer experience, as they let engineers iterate quickly on code changes by building and deploying a change to a staging environment or running containerized integration tests. Second, they are critical for the business when emergency changes must be deployed to production to mitigate an outage. We are therefore naturally interested in minimizing build latencies as much as possible.
The scale of container images built at Uber (more than 100,000 per week) naturally strains the container build system. Besides the sheer volume, Uber uses a monorepo architecture, where a single large repository is used for each of the main languages (Go, Java, and JavaScript). This has innate performance challenges, as cloning a large repository from remote Git servers takes time and becomes slower over time for every file and Git reference that is added. As an example, cloning the Go monorepo takes around 10 minutes at the time of writing. Without optimizations, this would consume a large fraction of image build time, impacting the overall performance of container image building, as the code repository is necessary as input to the build process. Combined with the scale of builds, this also puts significant load on the remote Git servers, further slowing down the build process.
Clearly, such a slow build process is not satisfactory. This problem is addressed by the cache manager microservice, which manages caches and provides up-to-date Git clones for the build process. We will discuss both, before looking at the impact of these optimizations.
Fast Git Checkouts
The cache manager continually maintains a pool of ready-to-use Git clones of each monorepo. This means that anytime a build process is started, one of these local clones can be handed to that job and reset to the requested Git reference within seconds, instead of waiting multiple minutes on updating or cloning from remote Git servers. Each time a clone is removed from the pool, a new one is automatically created to ensure that the pool never empties.
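The pool behavior described above can be sketched as a queue that is topped up every time a clone is handed out. The class name `ClonePool` and the injected `make_clone` factory are assumptions for illustration. For brevity the sketch replenishes synchronously, whereas a real service would presumably create the replacement in the background.

```python
import collections


class ClonePool:
    """Keeps a fixed number of ready-to-use clones (illustrative sketch)."""

    def __init__(self, make_clone, size):
        if size < 1:
            raise ValueError("pool size must be at least 1")
        self._make_clone = make_clone
        # Pre-create every clone so the first build never waits on a clone.
        self._ready = collections.deque(make_clone() for _ in range(size))

    def take(self):
        """Hand out the oldest ready clone and create a replacement."""
        clone = self._ready.popleft()
        self._ready.append(self._make_clone())
        return clone

    def __len__(self):
        return len(self._ready)
```

The caller would then reset the returned clone to the requested Git reference before starting the build.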
Creating the pools across each host in the build fleet would still strain remote Git servers, as a lot of new clones tend to be created at the same time. Rather than naively cloning from the remote Git servers, clone pools are created from a bare copy of the Git repository on disk, which the cache manager continuously keeps synchronized with the remote Git repository. This is done by fetching the latest updates each time something is pushed to the Git repository, which is announced to a Kafka topic. After fetching, the changes are propagated to every clone in the pool, so they all contain the most recent Git changes.
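As a rough illustration of this flow, the helpers below only assemble Git command lines and do not execute them. The exact commands uBuild runs are not given in this post, so this is an assumption: it uses `git clone --mirror` (which implies a bare repository and configures a refspec so that later fetches update all refs), and the function names and paths are hypothetical.

```python
def initial_mirror_cmd(remote_url, bare_path):
    """Create the on-disk bare copy once per host."""
    return ["git", "clone", "--mirror", remote_url, bare_path]


def new_pool_clone_cmd(bare_path, dest):
    """Create a pool clone from the local bare copy, not from the remote."""
    return ["git", "clone", bare_path, dest]


def on_push_cmds(bare_path, pool_clones):
    """Commands to run when a push is announced (e.g. via a Kafka topic).

    First refresh the bare copy from the remote, then propagate the update
    to every pool clone. Each clone's "origin" is the local bare copy, so
    the second step never touches the remote Git servers.
    """
    cmds = [["git", "-C", bare_path, "fetch", "--prune", "origin"]]
    for clone in pool_clones:
        cmds.append(["git", "-C", clone, "fetch", "origin"])
    return cmds
```

Each returned list can be passed to `subprocess.run(cmd, check=True)`. Only the single fetch into the bare copy reaches the remote, regardless of how many clones are in the pool.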
Another common issue is that many dangling Git references slow down Git operations. Monorepos tend to accumulate a large number of these references, which must be garbage collected periodically to keep performance satisfactory. Unfortunately, given the size of our repositories, this is a time-consuming process that takes hours to complete. Consequently, we maintain an extra copy of the bare repository, which is garbage collected in the background. Once the GC process is done, we swap the two bare repositories, so that the one that was just garbage collected becomes the active one. Figure 4 summarizes the Git setup.
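The swap of the two bare repositories can be sketched as a double-buffering scheme. How uBuild actually switches between the copies is not described here, so the sketch assumes one possible mechanism: an `active` symbolic link that is atomically repointed once the standby copy has been garbage collected. The directory names are hypothetical, and the sketch assumes a POSIX filesystem.

```python
import os


def swap_active(root, link_name="active", names=("repo-a.git", "repo-b.git")):
    """Atomically repoint the `active` symlink to the standby repository.

    Call this after the standby copy has been garbage collected (and brought
    up to date with any pushes that arrived meanwhile). Returns the name of
    the repository that is now active.
    """
    link = os.path.join(root, link_name)
    current = os.readlink(link)
    standby = names[1] if current == names[0] else names[0]
    tmp = link + ".tmp"
    if os.path.lexists(tmp):
        os.remove(tmp)  # Clean up after an interrupted earlier swap.
    # Create the new link under a temporary name, then rename it over the
    # old one. The rename is atomic, so readers never see a missing link.
    os.symlink(standby, tmp)
    os.replace(tmp, link)
    return standby
```

After the swap, the previously active copy becomes the standby and is the next one to be garbage collected.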
Local Caching
A key strategy for performance is to cache as many build artifacts as possible, to avoid redundant work. We experimented with remote caching, but saw no benefits over caching locally on each host in our build fleet. In fact, remote caching was often slower. Thus, we decided to build our caching solution around caches maintained locally on disk.
The cache manager therefore maintains a collection of diverse caches, which the build logic is able to dynamically mount into the build process to allow reuse of various build artifacts across multiple build processes. For example, the cache manager provides dedicated caches for Node modules, Go modules, Bazel actions, Gradle artifacts, and many more.
The cache manager coordinates access to the different caches. One factor to account for is that some cache types cannot be shared across concurrent build processes running on the same host, while others can. Other caches are sensitive to the specific software that they are built against. Examples include the Node module cache, which must be sharded by the Node version used during the build process, and the Bazel cache, which must be sharded by the base image distribution to guard against cache poisoning.
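Sharding can be illustrated by deriving the cache directory from the attributes a cache is sensitive to, so that builds with different attributes never share a directory. The path layout, the attribute names, and the `cache_path` function below are assumptions made for this sketch.

```python
import hashlib


def cache_path(root, cache_type, shard_attrs):
    """Return the directory for one shard of a cache.

    shard_attrs maps attribute names to values, for example the Node version
    for a Node module cache or the base image distribution for a Bazel cache.
    """
    # Sort the attributes so the same set always yields the same shard,
    # regardless of the order in which the caller listed them.
    canonical = ",".join(f"{k}={v}" for k, v in sorted(shard_attrs.items()))
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:12]
    return f"{root}/{cache_type}/{digest}"
```

Two builds that use different Node versions resolve to different directories, so artifacts compiled against one version are never mounted into a build that uses another.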
The cache manager also continuously ensures that the caches are properly garbage collected and do not grow unbounded. It uses different strategies for this based on the cache type; examples include rotation of the cache if it exceeds a certain size, or periodic pruning using a least recently used strategy.
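The least-recently-used strategy mentioned above can be sketched as selecting the oldest entries for removal until the cache fits within a size budget. The `Entry` record and the `select_for_eviction` function are hypothetical, and the sketch only decides what to evict without deleting anything from disk.

```python
from typing import NamedTuple


class Entry(NamedTuple):
    name: str
    size_bytes: int
    last_used: float  # e.g. a Unix timestamp of the most recent access


def select_for_eviction(entries, max_total_bytes):
    """Return names of entries to delete, least recently used first,
    until the remaining total size is within max_total_bytes."""
    total = sum(e.size_bytes for e in entries)
    evict = []
    for entry in sorted(entries, key=lambda e: e.last_used):
        if total <= max_total_bytes:
            break
        evict.append(entry.name)
        total -= entry.size_bytes
    return evict
```

Rotation, the other strategy mentioned, is simpler still: once a cache exceeds its size limit, it is discarded as a whole and rebuilt by subsequent builds.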
Specifying a new cache type with specific sharding, garbage collection, or concurrent access policies is made easy with the cache manager, enabling rapid experimentation with optimizations of the build processes.
Resulting Build Time Improvements
The combination of Git optimizations and local disk caches has helped reduce build latencies significantly. Before the modernization of uBuild, the P95 build latency for container images built from the Go monorepo was 29 minutes. With the cache manager and other architectural changes, this has been reduced to roughly 4 minutes today. The table below summarizes P50 and P95 build time reductions for microservices in the three main repositories at Uber.
Monorepo | P50 build time (minutes) | P95 build time (minutes)
Go | 8.5 → 2 (76% reduction) | 29.0 → 3.5 (88% reduction)
Java | 13.0 → 3 (77% reduction) | 23.0 → 6 (74% reduction)
JavaScript | 21.0 → 5 (76% reduction) | 30.0 → 12 (60% reduction)
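The percentages in the table follow from the before and after values as (before − after) / before, rounded to the nearest whole percent. The short check below recomputes them. The function name `reduction_percent` is just a label for this sketch.

```python
def reduction_percent(before, after):
    """Relative build time reduction, rounded to a whole percent."""
    return round((before - after) / before * 100)


# (before, after) build times in minutes, taken from the table above.
build_times = {
    ("Go", "P50"): (8.5, 2),
    ("Go", "P95"): (29.0, 3.5),
    ("Java", "P50"): (13.0, 3),
    ("Java", "P95"): (23.0, 6),
    ("JavaScript", "P50"): (21.0, 5),
    ("JavaScript", "P95"): (30.0, 12),
}

reductions = {key: reduction_percent(*times) for key, times in build_times.items()}
```

For example, the Go P95 entry is (29.0 − 3.5) / 29.0 ≈ 0.879, which rounds to 88%.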
Conclusion
uBuild is a cornerstone of Uber’s microservice architecture, making it possible to build large numbers of container images quickly and safely. In this post, we have covered its architecture and discussed the performance optimizations made by the team as part of a major overhaul of the system. These optimizations improved build latencies significantly, decreasing container image build time by up to 88% for microservices in our monorepos. This has been central to a good developer experience, as it allows engineers to iterate much faster when they are making code changes.
Rasmus Vestergaard
Rasmus is a Senior Software Engineer on the stateless deployment platform (Up) team. He works on microservice build and deployment systems, with recent efforts centered on continuous deployment.
Andreas Lykke
Andreas Lykke is a Senior Software Engineer on Uber’s Stateless Platform team. He has worked on the build and deployment experience at Uber where he currently focuses on continuous deployment of microservices.
Posted by Rasmus Vestergaard, Andreas Lykke