Today, the industry is saturated with discussions about containers. Many companies are looking for ways they can benefit from running an immutable infrastructure or simply boost development performance by making repeatable builds between environments simpler. However, sometimes by simplifying the user experience we end up complicating the implementation. On our journey to a usable, containerized infrastructure, we faced a number of daunting challenges, the solutions to which are the subject of this post. Welcome to the bleeding edge!
Grammarly powers intelligent text-checking tools for millions of active users, and we continuously face scaling challenges to accommodate further growth of that user base. However, there’s a saying that scaling people is harder than scaling systems. As a platform team, we focus on introducing DevOps practices and tools to facilitate scaling both our people and our services. Grammarly has about 50 engineers and dozens of services written in Java, Lisp, Erlang, Scala, and Python, running on hundreds of EC2 instances. Currently, we are shifting our infrastructure to completely remove Chef (which did not, it turns out, support a true DevOps approach in our case) and to use containers to provision our production system.
There has been a lot of discussion about why you should keep your images small and have your containers be single-purpose. Fast builds are also essential. At the very least, builds shouldn’t become slower after introducing Docker.
Lightweight images mean faster deployments. More importantly, we want to keep development tools used to build artifacts out of production. Additionally, sometimes you need to have a private key present at build time to fetch from private git repos, but you may not want to keep secrets in your final image. With Dockerfiles, there is no good way of doing that.
People have tried to solve this problem in many different ways, including using docker-compose and other tools to orchestrate multi-image builds. None of these approaches seemed transparent or convenient for us. We want the user experience for our engineers to be as simple as possible.
Here are some general approaches to tackling this problem that we explored:
- You can build the artifact on a CI server and then add it to the Docker image; in this case you can keep your resulting image clean. But then you lose the advantages of build environment portability and isolation between different versions of software. Also, it is more error-prone, because you cannot test your builds locally.
- Make two separate Dockerfiles. One will build your artifact, the other will describe the runtime environment where the artifact will be executed. The problem here is how to exchange files between docker builds. This may require you to do something clunky, like running docker build from the first Dockerfile to deliver the artifact to the final Dockerfile.
With most language-specific dependency management systems, docker builds become much slower because dependencies are cached as a docker layer and kept only until you change a dependency, at which point all dependencies are fetched from scratch. We have seen scenarios where fetching Spark dependencies took more than half an hour. This does not make our engineering more efficient! Most CI tools are optimized to keep shared directories for fetched dependencies. If your dependency manager supports storing multiple versions of the same package, you can benefit from this caching and not download the Internet on every build.
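To make the invalidation behavior concrete, here is a toy Python model of layer caching. It is only a sketch under stated assumptions, not Docker's actual implementation: each layer's cache key is assumed to depend on its parent layer, the instruction text, and the content of any input files, and `fetch-deps` and `deps.txt` are hypothetical names.

```python
import hashlib


def layer_key(parent_key, instruction, inputs=""):
    """Toy cache key: parent layer + instruction text + input file content."""
    data = f"{parent_key}|{instruction}|{inputs}".encode()
    return hashlib.sha256(data).hexdigest()


def build(steps, cache):
    """Walk the steps in order and return the instructions that had to re-run."""
    executed = []
    parent = ""
    for instruction, inputs in steps:
        key = layer_key(parent, instruction, inputs)
        if key not in cache:
            cache.add(key)
            executed.append(instruction)
        parent = key
    return executed


def steps_for(manifest):
    """Hypothetical build: copy a dependency manifest, fetch deps, copy sources."""
    return [
        ("FROM base", ""),
        ("COPY deps.txt /app/", manifest),
        ("RUN fetch-deps", ""),  # hypothetical command that downloads everything
        ("COPY src /app/src", "source-v1"),
    ]


cache = set()
first = build(steps_for("libA==1.0\nlibB==2.0"), cache)
second = build(steps_for("libA==1.0\nlibB==2.0"), cache)
third = build(steps_for("libA==1.0\nlibB==2.1"), cache)  # one version bumped

print(first)   # every step runs on a cold cache
print(second)  # nothing runs: the whole chain is cached
print(third)   # the manifest copy and everything after it runs again
```

In this model a single version bump changes the key of the manifest layer, so the fetch step that follows starts again from an empty layer and downloads every dependency, not just the one that changed.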
Someone even scaled docker builds to work under Apache Mesos to make builds faster. But keep in mind that Docker does not guarantee idempotence if you build from the same Dockerfile on multiple machines.
In other words, this problem cannot be solved in any reasonable manner with Dockerfiles.
Rocker to the Rescue
Instead of writing bash scripts or creating custom YAML specs, we came up with an idea to create a tool that will make Dockerfiles extensible. This way, we can add our own primitives and work around the aforementioned problems, while keeping backward compatibility with original Dockerfiles. We named this tool rocker.
To solve the first problem, rocker allows you to specify as many FROMs as you want in a single Rockerfile. It also has commands to copy files between FROMs. This allows you to build your artifact in the first image and then transfer some files to the second one, all in a single build operation. You can keep the final image clean and light, without worrying about inadvertently putting a secret key file into it.
Also note that different FROMs within a single Rockerfile are cached independently. Hence, a change to a second FROM will not invalidate the first one. You can change the FROM value of the second image and the first part will remain cached. Multiple FROM instructions work like separate Dockerfiles, except that rocker manages file sharing between them.
More interestingly, a layer invalidation in the first FROM will not invalidate the second one completely. It will only rebuild the layers from the point where the file is imported into the second image. This may be difficult to grok, but such behavior has proven to be intuitive and even expected by users. Additionally, it makes development iterations faster.
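The caching behavior described above can be sketched with a small Python model. This is an illustration only, not rocker's implementation: `EXPORT` and `IMPORT` stand in for the commands that copy files between FROMs, the stage contents (`build-env`, `RUN make`, `install-runtime-libs`) are hypothetical, and the build is assumed to be deterministic so that the artifact's digest depends only on the source.

```python
import hashlib


def digest(*parts):
    """Stable digest of a sequence of strings."""
    return hashlib.sha256("|".join(parts).encode()).hexdigest()


def build_stage(steps, cache):
    """Build one FROM stage; return the instructions that were not cached."""
    rebuilt, parent = [], ""
    for instruction, inputs in steps:
        key = digest(parent, instruction, inputs)
        if key not in cache:
            cache.add(key)
            rebuilt.append(instruction)
        parent = key
    return rebuilt


def build(source, runtime_base, cache):
    """Two-stage build: compile in stage 1, import the artifact in stage 2."""
    artifact = digest("artifact", source)  # assumption: deterministic compiler
    stage1 = [
        ("FROM build-env", ""),
        ("COPY src", source),
        ("RUN make", ""),
        ("EXPORT app", ""),
    ]
    stage2 = [
        (f"FROM {runtime_base}", ""),
        ("RUN install-runtime-libs", ""),
        ("IMPORT app", artifact),  # keyed on the imported file's content
        ("CMD app", ""),
    ]
    return build_stage(stage1, cache), build_stage(stage2, cache)


cache = set()
cold = build("v1", "runtime:1", cache)
source_changed = build("v2", "runtime:1", cache)
base_changed = build("v2", "runtime:2", cache)

print(source_changed)  # stage 2 rebuilds only from IMPORT onward
print(base_changed)    # stage 1 stays fully cached
```

Because each stage starts its own key chain, changing the second FROM leaves the first stage untouched, and a source change reaches the second stage only through the imported artifact.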
To tackle the second problem, rocker provides the MOUNT command, which works very similarly to docker run --volume. For every subsequent RUN, rocker will mount the specified volume (either mapping a host directory or creating a data volume container) into the execution container. This allows you to specify any directory to be reused between builds to take advantage of package manager caches, like most CI servers do.
MOUNT is a nice hack that lets you trade ideal, side-effect-free idempotence for real-world usability: you avoid re-downloading all the dependencies on nearly every build. However, MOUNT is not safe, since it may break a Dockerfile’s portability and have side effects that may be difficult to revert.
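A minimal Python sketch of that trade-off, assuming the package manager keeps one cache entry per package version. The package names are hypothetical, and a plain set stands in for the mounted cache directory.

```python
def fetch_dependencies(wanted, package_cache):
    """Download whatever is missing from the cache; return what was downloaded."""
    downloaded = []
    for package in wanted:
        if package not in package_cache:
            package_cache.add(package)  # side effect: the cache outlives the build
            downloaded.append(package)
    return downloaded


deps_v1 = ["libA-1.0", "libB-2.0", "libC-3.0"]
deps_v2 = ["libA-1.0", "libB-2.1", "libC-3.0"]  # one dependency bumped

# Without a mount, an invalidated fetch layer starts from an empty directory.
without_mount = [
    fetch_dependencies(deps_v1, set()),
    fetch_dependencies(deps_v2, set()),
]

# With a mount, the same directory is reused by every build on this machine.
shared_cache = set()
with_mount = [
    fetch_dependencies(deps_v1, shared_cache),
    fetch_dependencies(deps_v2, shared_cache),
]

print(without_mount[1])      # all three packages are downloaded again
print(with_mount[1])         # only the bumped package is downloaded
print(sorted(shared_cache))  # the stale libB-2.0 is still sitting in the cache
```

The last line is the unsafe part: the shared directory accumulates state that is not described by the build file, so the result of a build can depend on what earlier builds left behind.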
With these essential commands in place, nothing stopped us from going totally berserk and adding even more, such as TAG, PUSH, ATTACH and templating. We even added INCLUDE, though we do not use it and are not sure it is needed. In this way, rocker became a playground for experimenting with improving the convenience of Docker builds.
Building images is still an open question in the industry. Perhaps, after OCI stabilizes, we will find a less hacky approach. Our instrument is far from becoming an industry standard, but it solves our real problems. Besides that, we have several ideas on how it can be improved and become more generally useful to the community:
- Rocker doesn’t need to use the Docker server for builds. As of Docker 1.8, there is a new API that enables client-side builds. This is much more efficient, but requires re-implementing all native Dockerfile commands, such as COPY, and keeping up with Docker's updates. The positive side is that we’ll have more control over the behavior.
- The caching [idempotence] problem is not solved completely: Docker's distribution model is tightly coupled to its layering approach. Since the cache is machine-bound and layer ID is not a shasum but a random sequence, builds on a different machine will cause layer invalidation and re-creation during deployments. Thus, both Docker and rocker guarantee idempotence only as long as you build on a single machine. It would be great if we could find a way to solve this problem.
- Rocker could become the engine for making extensible Dockerfiles, letting everyone add their own custom commands. This is a danger zone because it breaks portability. And it’s already a problem for rocker itself: Rockerfiles cannot be built by hub.docker.com or others who can build native Dockerfiles. There is an alternative crazy tool that also makes Dockerfiles extensible.
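The caching point in the list above can be illustrated with a short Python sketch. It is a simplified model, not Docker's code: it contrasts randomly generated layer IDs with IDs derived from a hash of the parent ID and the instruction, and it assumes each instruction produces identical content on both machines.

```python
import hashlib
import uuid


def random_layer_id(parent_id, instruction):
    """Random ID: unrelated to what the layer contains or how it was built."""
    return uuid.uuid4().hex


def content_layer_id(parent_id, instruction):
    """Deterministic ID: derived from the parent ID and the instruction."""
    return hashlib.sha256(f"{parent_id}|{instruction}".encode()).hexdigest()


def build(instructions, make_id):
    """Return the chain of layer IDs produced by one build."""
    ids, parent = [], ""
    for instruction in instructions:
        parent = make_id(parent, instruction)
        ids.append(parent)
    return ids


instructions = ["FROM base", "RUN install", "COPY app"]

# Two machines building the same file with random IDs share no layers.
random_a = build(instructions, random_layer_id)
random_b = build(instructions, random_layer_id)

# With deterministic IDs, both machines arrive at the same chain.
hashed_a = build(instructions, content_layer_id)
hashed_b = build(instructions, content_layer_id)

print(set(random_a) & set(random_b))  # no layers in common
print(hashed_a == hashed_b)
```

With random IDs, an image built on a second machine looks entirely new to every host that pulls it, even though nothing changed. Deterministic IDs would only help if the layer contents themselves were reproducible, which this sketch takes for granted.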
If you have had a similar experience with Docker, please share your comments and solutions. In the next part we’ll describe our approach to lightweight production deployment of containerized apps. Stay tuned.
Update: read part 2 — How We Deploy Containers at Grammarly.