As discussed in the first part of this series, we were very excited when we figured out how to properly build Docker images, until we realized that we had no idea how to run them in production. You might have already guessed that we were pondering building our own tool.
Requirements for Running Containers
First, we looked at a few very interesting projects, such as Kubernetes. We read their GitHub like a great novel, full of interesting discussions and design decisions. But since at Grammarly we are changing the engine of the car while it is barrelling down the freeway, we have to find ways to upgrade our platform incrementally. Besides, we have a monitoring framework, which also requires rethinking because it's tightly coupled with Chef.
Our original intent was to enable developers to benefit from containers without epic infrastructural changes. The main goal was to transfer ownership of services’ runtime environment to developers. As a first step, we decided to have one service per instance and use EC2 tags to identify role and environment. Also, we wanted everything we deliver on servers to be delivered in containers. No underlying provisioning tricks. No Chef. This might seem an unnecessary limitation, but we believe it is a powerful abstraction. Of course, there is an operating system and things like the Docker daemon to consider, but this is another layer and we will talk about it soon.
The first thing we had to figure out: Where is the container’s run configuration? Dockerfiles are meant to be the place where such properties can be specified (e.g. `CMD`) for future materialized containers. But, in reality, there are different environments and conditions in which containers can be launched, so these properties might be overridden (e.g. `docker run CMD`). Also, extra properties might be added, such as volumes, links, hostnames, port mappings, etc. You can think of a Dockerfile as the build stage and of a “run spec” as the release stage of the 12-factor “Build, release, run” principle.
Here are the key properties of the run configuration we considered:
- A run configuration (we call it a manifest) should be specified in a file and checked in to the Git repo of a particular service. This will enable developers to own and control how their services are started, as well as to iterate on runtime changes alongside service code changes.
- As we found during prototyping, most of our applications consist of multiple interrelated containers, therefore such specifications should be supported by our manifests.
- There should be a tool that can apply the manifest against any host and run the containers. Hence, the tool can run either on a developer’s machine or on a production server.
- Manifests can change over time, and the tool should be able to granularly apply those changes; if only one container should be recreated to implement the change, the others should not be touched.
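The last requirement, granular and idempotent application of a manifest, can be illustrated with a small sketch. This is not rocker-compose's actual implementation; `ContainerSpec`, `plan`, and the image names are hypothetical and only show the idea of comparing the desired state with what is already running and touching nothing that has not changed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ContainerSpec:
    """Hypothetical, simplified run spec for a single container."""
    image: str
    cmd: tuple = ()
    env: tuple = ()       # (key, value) pairs
    volumes: tuple = ()


def plan(desired: dict, running: dict) -> dict:
    """Compare desired specs with running ones, both keyed by container name.

    Returns the names to create, recreate, remove, and keep untouched.
    """
    actions = {"create": [], "recreate": [], "remove": [], "keep": []}
    for name, spec in sorted(desired.items()):
        if name not in running:
            actions["create"].append(name)
        elif running[name] != spec:
            actions["recreate"].append(name)   # spec changed: recreate only this one
        else:
            actions["keep"].append(name)       # identical: do not touch
    for name in sorted(running):
        if name not in desired:
            actions["remove"].append(name)
    return actions


# Hypothetical example: only the config container got a new image tag.
running = {
    "nginx": ContainerSpec(image="nginx:1.9"),
    "config": ContainerSpec(image="myorg/nginx-config:1"),
}
desired = {
    "nginx": ContainerSpec(image="nginx:1.9"),
    "config": ContainerSpec(image="myorg/nginx-config:2"),
}
result = plan(desired, running)
print(result)
# {'create': [], 'recreate': ['config'], 'remove': [], 'keep': ['nginx']}
```

Running the same plan twice against an unchanged host yields only `keep` entries, which is the idempotence we were after.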
There is an official `docker-compose` tool that we considered for deployment. However, we found that it is missing a few key features, which renders it unusable for us. We need something that is designed to be a deployment tool in the first place and useful for development as a bonus (`docker-compose` is the other way around). Also, we believe that Docker’s microservices approach is a good idea, and so the deployment tool should also respect it.
In general, we liked `docker-compose`’s method of describing applications and we made something very similar but designed to be a production deployment tool. Meet rocker-compose.
Why do we want granularity and idempotence? Consider the example of an nginx deployment. First of all, you might want to decouple configuration to be able to iterate without restarting nginx. While deploying the nginx application itself is not a big deal, delivering configuration gets trickier if you want to do it with containers only. You might also use tools like docker-gen (as a separate container) to notify an nginx container to reload, but you will likely describe both containers in a single app manifest, which requires the deployment tool to support granularity.
Here is an example of a rocker-compose YAML manifest for a decoupled nginx deployment. You may also see a full example here.
Decoupling the Network
There is another important pattern — a loosely coupled network. Usually, people use links to connect containers together. But there are cases when you don’t want one container to rely on another even though they should be able to talk to each other. A good example is a StatsD agent container, which we don’t want our applications to rely on, but we do want to send metrics to it when it’s available.
Host networking mode might solve the issue, but then you have to run both the application and StatsD in this mode. This defeats the whole point of isolation, since you don’t control which ports are exposed to the host.
To solve this issue, we added a `bridgeIp` helper to rocker-compose, which provides the bridge IP address to your application container where it can find StatsD, without relying on links and while staying in bridge network mode. However, since version 1.8, Docker adds entries for all existing containers to the `/etc/hosts` file of every container. We need to research this further, and it may be that `bridgeIp` is not needed anymore.
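For illustration, here is a sketch of how an application could locate and use such an agent without links. It does not show how the `bridgeIp` helper itself works. It assumes that, in the default bridge mode, the container's default gateway is the Docker bridge address, and it reads that gateway from a routing table in the `/proc/net/route` format. `default_gateway`, `send_metric`, the sample table, and port 8125 (the conventional StatsD port) are assumptions made for this example.

```python
import socket
import struct


def default_gateway(route_table: str):
    """Return the default gateway from text in /proc/net/route format, or None."""
    for line in route_table.splitlines()[1:]:        # skip the header row
        fields = line.split()
        if len(fields) < 4:
            continue
        destination, gateway, flags = fields[1], fields[2], int(fields[3], 16)
        # Destination 00000000 is the default route; flag 0x2 means "via gateway".
        if destination == "00000000" and flags & 0x2:
            # The address is stored as little-endian hex.
            return socket.inet_ntoa(struct.pack("<L", int(gateway, 16)))
    return None


def send_metric(host: str, payload: str, port: int = 8125) -> bool:
    """Fire-and-forget UDP send: the app keeps working if the agent is absent."""
    try:
        with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
            sock.sendto(payload.encode("ascii"), (host, port))
        return True
    except OSError:
        return False


# Sample routing table as it might look inside a container (assumed values).
SAMPLE_ROUTES = (
    "Iface Destination Gateway Flags RefCnt Use Metric Mask MTU Window IRTT\n"
    "eth0 00000000 010011AC 0003 0 0 0 00000000 0 0 0\n"
    "eth0 000011AC 00000000 0001 0 0 0 0000FFFF 0 0 0\n"
)
print(default_gateway(SAMPLE_ROUTES))  # 172.17.0.1
```

Inside a container, the table would come from reading `/proc/net/route`, and the resulting address would be passed to `send_metric`. Because UDP is connectionless, the application does not depend on the agent being up.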
Sometimes we want to have a container act as a one-off command that we won’t run again on subsequent deploys unless it has been changed. This is useful when you want to initialize some stuff prior to starting other services. Of course, there are other ways of doing it, such as wrapping your application in some script (`docker_entrypoint.sh`). But there are scenarios when you want to decouple initialization from your actual applications. See what rocker-compose offers to deal with this.
We wanted to be able to run multiple manifests on a single server at the same time. This would allow us to split application bits from platform stuff, such as metrics and log-forwarding agents. The split is necessary because they are managed by different teams and may have different release cycles. Note here that containers from the different manifests may interact with each other and Docker linking will not help here, because the manifests are managed separately. The loose coupling techniques listed above are essential in this case.
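The "run once unless changed" behavior can be sketched by fingerprinting the container's spec and comparing it with the fingerprint recorded after the last successful run. This is an illustrative approach under our own assumptions, not a description of rocker-compose internals; `spec_fingerprint`, `should_run`, and the example spec are hypothetical.

```python
import hashlib
import json


def spec_fingerprint(spec: dict) -> str:
    """Stable hash of a container spec (key order does not matter)."""
    canonical = json.dumps(spec, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def should_run(name: str, spec: dict, completed: dict) -> bool:
    """True if this one-off container has never run with exactly this spec.

    `completed` maps container name -> fingerprint of the spec that last
    finished successfully (where that state is stored is left open here).
    """
    return completed.get(name) != spec_fingerprint(spec)


completed = {}
init_spec = {"image": "myorg/db-migrate:1", "cmd": ["migrate", "up"]}

first = should_run("init", init_spec, completed)       # never ran: run it
completed["init"] = spec_fingerprint(init_spec)        # record successful run
second = should_run("init", init_spec, completed)      # unchanged: skip
changed = should_run("init", {**init_spec, "image": "myorg/db-migrate:2"}, completed)
print(first, second, changed)  # True False True
```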
Our Approach to Server Provisioning
At Grammarly, we have the following layers of provisioning:
- OS layer: operating system with a pre-installed Docker daemon. We use the `linux-generic-lts-vivid` kernel to support OverlayFS, since we had problems with devicemapper. With Packer, we prepare the AMI and spin up all our servers from it. Immediately after server creation, we also copy Docker certificates and SSH keys.
- Platform layer: on every server, we run the aforementioned stack of common agents. Refer to this manifest to see what’s inside.
- Application layer: basically, these are the high-level services that we run. Most commonly, we have one application running per instance. But there are cases when we have multiple applications, for example, for small utility services or QA stacks. Here’s an example of an application manifest.
A manifest that describes "platform" agents that we run on every instance:
You may have noticed the `sensu_client` container in the platform manifest. We use Sensu to gather system metrics from instances and also as a framework for writing checks and handling them. Wiring sensu-client with the host machine from the container was challenging, and it deserves its own mini article.
We use Ansible for spinning up new instances, performing initial provisioning, and for executing rocker-compose on them. We plan to move to Terraform to handle more advanced scenarios, such as attaching EBS volumes to new instances or spinning up complex services that require coordinated bootstrap. It may be that we can adopt rocker-compose as a provisioner.
As mentioned earlier, we have quite a dumb (but working!) service discovery. We use EC2 instance tags to mark instances by “environment”, “role”, and “track”. Track is simply an additional dimension that allows us to break down groups of services for things like A/B testing. On balancers, there is a tool that polls AWS and generates nginx/haproxy configurations according to the list of available instances.
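A minimal sketch of the tag-based approach follows. It is not our actual balancer tool: the instance list would normally come from polling the AWS API, but here it is plain data so the example stays self-contained. `select_instances`, `render_upstream`, the IP addresses, tag values, and port are all hypothetical.

```python
def select_instances(instances, environment, role, track=None):
    """Pick instances whose tags match the environment, role, and optional track."""
    selected = []
    for inst in instances:
        tags = inst["tags"]
        if tags.get("environment") != environment or tags.get("role") != role:
            continue
        if track is not None and tags.get("track") != track:
            continue
        selected.append(inst)
    return selected


def render_upstream(name, instances, port):
    """Render an nginx upstream block listing the selected instances."""
    lines = [f"upstream {name} {{"]
    for inst in sorted(instances, key=lambda i: i["private_ip"]):
        lines.append(f"    server {inst['private_ip']}:{port};")
    lines.append("}")
    return "\n".join(lines)


# Stand-in for the result of polling AWS (assumed shape and values).
instances = [
    {"private_ip": "10.0.1.15", "tags": {"environment": "prod", "role": "api", "track": "main"}},
    {"private_ip": "10.0.1.12", "tags": {"environment": "prod", "role": "api", "track": "main"}},
    {"private_ip": "10.0.2.20", "tags": {"environment": "prod", "role": "api", "track": "experiment"}},
    {"private_ip": "10.0.3.7", "tags": {"environment": "qa", "role": "api", "track": "main"}},
]

config = render_upstream("api", select_instances(instances, "prod", "api", track="main"), 8080)
print(config)
```

With the sample data, only the two `prod`/`api`/`main` instances end up in the upstream block. Note that this approach knows nothing about application health, which is the weakness we want a more robust service discovery mechanism to address.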
We hope to incorporate Consul in the near future to make the service discovery more robust. For example, failed services should be removed from the balancer configuration automatically. There is also a scenario where the instance is up, but an application failed to deploy on it, and a service discovery mechanism should automatically handle such cases.
Although Google has been running containers for more than a decade, the container ecosystem in the broader industry is at a very immature stage. Yet, we can already see the advantages of running containers even in such experimental conditions. There are many questions left to resolve, such as stability, security, and service discovery. Build systems and monitoring facilities are still lacking. Also, we are watching the progress of rkt and OCI with hope.
Grammarly is finalizing its migration to fully containerized production services, and rocker and rocker-compose have proven to be incredibly useful tools for our engineers. We hope that you find our experience insightful and maybe even helpful as we all work to build a better DevOps culture.