Building Effective, Secure Container Delivery Pipelines with Docker, rkt et al.
Opinions on how best to package and deliver applications are legion and, like many other aspects of the software world, are subject to recurring trend cycles. On the server side, the current favorite is container delivery: a "full stack" approach in which your application and everything it needs to run are specified in a container definition. That definition is then "compiled" down to a container image and deployed by retrieving the image and passing it to a container runtime to create a running instance. Here, I'd like to talk about how we can apply lessons learned from shipping code in many different formats to build effective, secure container delivery pipelines.
As recent discussions about the number of insecure images in the Docker Registry show, we need to find a way to add more Ops-side input into the container delivery process than is common today. The points below assume that the development team is indeed ultimately responsible for the definition of the entire system. Different models are possible with containers, too. For example, you could have a PaaS-like model in which the developers provide app components that are automatically combined with an Ops-owned container definition. However, I would not call this a container delivery model, because here the container definition is not the deliverable but a runtime implementation detail.
Here, then, are my "food for thought" discussion points for building effective, secure container delivery pipelines.
1. Developers provide container definitions in SCM
The container definition - a Dockerfile or another, higher-level definition that "compiles down" to a container descriptor - is the source deliverable, and so should be stored as a versioned artifact in a source control repository.
2. Container definitions and dependencies are compiled to container images in an Ops-controlled environment
Concretely: the build or CI system that generates the container images from the container definitions should not be owned or administered by the development team. This helps ensure that all Ops-related checks that need to be executed against the container definitions and associated dependencies are carried out correctly. Implementing this recommendation does not require the entire CI setup to be controlled by Ops. For example, you could limit direct publish access to your image registry to an Ops-managed "image build service", which could be called from a developer-run CI server.
3. Minimal base image catalog is enforced
Just as many build environments for application code enforce a whitelist of libraries and other allowed dependencies, your container pipeline should only process container definitions that inherit from a whitelisted set of supported base images. From a maintenance perspective, this whitelist should be as small as possible. If exceptions need to be made - a particular project requires a component that can only run on a specific OS, for example - these should be limited to container definitions for that specific project.
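To make the whitelist idea concrete, here is a minimal sketch of a build-time check, assuming the container definition is a Dockerfile. The image names in `ALLOWED_BASE_IMAGES` and the `project_exceptions` parameter are illustrative assumptions, not part of any real tool, and the pattern deliberately ignores `FROM` options such as `--platform`.

```python
import re

# Hypothetical whitelist; in practice this would live in Ops-owned configuration.
ALLOWED_BASE_IMAGES = {"debian:8.2", "alpine:3.2"}

# Matches "FROM <image>" at the start of a line (simplified: no FROM options).
FROM_RE = re.compile(r"^[ \t]*FROM[ \t]+(\S+)", re.IGNORECASE | re.MULTILINE)


def base_images(dockerfile_text):
    """Return every image referenced by a FROM instruction."""
    return FROM_RE.findall(dockerfile_text)


def check_base_images(dockerfile_text, allowed=ALLOWED_BASE_IMAGES,
                      project_exceptions=()):
    """Return the base images that are neither whitelisted nor excepted."""
    permitted = set(allowed) | set(project_exceptions)
    return [image for image in base_images(dockerfile_text)
            if image not in permitted]
```

A pipeline step could refuse to build whenever the returned list is non-empty. Passing `project_exceptions` per project keeps any exception scoped to that project's definitions rather than widening the shared whitelist.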
4. Developer-provided container definitions can be pre-processed to choose a different base image
Requiring development teams to manually update their container definitions in source control whenever the base image whitelist is updated, e.g. after a security patch, is tedious and creates unnecessary delay. Instead, the image build system should be able to automatically choose a different base image, if necessary, with suitable notification back to the development team. Container definition formats that allow symlink-style image references, such as Docker via the latest tag, can support this out of the box. However, you may want to exert more fine-grained control over base image choice, such as automatically replacing a reference to an explicitly specified version such as v10.4.8 with v10.4.9 if v10.4.9 contains an important security patch.
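As a rough sketch of such a pre-processing step, the function below rewrites `FROM` lines in a Dockerfile according to a replacement map and reports what it changed, so the build system can notify the team. The image name `example/base` and the `REPLACEMENTS` map are assumptions made up for illustration.

```python
import re

# Hypothetical mapping from superseded base images to their replacements.
REPLACEMENTS = {"example/base:v10.4.8": "example/base:v10.4.9"}

# Group 1 is the "FROM " prefix, group 2 the image reference.
FROM_RE = re.compile(r"^([ \t]*FROM[ \t]+)(\S+)", re.IGNORECASE | re.MULTILINE)


def rewrite_base_images(dockerfile_text, replacements=REPLACEMENTS):
    """Return (new_text, changes); changes is a list of (old, new) pairs."""
    changes = []

    def substitute(match):
        old = match.group(2)
        new = replacements.get(old, old)
        if new != old:
            changes.append((old, new))
        return match.group(1) + new

    return FROM_RE.sub(substitute, dockerfile_text), changes
```

The rewritten text is what gets built; the definition in source control is left untouched, and the `changes` list is the raw material for the notification back to the developers.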
5a. Developer-provided container definitions can be scanned to enforce particular policies
Even though image inheritance means that developers generally don't need to mess with lower-level system settings in their container definitions, nothing prevents them from doing so. For example, the container definition could change the security configuration of the OS, install insecure versions of libraries, create open mail relays and so on. Ideally, code review that includes Ops will prevent such changes from ever making it into the container definition. You will most likely also be running security scans against your running container instances to try to catch such problems after the fact. Still, the ability to automatically check for "problematic" parts of a container definition at image build time - having a linter for container definitions, if you will - is an additional tool that should be in your toolbox.
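A linter of this kind can start out very small. The sketch below, assuming Dockerfile-style definitions, applies a list of regular-expression rules line by line. The three rules are examples I made up to show the shape of the idea; a real policy set would come from your Ops team.

```python
import re

# Hypothetical rules: (rule id, pattern, message).
RULES = [
    ("no-root-user",
     re.compile(r"^\s*USER\s+root\b", re.IGNORECASE),
     "container explicitly runs as root"),
    ("no-curl-pipe-shell",
     re.compile(r"curl[^|\n]*\|\s*(ba)?sh\b"),
     "pipes a download straight into a shell"),
    ("no-chmod-777",
     re.compile(r"chmod\s+(-R\s+)?777\b"),
     "makes files world-writable"),
]


def lint(dockerfile_text, rules=RULES):
    """Return (line number, rule id, message) for every rule violation."""
    findings = []
    for number, line in enumerate(dockerfile_text.splitlines(), start=1):
        for rule_id, pattern, message in rules:
            if pattern.search(line):
                findings.append((number, rule_id, message))
    return findings
```

Line-based pattern matching will miss instructions that span several lines with backslash continuations, so treat this as a starting point rather than a complete check.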
5b. Developer-provided containers can be "black box" tested to enforce particular policies
If the previous point can be described as "white box" testing of container definitions, then this point is about black box testing of the resulting image: ensure that your image build system is able to create instances of new container definitions in a safe, sandboxed environment and run assertions (using a tool such as Cucumber or similar) against them.
6. Images in your registry that were created from a particular base image can be invalidated
The ability to enforce a base image whitelist at build time should prevent any new container images from referencing insecure or otherwise unsupported base images. But what about all those images already in your registry that inherited from such a base image? How do you prevent new container instances being created from those? Consider implementing a system that allows you to check, just before spinning up a container instance from an image, whether that image is still "safe". This can be as simple as creating a wrapper for your 'instantiate container' command that checks for the absence of an 'unsupported' tag or other piece of metadata on the image, or as advanced as a plugin or extension that hooks directly into your container runtime.
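The simple end of that spectrum, a wrapper around the 'instantiate container' command, might look like the sketch below. It assumes Docker as the runtime and leaves open where the 'unsupported' metadata lives: `metadata_for` is a hypothetical lookup function you would supply (backed by a registry annotation, a side database or similar), and the `unsupported` and `reason` keys are likewise assumptions.

```python
import subprocess


class UnsupportedImageError(RuntimeError):
    """Raised when an image is marked as no longer safe to run."""


def guarded_run(image, metadata_for, run_args=(), runner=subprocess.run):
    """Start a container from `image` unless it is marked unsupported.

    metadata_for: hypothetical callable returning a dict of metadata
                  for an image reference.
    runner:       injected so the wrapper can be exercised without Docker.
    """
    metadata = metadata_for(image)
    if metadata.get("unsupported"):
        reason = metadata.get("reason", "no reason given")
        raise UnsupportedImageError(f"{image} is marked unsupported: {reason}")
    # Equivalent to: docker run [OPTIONS] IMAGE
    return runner(["docker", "run", *run_args, image], check=True)
```

The check happens at instantiation time rather than build time, which is what lets you invalidate images that are already sitting in the registry.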
7. Images derived from a "banned" base image can be rebuilt using an updated base image automatically
When a base image is banned, you want to be able to immediately trigger new container image builds, using an updated base, for each container definition inheriting from the now banned base image. Otherwise, you won't have any runnable container images for that application until the next code change, or until someone manually triggers the delivery pipeline for that app, since the latest version is now "unsafe". Note that I'm not trying to recommend this as a "standard" part of the image build process - the correct way to create new container definitions once a base image is invalidated is definitely to update the container definition in source control and allow the pipeline to build a new image version based on that. Rather, this capability is a stop-gap solution to help bridge the gap until the new image is available.
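One way to picture the trigger is as a planning step over whatever build history your image build system keeps. The sketch below assumes that history is available as simple records of which definition was last built on which base; `BuildRecord`, `RebuildJob` and `plan_rebuilds` are hypothetical names, not part of any existing tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BuildRecord:
    definition: str   # e.g. repository path of the container definition
    base_image: str   # base image recorded at build time


@dataclass(frozen=True)
class RebuildJob:
    definition: str
    base_image: str   # base image to use for the stop-gap rebuild
    reason: str


def plan_rebuilds(records, banned_base, replacement_base):
    """Return one rebuild job per definition built on the banned base."""
    jobs = []
    seen = set()
    for record in records:
        if record.base_image == banned_base and record.definition not in seen:
            seen.add(record.definition)
            jobs.append(RebuildJob(record.definition, replacement_base,
                                   f"{banned_base} was banned"))
    return jobs
```

Each job would then be handed to the normal image build, with the base image overridden for that one run while the definition in source control waits for its proper update.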
8. Post-deployment commands can be automatically run against running containers
While all the new image versions are building, you'll still have plenty of running container instances that use the now banned base image. In that case, it's very useful to be able to specify commands to be run automatically against all container instances that meet a certain condition (such as inheriting from a particular base image). Modifying containers at runtime in this way is generally frowned upon, and this capability should again only be considered a temporary fix until new image versions based on updated container definitions in source control have been built, and all running instances have been updated. But if you have large numbers of running container instances and many images that need rebuilding (which might take quite some time - think build storm), the ability to quickly apply an "emergency band-aid" can be essential. Even if it only takes a few minutes until all the new images have been built and all running container instances have been updated, that can still be a big problem in the face of a critical security vulnerability.
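For Docker, the selection step could be sketched as follows. It assumes you already have an inventory of running containers that records each one's base image (the runtime does not hand you that for free); the `id` and `base_image` keys and the function name are assumptions for illustration. The function only produces `docker exec` command lines, so they can be reviewed before anything is actually run.

```python
import shlex


def emergency_commands(containers, banned_base, command):
    """Build a `docker exec` argv for each container on the banned base.

    containers: iterable of dicts with hypothetical "id" and
                "base_image" keys, taken from your own inventory.
    command:    the band-aid command, as a shell-style string.
    """
    argv = shlex.split(command)
    return [["docker", "exec", container["id"], *argv]
            for container in containers
            if container["base_image"] == banned_base]
```

Each returned list can be passed to `subprocess.run` once you are happy with the selection.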
With thanks to Boyd Hemphill for his thoughtful review comments, and Reddit user squeaky19's feedback on the original pipeline diagram.