This is a short story about how I've ended up learning more about Docker and it's associated technologies than I had planned. I'm not calling it "my docker journey" because, while Docker has done a great job of making Linux containers useable, there's no need to conflate container technology with the company. I'm a late-bloomer kind of person, not an early adopter, so it's a bit surprising to find myself in this position.
I manage Drupal and CiviCRM hosting for a collection of non-profits in Canada. I started doing this 11 years ago, in spite of planning to avoid it, and after finding out that I could do a reasonable job of it, I kind of enjoy it. I'm a mathematician by training, and a (lapsed) Quaker by religion, so by nature I have a minimalist aesthetic. Add to that, my goal with hosting is to be as invisible as possible by keeping sites fast and reliable, so I really have minimal interest in experimental technologies.
I generally do a strategic review of my hosting service about every three years - that turns out to be a reasonable compromise as far as timing goes. Curiously enough it's also the typical length of time of a union/management collective agreement. So about two years ago, I was due to upgrade my existing hosting platform and started researching where to go next. As usual, I considered the option of getting out of the business entirely. What stopped me, and what always stops me, is the lack of low-cost, reliable solutions for Drupal/CiviCRM hosting, i.e. I can do it better and cheaper myself. And I admit, I really hate it when I depend on another business and they don't deliver. You know ...
One option I did consider was Pantheon, and I was impressed that they had done so well. I'm always impressed with how smart David Strauss is, and they had been using "containers" for mass hosting of Drupal (and then Wordpress as well), since 2011. I particularly noticed how proud they were of their "density" claim - i.e. how little wasted machine resources they could get away with. So I started looking at containers, and of course very quickly found out about Docker, but also other container technologies, and it seemed like an excellent fit for what I was looking for.
Specifically - one of the problems I was looking to solve was greater process isolation. "Sharing" is a big topic, even when restricting the subject to computer servers, and for the purposes of this blog post I just want to say that it's not a matter of "shared" or "dedicated". Every Internet thing is sharing something (by the definition of the Internet). So it's a matter of what you share, with who, and how you share it. My approach has been to be very fussy about who I allow into my services, and to be fairly restrictive about what they can do on the servers. That allows me to prioritise security and performance at the cost of flexibility - e.g. I don't allow outside developers or designers to have direct access. That model has served me reasonably well, but has scaling limits, and risks, so my primary goal of the next version of hosting is to provide more process isolation between sites.
Once I'd been convinced about containers, the next question was - Docker or LXC? LXC is a different container technology that's been around for longer, but Docker was getting a lot more attention. I confess that once I'd seen Docker in action, I was also quickly converted. Here are two things that really stood out:
- Dockerfile
A Dockerfile is a simple text file with a special syntax that defines a container image. In other words, you can build a Docker image (from which a container is launched), with a simple, reliable and surprisingly quick process. What this gives you is surprising agility in building and controlling the environment (i.e. the 'virtual operating system') in which your code (Drupal + CiviCRM) is running. As an example - when a new version of php comes out - testing a copy of an existing site with the new php version is much more simple, reliable, automatable and cheap that it was.
- COW
COW means "copy on write", and it's the technology that Docker uses to efficiently, incrementally build images from other, simpler images. An image is virtual operating system environment as a file. So to build a Dockerfile image for a Drupal/CiviCRM host, you can just take a standard base operating system and build the pieces on top of that that you need, using a simple set of human-readable instructions. Or better, build on a standard php image. Or better, make use of the standard Drupal container that someone has already built. If you've ever tried to work with full-on virtual machines, it's incredible how much more efficient this is - in time, disk, cpu and memory resources. Also, brain resources.
Here's a more detailed look at the overlay file system that supports that technology.
So, yeah, Docker and containers can easily become a religion. One of the tenets of the "Docker way" is that each container should only do one thing. In other words, you don't built a single container for your application, you build it out of multiple connected containers that each provide a single "micro-service". This isn't a hard-coded rule, but it's discipline that's well worth following. And like most disciplines, there are layers of understanding, and learning when to break the rule is important also.
And that's how I've ended up learning more about containers and Docker than I'd planned.
The next questions might be: how does it work? or is it how can it work?
Tell me what you'd like to know about using containers with Drupal and CiviCRM.