Orchestrating Drupal + CiviCRM containers into a working site: describing the challenge

In my previous posts, I've provided my rationale for making use of Docker and the microservices model for a boutique-sized Drupal + CiviCRM hosting service. I've also described how to build and maintain images that could be used for the web server (micro) service part of such a service.

The other essential microservice for a Drupal + CiviCRM website is a database, and fortunately, that's reasonably standard. Here's a project that minimally tweaks the canonical MariaDB container by adding some small configuration bits:

That leaves us with the problem of "orchestration", i.e. how would you launch a collection of such containers to serve a bunch of Drupal + CiviCRM sites? More interestingly, can we serve them in the real world, over time, in a way that is sustainable, i.e. one that handles code updates, OS updates, backups, monitoring, etc.? Not to mention the various crons that need to run. How about things like CiviMail, the ability to migrate in existing sites, and maybe even auto-generating new sites? Can we give our clients some kind of access to their site other than via Drupal, e.g. phpMyAdmin and/or SFTP? And what about developer access and workflow?

All of those things are possible, and there are generally well-established ways of doing them on a regular server setup. With Docker, we now have the challenging task of moving all these processes into a microservices architecture. At the same time, we've got new opportunities that Docker and the container approach present. Fortunately, many of these things have already been done for us, in some form or other, so the challenge is often to choose the existing solution or class of solution that is most appropriate to our needs.

What is Orchestration?

The word "orchestration" is excellent because it conjures up the image of a diversity of instruments all working together to create a coherent whole. It can also be a bit misleading, in that it suggests you're looking for a single technology that will accomplish this. Orchestration is often used in this way, as if it's a well-defined problem for which you have to pick, say, Docker Swarm or Kubernetes.

I prefer to use the term loosely, as an ill-defined vision, in the metaphorical sense I started with.

For my current solution, I'm using Docker Swarm and not Kubernetes, but that might change in the future. Since that choice is contentious, and in case you care, here are my main reasons:
1. Docker Swarm is built into Docker
2. It does what I need it to do
3. It uses the same file definition structure (the compose file) that is used by docker-compose.

By avoiding Kubernetes, the environment in which a production site runs is reasonably reproducible on a local development machine using docker-compose, which is a nice bonus.

Here's a more technical description of what real-world orchestration requires:

1. Managing persistent data.

Recall that containers are ephemeral. That means you should always run the thought experiment that they could vanish at any point, so you need a system that can save the state of a site and use that state to restore the site, possibly with an altered codebase.

For Docker, that means you need to use volumes, and you need to decide where the filesystems behind those volumes live and how they are mounted into the containers. There's definitely more than one way to do this. For example, is the Drupal codebase part of your image, or part of your persistent filesystem?
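As a minimal sketch of one possible answer, the compose fragment below keeps the database and the uploaded site files in named volumes and leaves the code inside the image. The image name `example/drupal-civicrm`, the password value, and the files path are hypothetical placeholders chosen for illustration. Only the `mariadb` image and its `/var/lib/mysql` data directory come from the official image.

```yaml
version: "3.7"

services:
  db:
    image: mariadb:10.4
    environment:
      MYSQL_ROOT_PASSWORD: change-me        # placeholder, not a real secret
    volumes:
      # Database state survives the container being destroyed and recreated.
      - db_data:/var/lib/mysql

  web:
    image: example/drupal-civicrm:latest    # hypothetical image name
    volumes:
      # Assumed path: only uploaded files persist, the codebase stays in the image.
      - site_files:/var/www/html/sites/default/files

volumes:
  db_data:
  site_files:
```

With this layout, replacing the web container with one built from a newer image changes the codebase but leaves the site's state in the two volumes untouched. Mounting the whole codebase from a volume instead would be the other end of the trade-off.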

2. Launching and maintaining the containers into the collection of micro services that make up the application.

Which images do you use for each microservice? How are they networked together? Where/how do the persistent volumes get attached?

Docker provides the docker-compose file format as a nice way to describe one answer to these questions. Of course, each site is different, e.g. the name of the database, the Civi and PHP versions, contributed modules, etc. Some of those differences are relevant to Docker and some aren't. For example, the PHP version has to be baked into the image. The Drupal/Civi versions can be baked in, but don't have to be, and how you maintain those versions will depend on whether you did.
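To illustrate how one compose file might cover sites that differ only in a few values, here is a hedged sketch using compose's variable substitution. The variable names `SITE_NAME`, `PHP_VERSION`, and `DB_ROOT_PASSWORD`, the `example/drupal-civicrm` image with its tag scheme, and the `DB_HOST`/`DB_NAME` settings read by the web container are all assumptions made for this example.

```yaml
version: "3.7"

services:
  web:
    # Hypothetical image: the PHP version is baked in, so it is selected by tag.
    image: example/drupal-civicrm:php${PHP_VERSION:-7.4}
    environment:
      DB_HOST: db                        # the service name resolves on the shared network
      DB_NAME: ${SITE_NAME}_drupal       # assumed setting names, not a real image API
    networks:
      - backend

  db:
    image: mariadb:10.4
    environment:
      MYSQL_DATABASE: ${SITE_NAME}_drupal
      MYSQL_ROOT_PASSWORD: ${DB_ROOT_PASSWORD}
    networks:
      - backend

networks:
  backend:
```

Running `docker-compose config` with those variables set in the shell prints the fully resolved file, which is a quick way to check what a given site would actually get before launching it.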

3. Routing

On a really simple setup, you can tell Docker to map the host server's port 80 to a specific site container's port 80. Of course, that doesn't work if you want more than one site per server. So how do you get the right container answering a particular url request, based on the domain of the request?

Docker Swarm can provide part of an answer here: you can map each "website" to a specific port, and then you don't have to keep track of each individual container's IP. In fact, that's the key step from docker-compose to Swarm. You've abstracted the "service" concept away from the individual container that provides the service into a thingy that maps a designated port to an available container on an arbitrary collection of host servers, allowing you to scale your service horizontally.
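The port-per-site idea can be sketched as a stack file like the one below. The image name and the port number 8081 are hypothetical, and this shows only the Swarm side of the problem, not the domain-based routing in front of it.

```yaml
version: "3.7"

services:
  web:
    image: example/drupal-civicrm:latest   # hypothetical image name
    ports:
      # Assumed convention: each site gets its own published port.
      - "8081:80"
    deploy:
      replicas: 2                 # Swarm spreads these across available nodes
      restart_policy:
        condition: on-failure
```

Deployed with something like `docker stack deploy -c site-a.yml site-a`, the service answers on port 8081 of every node in the swarm, whichever nodes the containers actually run on. A router in front then only needs to map a domain to a port.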

One standard tool for doing the routing is a container called "Traefik", but in a subsequent post, I'm going to provide a method using one of my favourite tools called "Varnish".

So, this is how I describe the problem, and I'll use the Simuliidae project to describe the answers that are working for me.
