In my previous posts, I've provided my rationale for using Docker and the microservices model for a boutique-sized Drupal + CiviCRM hosting service. I've also described how to build and maintain images for the web server (micro)service part of such a setup.
The other essential microservice for a Drupal + CiviCRM website is a database, and fortunately, that's reasonably standard. Here's a project that minimally tweaks the canonical MariaDB image by adding some small configuration bits: https://github.com/BlackflySolutions/mariadb
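The tweak is tiny: a Dockerfile that layers a little configuration onto the stock image. As a rough sketch of the idea (the file name and version pin are illustrative, not necessarily what that repository actually uses):

    FROM mariadb:10.5

    # Anything dropped into /etc/mysql/conf.d is read at server startup,
    # so site-appropriate defaults can ride on top of the stock image.
    COPY my-civicrm.cnf /etc/mysql/conf.d/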
That leaves us with the problem of "orchestration": how do you launch a collection of such containers to serve a bunch of Drupal + CiviCRM sites? More interestingly, can we serve them in the real world, over time, in a way that is sustainable, i.e. handle code updates, OS updates, backups, and monitoring? Not to mention the various crons that need to run, and things like CiviMail; and while we're at it, the ability to migrate in existing sites, and maybe even auto-generate new ones? Can we give our clients some kind of access to their site other than via Drupal, e.g. phpMyAdmin and/or sftp? And what about developer access and workflow?
All of those things are possible, and there are generally well-established ways of doing them on a regular server setup. With Docker, we now have the challenging task of moving all these processes into a microservices architecture. At the same time, we've got new opportunities that Docker and the container approach present. Fortunately, many of these things have already been done for us, in some form or other, so the challenge is often to choose the existing solution, or class of solutions, that is most appropriate to our needs.
What is Orchestration?
The word "orchestration" is excellent because it conjures up the image of a diversity of instruments all working together to create a coherent whole. It can also a bit misleading in that it suggests that you're looking for a single technology that will accomplish this, and orchestration is often used in this way, as if it's a well-defined problem for which you have to pick, say Docker Swarm or Kubernetes.I prefer to use the term loosely as an ill-defined vision in the metaphorical sense I used it first.
For my current solution, I'm using Docker Swarm and not Kubernetes, but that might change in the future. Since that choice is contentious, and in case you care, here are my main reasons:
1. Docker Swarm is built into Docker.
2. It does what I need it to do.
3. It uses the same definition format (the compose file) that docker-compose uses.
Because I've avoided Kubernetes, the environment in which a production site runs is reasonably reproducible on a local development machine using docker-compose, which is a nice bonus.
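Concretely, one compose file can drive both contexts; only the command changes (the file and stack names here are placeholders):

    # Local development: run the services on this machine.
    docker-compose -f site1.yml up -d

    # Production: deploy the same definition as a Swarm stack.
    docker stack deploy -c site1.yml site1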
Here's a more technical description of what real-world orchestration requires:
1. Managing persistent data.
Recall that containers are ephemeral. That means you should always run the thought experiment that they could vanish at any point, so you need a system that can save the state of a site and use that state to restore it, possibly under an altered codebase.
For Docker, that means you need to use volumes, and you need to decide where the filesystem behind them lives and how they get mounted into the codebase. There's definitely more than one way to do this: for example, is the Drupal codebase part of your image, or part of your persistent filesystem?
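As a sketch of one possible arrangement (the service and volume names are invented): keep the codebase in the image, and mount named volumes only for the state that a rebuild must not destroy, i.e. the database files and Drupal's uploaded-files directory.

    version: "3.7"

    services:
      web:
        image: myorg/drupal-civicrm    # hypothetical web image
        volumes:
          # Only user uploads persist; the codebase lives in the image.
          - site1_files:/var/www/html/sites/default/files
      db:
        image: myorg/mariadb           # the tweaked MariaDB image
        volumes:
          # Database state survives container replacement.
          - site1_db:/var/lib/mysql

    volumes:
      site1_files:
      site1_db: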
2. Launching and maintaining the collection of containers that make up the application's microservices.
Which images do you use for each microservice? How are they networked together? Where/how do the persistent volumes get attached?
Docker provides the docker-compose file format as a nice way to describe one answer to these questions. Of course, each site is different, e.g. the name of the database, the Civi/PHP versions, the contributed modules, etc. Some of those differences are relevant to Docker and some aren't. For example, the PHP version has to be baked into the image. The Drupal/Civi versions can be baked in, but don't have to be, and how you maintain those versions will depend on whether you did.
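One way to draw that line, sketched here with invented names: whatever must be baked in becomes a build argument, and whatever can vary at runtime becomes an environment variable.

    version: "3.7"

    services:
      web:
        build:
          context: .
          args:
            # Baked in: the Dockerfile begins with
            #   ARG PHP_VERSION
            #   FROM php:${PHP_VERSION}-apache
            PHP_VERSION: "7.4"
        environment:
          DB_NAME: site1    # not baked in: varies per site at run time
          DB_HOST: db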
3. Routing
On a really simple setup, you can tell Docker to map the host server's port 80 to a specific site container's port 80. Of course, that doesn't work if you want more than one site per server. So how do you get the right container answering a particular URL request, based on the domain of the request?
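For reference, that simple single-site mapping is just a one-liner (image name invented):

    # One site per server: the host's port 80 goes straight to this container.
    docker run -d -p 80:80 myorg/drupal-civicrm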
Docker Swarm can provide part of an answer here: you can map each "website" to a specific port, and then you don't have to keep track of where each individual container's IP is. In fact, that's the key step from docker-compose to Swarm: the "service" concept gets abstracted away from the individual container that provides the service, into a thingy that maps a designated port to an available container on an arbitrary collection of host servers, allowing you to scale your service horizontally.
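In compose-file terms, that mapping is just a published port plus a deploy section; Swarm's routing mesh then answers on that port on every node, wherever the replicas actually run (the names and port numbers here are arbitrary):

    version: "3.7"

    services:
      site1:
        image: myorg/drupal-civicrm
        ports:
          - "8001:80"    # port 8001 on any swarm node reaches a site1 container
        deploy:
          replicas: 2    # scale the service horizontally across nodes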
One standard tool for doing the routing is a containerized reverse proxy called "Traefik", but in a subsequent post, I'm going to provide a method using one of my favourite tools, "Varnish".
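To give a flavour of the Traefik approach before I get to the Varnish one: Traefik (v2, running with its Docker provider enabled) builds its routing table from labels attached to each service, so host-based routing looks roughly like this (the domain and names are placeholders):

    version: "3.7"

    services:
      site1:
        image: myorg/drupal-civicrm
        deploy:
          labels:
            # Traefik reads these and routes requests by their Host header.
            - "traefik.http.routers.site1.rule=Host(`site1.example.org`)"
            - "traefik.http.services.site1.loadbalancer.server.port=80"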
So, that's how I frame the problem, and I'll use the Simuliidae project to describe the answers that are working for me.