
Upgrading to Drupal 8 with Varnish: Time to Upgrade Your Mental Model as Well

I've been using Varnish with my Drupal sites for quite a long while (as a replacement for the Drupal anonymous page cache). I've just started using Drupal 8 and naturally want to use Varnish for those sites as well. If you've used Varnish with Drupal in the past, you've presumably already wrapped your head around the complexities of front-end anonymous page caching, and you know that the varnish module was responsible for translating and passing Drupal's page cache-clear requests to Varnish - explicitly from the performance page, but also as a side effect of editing nodes, etc.

But if you've been paying attention to Drupal 8, you'll know that it's much smarter about cache clearing. Rather than relying on explicit calls to clear specific or all cached pages, it uses 'cache tags' which require another layer of abstraction in your brain to understand.

Specifically, the previous mechanism in Drupal 7 and earlier was 'conservative' by design, in the sense that many changes to the site caused way more cache clearing than was necessary, just because it had to be sure. So for example, if you edited a node, it would clear the whole site's page cache, because it had no idea which URLs that piece of content might show up on (e.g. you might have a 'New content' block on every page that changes with any node edit).

Cache tags solve this in a 'declarative' way: dependencies are explicitly declared as each page is built by Drupal. So if a page is built and it uses node 5 somewhere along the way, then there's a cache tag attached to the page that says 'node-5'.

With this in place, you can see how Drupal can be much more clever about clearing caches - e.g. if you edit node 5, you only need to clear the pages that have the node-5 cache tag on them.
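To make the idea concrete, here's a minimal Python sketch of tag-based invalidation. It's a toy model, not Drupal's actual implementation: the `TaggedPageCache` class and its method names are made up for illustration, and the tag strings just follow the 'node-5' example above (the real tag format in Drupal may differ).

```python
from collections import defaultdict


class TaggedPageCache:
    """Toy page cache that records which tags each cached page depends on.

    Hypothetical illustration only; not Drupal's or Varnish's real API.
    """

    def __init__(self):
        self._pages = {}                      # url -> rendered body
        self._urls_by_tag = defaultdict(set)  # tag -> urls that declared it

    def store(self, url, body, tags):
        """Cache a page and remember every tag it was built from."""
        self._pages[url] = body
        for tag in tags:
            self._urls_by_tag[tag].add(url)

    def get(self, url):
        return self._pages.get(url)

    def invalidate_tag(self, tag):
        """Drop only the pages that declared a dependency on `tag`."""
        dropped = self._urls_by_tag.pop(tag, set())
        for url in dropped:
            # The url may linger under other tags; popping with a default
            # keeps a later invalidation of those tags harmless.
            self._pages.pop(url, None)
        return dropped


cache = TaggedPageCache()
cache.store("/node/5", "<h1>Five</h1>", tags={"node-5"})
cache.store("/front", "<ul>...</ul>", tags={"node-5", "node-7"})
cache.store("/about", "<p>About</p>", tags={"node-9"})

dropped = cache.invalidate_tag("node-5")
print(sorted(dropped))  # ['/front', '/node/5']
```

Editing node 5 drops the two pages that used it and leaves `/about` alone, where the old approach would have thrown away all three.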

Nice idea, and of course, since cache-invalidation is one of those hard things in computing, it's not as easy as it sounds.

But if you're still game, here are a few useful bits I gleaned along the way:

1. The old Drupal varnish module communicated with Varnish using the Varnish admin port. The New and Better way to talk to Varnish is via port 80, using the special HTTP methods PURGE and BAN. This means your VCL needs a bit of work (to restrict access, and to do the right thing when receiving these requests, recipe below). You really need to understand the difference between PURGE and BAN - it's the BAN that you're going to use, and the names change in the most recent Varnish versions.

2. There is no varnish module for D8 (yet?). There is a 'purge' module and a 'varnish purge' module that together can clear the Varnish page cache correctly. And they use the varnish BAN method, not the PURGE method. Got it?

3. The basic idea is that those cache tags get attached as HTTP headers to the page, which is then passed to Varnish to deliver. Along the way, Varnish caches the page and then strips off the header before sending it off. So now all those cache tags are known by Varnish. The cache tags don't get attached as HTTP headers by default; that needs to be explicitly configured.

4. Now the cache clear request from Drupal gets converted into a BAN request with the appropriate headers, using the purge + varnish_purge modules as per the official recipe (or a more useful, detailed recipe). But both are confused about the name of the cache tag HTTP header, which is currently set to just Cache-Tags (not Purge-Cache-Tags, not X-Cache-Tags).

5. None of that did anything for me, and I thought I was still confused (well, yes, I was) until I discovered that the purge module queues up the requests but doesn't actually do anything with them by default. You can look at the Purge queue in the database, which is where I saw all my purge requests (yeah, the ones that should get translated to Varnish BAN requests). You can configure how those get processed; the simplest option is cron, but that's not going to be super great for many sites. I haven't figured out what the right mechanism might be, but I guess a custom cron every 5 minutes that only does this one task would be okay.
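Here's a rough Python sketch of how the last few points fit together: invalidated tags sit in a queue until something drains it, and each one becomes a BAN request carrying a Cache-Tags header. This is not the purge module's code. The function names are invented, and the header value format is an assumption (a regex that the VCL matches against the space-separated tags stored on each cached object); the real format depends on your VCL.

```python
import http.client
import re
from collections import deque

TAG_HEADER = "Cache-Tags"  # header name as reported in this post


def build_ban(tag):
    """Return (method, path, headers) for a hypothetical tag BAN request."""
    # Assumption: the VCL treats the header value as a regex and bans
    # objects whose stored, space-separated tag list matches it.
    pattern = r"(^|\s)" + re.escape(tag) + r"(\s|$)"
    return "BAN", "/", {TAG_HEADER: pattern}


def send_ban(host, port, tag):
    """Send one BAN over plain HTTP and return the response status."""
    method, path, headers = build_ban(tag)
    conn = http.client.HTTPConnection(host, port, timeout=5)
    try:
        conn.request(method, path, headers=headers)
        return conn.getresponse().status
    finally:
        conn.close()


def drain(queue, send, limit=100):
    """Process up to `limit` queued tags; re-queue any that fail."""
    sent = []
    for _ in range(min(limit, len(queue))):
        tag = queue.popleft()
        if send(tag) == 200:
            sent.append(tag)
        else:
            queue.append(tag)  # try again on the next run
    return sent


# Nothing reaches Varnish until something calls drain(), e.g. a cron job.
queue = deque(["node-5", "node-7"])
sent = drain(queue, send=lambda tag: 200)  # stand-in for send_ban
print(sent)  # ['node-5', 'node-7']
```

In real use you'd pass something like `lambda tag: send_ban("127.0.0.1", 80, tag)` as `send` (host and port are placeholders) and run `drain` from whatever trigger you settle on.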

So, a lot of new things to learn.
