I've been using Varnish with my Drupal sites for quite a long while (as a replacement for the Drupal anonymous page cache). I've just started using Drupal 8 and naturally want to use Varnish for those sites as well. If you've used Varnish with Drupal in the past, you've presumably already wrapped your head around the complexities of front-end anonymous page caching, and you know that the varnish module was responsible for passing Drupal's page cache-clear requests on to Varnish - explicitly from the performance page, but also as a side effect of editing nodes, etc.
But if you've been paying attention to Drupal 8, you'll know that it's much smarter about cache clearing. Rather than relying on explicit calls to clear specific or all cached pages, it uses 'cache tags', which require another layer of abstraction in your brain to understand.
Specifically, the previous mechanism in Drupal 7 and earlier was 'conservative' by design, in the sense that many changes to the site caused way more cache clearing than was necessary, just because it had to be sure. So for example, if you edited a node, it would clear the whole site's page cache, because it had no idea which URLs that piece of content might show up on (e.g. you might have a 'New content' block on every page that changes with any node edit).
Cache tags solve this in a 'declarative' way: Drupal explicitly declares dependencies as it builds each page. So if a page uses node 5 somewhere in building it, a cache tag identifying that node gets attached to the page (Drupal 8 writes it as 'node:5').
With this in place, you can see how Drupal can be much more clever about clearing caches - e.g. if you edit node 5, it only needs to clear the pages that carry the node:5 cache tag.
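To make the idea concrete, here's a toy sketch in Python of a page cache that indexes pages by tag. It isn't Drupal's or Varnish's actual implementation; the class name, method names and URLs are all made up for illustration, and the tag strings follow Drupal 8's `node:5` form.

```python
from collections import defaultdict


class TaggedPageCache:
    """Toy page cache that remembers which tags each cached URL depends on."""

    def __init__(self):
        self.pages = {}                      # url -> rendered body
        self.urls_by_tag = defaultdict(set)  # tag -> urls that declared it

    def store(self, url, body, tags):
        """Cache a page and record every tag it was built from."""
        self.pages[url] = body
        for tag in tags:
            self.urls_by_tag[tag].add(url)

    def invalidate_tag(self, tag):
        """Drop only the pages that declared a dependency on `tag`."""
        dropped = self.urls_by_tag.pop(tag, set())
        for url in dropped:
            # A URL may already be gone via another tag, hence the default.
            self.pages.pop(url, None)
        return dropped


cache = TaggedPageCache()
cache.store("/node/5", "<h1>Five</h1>", ["node:5"])
cache.store("/front", "<ul>...</ul>", ["node:5", "node:7"])
cache.store("/about", "<p>About</p>", ["node:9"])

# Editing node 5 drops /node/5 and /front but leaves /about cached.
dropped = cache.invalidate_tag("node:5")
remaining = sorted(cache.pages)
```

Compare that with the Drupal 7 approach, which would amount to emptying `pages` entirely on every edit.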
Nice idea, and of course, since cache-invalidation is one of those hard things in computing, it's not as easy as it sounds.
But if you're still game, here are a few useful bits I gleaned along the way:
1. The old Drupal varnish module communicated with Varnish using the Varnish admin port. The New and Better way to talk to Varnish is via port 80, using the special HTTP methods PURGE and BAN. This means your VCL needs a bit of work (to restrict access, and to do the right thing when receiving these requests, recipe below). You really need to understand the difference between PURGE and BAN - it's the BAN that you're going to use, and the names change in the most recent Varnish versions.
2. There is no varnish module for D8 (yet?). There is a 'purge' module and a 'varnish purge' module that together can clear the Varnish page cache correctly. And they use the Varnish BAN method, not the PURGE method. Got it?
3. The basic idea is that those cache tags get attached as HTTP headers to the page, which is then passed to Varnish to deliver. Along the way, Varnish caches the page and then strips off the header before sending it off. So now all those cache tags are known to Varnish. Note that the cache tags don't get attached as HTTP headers by default; that needs to be explicitly configured.
4. Now the cache clear request from Drupal gets converted into a BAN request with the appropriate headers, using the purge + varnish_purge modules as per the official recipe (or here's a more useful, detailed recipe). But both are confused about the name of the cache tag HTTP header, which is currently set to just Cache-Tags (not Purge-Cache-Tags, not X-Cache-Tags).
6. None of that did anything for me and I thought I was still confused (well, yes, I was) until I discovered that the purge module queues up the requests but doesn't actually do anything with them by default. You can look at the Purge queue in the database, which is where I saw all my purge requests sitting (yeah, the ones that should get translated to Varnish BAN requests). You can configure how those get processed; the simple option is cron, but that's not going to be super great for many sites. I haven't figured out what the right mechanism might be, but I guess a custom cron every 5 minutes that only does this one task would be okay.
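For what it's worth, here's roughly what points 4 and 6 boil down to on the wire, as a hedged Python sketch rather than what the purge modules actually do internally. The assumptions: your VCL accepts a BAN request on port 80 and treats the `Cache-Tags` request header as a regex matched against the space-separated tag header stored with each cached object; the tags contain no regex metacharacters; and the function names (`build_ban_request`, `send_ban`, `drain_queue`) are mine, not from any module.

```python
import http.client
import re

TAG_HEADER = "Cache-Tags"  # header name as reported above; check your own setup


def build_ban_request(tags):
    """Return (method, path, headers) for a BAN covering any of `tags`."""
    # Assumption: the VCL uses this value as a regex against the stored,
    # space-separated tag header. The anchors keep node:5 from hitting node:50.
    pattern = "|".join(tags)
    return "BAN", "/", {TAG_HEADER: rf"(^|\s)({pattern})(\s|$)"}


def send_ban(tags, host="127.0.0.1", port=80, timeout=5):
    """Send the BAN to Varnish and return the HTTP status code."""
    method, path, headers = build_ban_request(tags)
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        conn.request(method, path, headers=headers)
        return conn.getresponse().status
    finally:
        conn.close()


def drain_queue(queued_tags, send):
    """Send one BAN for everything queued, then empty the queue."""
    if not queued_tags:
        return None
    # dict.fromkeys de-duplicates while keeping first-seen order.
    status = send(list(dict.fromkeys(queued_tags)))
    queued_tags.clear()
    return status


# Check the pattern locally against sample stored headers (no network needed).
method, path, headers = build_ban_request(["node:5", "node:7"])
hits = bool(re.search(headers[TAG_HEADER], "node:1 node:5 node_list"))
misses = bool(re.search(headers[TAG_HEADER], "node:50 node_list"))
```

A cron-style job would then call something like `drain_queue(queue, send_ban)` every few minutes. Until something drains the queue, nothing reaches Varnish, which is exactly the behaviour that had me confused.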
So, a lot of new things to learn.