Varnish and Drupal

Varnish is a front-end web proxy that I've been using with my Drupal sites for 18 months now. I talked about it in a presentation at Drupal camp with Khalid.

A front-end web proxy means that instead of visitors directly accessing the Drupal site, they go through a caching layer. I like to think of Varnish as a protective bubble around Drupal.

Using Varnish has a number of benefits. In my presentation I was particularly interested in demonstrating how it enables a site to survive sudden, huge spikes of traffic, beyond what a web server would normally be able to handle. So that's one of the benefits.

Another important benefit is that it reduces the load on the server, which improves response times. Or more cynically, it enables you to pile more sites onto your server. But to explore this more closely: by moving most of your 'easy' page loads out of the PHP/Apache stack, you can tune your stack better to what it's got to do - and you're not wasting 128 MB or more of RAM when serving an image, which is what happens by default on a server with Apache and mod_php.

But even if you've got a pretty ordinary site and no server load issues, here's a reason why Varnish benefits you: your visitors' experience of speed. The key is that the traffic most sensitive to the load speed of your site is anonymous traffic to your front page. That's because new visitors (who usually land first on your front page) are the ones you have to engage immediately or they go away (technically known as a "bounce"). A typical site might have a bounce rate of 50%, so any slowness in loading that front page for anonymous visitors is going to make a difference to whether a visitor continues to explore the site, or just leaves out of the tedium of waiting (who wants to wade through a slow-loading site?). And here's why Varnish is the perfect solution: it does its best caching on your most visited pages, for anonymous visitors!
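To make the "anonymous visitors get the cached copy" idea a bit more concrete, here is a minimal Python sketch of the kind of decision a caching layer makes for each request. It is not Varnish configuration and not how Varnish is implemented; the function name, the list of static file extensions, and the rule that a cookie whose name starts with SESS or SSESS marks a logged-in Drupal user are all assumptions made for illustration.

```python
import re

# Assumption: static assets are recognised purely by file extension.
STATIC_EXT = re.compile(r"\.(png|gif|jpe?g|ico|css|js)$", re.IGNORECASE)

# Assumption: a logged-in Drupal visitor carries a session cookie whose
# name starts with SESS (or SSESS over HTTPS).
SESSION_COOKIE = re.compile(r"(^|;\s*)S?SESS[0-9A-Za-z_-]+=")


def can_serve_from_cache(path: str, cookie_header: str = "") -> bool:
    """Return True if this request could be answered by the caching layer."""
    path_only = path.split("?", 1)[0]  # ignore any query string
    if STATIC_EXT.search(path_only):
        # Images, CSS and JS are the same for everybody, logged in or not.
        return True
    # Pages are only safe to share between visitors with no session cookie.
    return SESSION_COOKIE.search(cookie_header) is None


if __name__ == "__main__":
    print(can_serve_from_cache("/"))                         # anonymous front page
    print(can_serve_from_cache("/", "SESSabc123=xyz"))       # logged-in front page
    print(can_serve_from_cache("/misc/logo.png", "SESSabc123=xyz"))
```

The point of the sketch is only that the front page for a first-time visitor falls squarely in the cacheable case, so it never has to touch PHP at all.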

And finally, some numbers. I was curious just how much traffic Varnish might divert from a 'regular' Drupal site, and haven't found anybody else's numbers. Here are my last 2 months on one of my servers. The short answer is: about 30% of the visits go entirely to Varnish, more than half the hits (mainly because Varnish caches images/css/js files for up to 2 weeks, even for logged-in users), and about half the total bandwidth.

Anybody got any comparable statistics?


Month      Unique visitors   Number of visits   Pages     Hits        Bandwidth
Feb 2012   47,144            85,305             462,854   766,031     11.79 GB
Mar 2012   39,963            69,417             485,030   838,416     12.76 GB


Month      Unique visitors   Number of visits   Pages     Hits        Bandwidth
Feb 2012   60,082            118,672            449,906   2,264,307   24.89 GB
Mar 2012   51,713            97,369             452,394   2,110,890   24.44 GB
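
For anyone who wants to check the "short answer" against the two tables, here is a small Python sketch of the arithmetic. It rests on two assumptions that the tables themselves don't state: that the first table is the traffic that actually reached the web server and the second is everything Varnish saw, and that the run-together digits split into columns the way they are typed in below. The variable and function names are made up for the example.

```python
# (visits, hits, bandwidth in GB) per month, copied from the two tables.
# Assumption: "backend" is the first table, "varnish_total" the second.
backend = {
    "Feb 2012": (85305, 766031, 11.79),
    "Mar 2012": (69417, 838416, 12.76),
}
varnish_total = {
    "Feb 2012": (118672, 2264307, 24.89),
    "Mar 2012": (97369, 2110890, 24.44),
}


def share_absorbed(backend_value: float, total_value: float) -> float:
    """Fraction of the total that never reached the web server."""
    return 1 - backend_value / total_value


def summarize() -> dict:
    """Per month: share of visits, hits and bandwidth handled by Varnish alone."""
    result = {}
    for month, totals in varnish_total.items():
        result[month] = tuple(
            share_absorbed(b, t) for b, t in zip(backend[month], totals)
        )
    return result


if __name__ == "__main__":
    for month, (visits, hits, bandwidth) in summarize().items():
        print(f"{month}: visits {visits:.0%}, hits {hits:.0%}, bandwidth {bandwidth:.0%}")
```

Under those assumptions the shares come out close to the figures quoted above: a bit under 30% of visits, well over half the hits, and roughly half the bandwidth in both months.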
