Friday, July 04, 2008

Infrastructure projects

I've been running my own server for a year and a half now, and have been surprised at how trouble free it's been. I attribute this to:

  1. luck
  2. good planning
  3. a decent upstream provider
  4. the maturity of linux distribution maintenance tools (e.g. yum)

In this case, good planning means:

  1. keeping it as simple as possible
  2. doing things one at a time
  3. being the only one mucking about on it

And so this month, inspired by some Drupal camp sessions, I decided to take some time to make a good thing better. My goals were:

  1. Optimizing my web servicing for more traffic.
  2. Simplifying my Drupal maintenance.
  3. Automating my backups.

And here are the results ...

Web Servicing Optimizations

This was relatively easy - I just finished off the work from here:

Specifically, I discovered that I hadn't actually set up a MySQL query cache, so I did that. Then I discovered that it was pretty easy, and not dangerous, to remove a bunch of the default Apache modules. All I had to do was comment out the corresponding lines in the httpd.conf file. I also took out some other gunk in there that isn't useful for Drupal sites (multilingual icons, auto indexing).
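I did this by hand, but for anyone who'd rather script it, here's a minimal Python sketch of the idea: it comments out the LoadModule lines for a chosen set of modules. The sample config and the choice of autoindex_module are just for illustration, not my actual httpd.conf, and any directives that depend on a removed module would still need to be cleaned up separately.

```python
import re

def comment_out_modules(conf_text, unwanted):
    """Return conf_text with the LoadModule lines for unwanted modules commented out."""
    out = []
    for line in conf_text.splitlines():
        # Only active LoadModule lines match; already-commented lines are left alone.
        m = re.match(r"\s*LoadModule\s+(\S+)", line)
        if m and m.group(1) in unwanted:
            line = "#" + line
        out.append(line)
    return "\n".join(out)

# Illustrative config fragment (assumed, not copied from a real server).
sample = """LoadModule rewrite_module modules/mod_rewrite.so
LoadModule autoindex_module modules/mod_autoindex.so
LoadModule php5_module modules/libphp5.so"""

result = comment_out_modules(sample, {"autoindex_module"})
print(result)
```

Keep a copy of the original file and check the result with Apache's config test before restarting.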

I like to think that between those two changes the response time is even better, though the difference is marginal without much load. The real reason to do this is that slimmer Apache processes let me increase the number of available server processes without the risk of going into swap death. So I can now add more sites without fear.
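The arithmetic behind that is simple enough to sketch in Python. The numbers below are made up for illustration (I haven't measured mine this precisely): the idea is that whatever RAM is left after MySQL and the OS, divided by the size of one Apache process, gives a rough upper bound for a setting like MaxClients.

```python
def max_apache_clients(total_ram_mb, reserved_mb, per_process_mb):
    """Largest number of Apache child processes that fits in RAM without swapping."""
    return (total_ram_mb - reserved_mb) // per_process_mb

# Hypothetical server: 1024 MB of RAM, 384 MB kept aside for MySQL and the OS.
before = max_apache_clients(1024, 384, 32)  # assumed size with all default modules
after = max_apache_clients(1024, 384, 25)   # assumed size after trimming modules
print(before, after)  # prints: 20 25
```

So shaving a few megabytes off each process turns into a handful of extra simultaneous requests before the machine starts swapping.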

Simplifying Drupal Maintenance

I was converted to SVN (a version control program) 3 years ago and still love it. I've been using it to methodically track all the code, with individual repositories for each of my major projects, using the full trunk and vendor branch setup and the magic of svn_load_dirs.

But once a project starts using a lot of contributed modules, or when there are several security updates each year across several projects, this starts getting time-consuming.

So I've started NOT putting Drupal core or contributed modules into my svn projects, and I'm using one multi-site install for most of my sites. Along with the fabulous update_status module for Drupal 5 (which is in core for Drupal 6), keeping up-to-date is now much more manageable. It's also a change of mindset - I'm now more committed (pun intended) to the Drupal community. It means I can no longer hack core (at least not without a lot of work).

I also tested this whole scheme out by moving all my simple projects to a new document root that's controlled entirely via CVS checkouts on the server, with symlinks out to my individual site roots (which still go in svn, so I can keep track of themes, files and custom modules), and it worked well. There's actually a performance benefit here as well - by keeping all my sites on the same document root, the PHP cache doesn't fill up so fast, because the sites share one copy of the code. And it's more easily kept secure.
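Here's a rough Python sketch of the symlink part, assuming the usual Drupal multi-site layout where each site gets a directory under sites/ named after its domain. The paths and the example.com domain are hypothetical, and the demo runs in a throwaway temp directory rather than a real document root.

```python
import os
import tempfile

def link_site(docroot, domain, site_root):
    """Symlink docroot/sites/<domain> to a site directory kept elsewhere (e.g. an svn checkout)."""
    sites_dir = os.path.join(docroot, "sites")
    os.makedirs(sites_dir, exist_ok=True)
    link = os.path.join(sites_dir, domain)
    if not os.path.islink(link):
        os.symlink(site_root, link)
    return link

# Demo with hypothetical paths inside a temporary directory.
base = tempfile.mkdtemp()
site_root = os.path.join(base, "svn", "example.com")
os.makedirs(site_root)
link = link_site(os.path.join(base, "drupal"), "example.com", site_root)
print(link, "->", os.readlink(link))
```

The shared core stays in one place and gets updated once, while each site's own themes, files and custom modules live in their own checkout on the other end of the link.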

And as a final hurrah, I upgraded to Drupal 6. In the process, I've given up on the 'links' module, which I thought had some promise, and am now just using the 'link' module that defines link fields for CCK. I also started learning about the famed Drupal 6 theming, and tweaked the theme for fun.


Automating Backups

I back up to an offsite server using rsync, which seems to be a common and highly efficient way to do things for a server like this. Rsync is clever enough to send only file diffs, so load and bandwidth are kept to a minimum. My backups are not for users, they're only for emergencies, so I don't need hourly snapshots, only daily rsyncs.

Well, this works well for code, but not so much for MySQL. I'd been doing full mysqldumps and then copying them to my backup server, but this was not very efficient. So finally this week, with help from some simple scripts, I've set it up to use the --tab parameter to mysqldump, which dumps the tables in each database to separate files. This means that now when I run rsync on them, it's clever enough to only worry about the tables that have changed, which are relatively few each day. So now I've got daily MySQL backups as well, without huge load/bandwidth!
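This isn't my actual script, just a minimal Python sketch of the approach under some assumptions: the database names, the dump directory and the backup host are all hypothetical, MySQL credentials are assumed to come from somewhere like ~/.my.cnf, and each per-database dump directory is assumed to already exist and be writable by the MySQL server (which --tab needs, since the server itself writes the data files). The demo only collects the commands instead of running them.

```python
import subprocess

def dump_command(database, dump_root):
    """mysqldump command that writes per-table files for one database."""
    return ["mysqldump", "--tab=%s/%s" % (dump_root, database), database]

def rsync_command(dump_root, remote):
    """rsync command that mirrors the dump directory to the backup host."""
    return ["rsync", "-az", "--delete", dump_root + "/", remote]

def run_backup(databases, dump_root, remote, runner=subprocess.check_call):
    """Dump every database table by table, then sync the lot in one rsync run."""
    for db in databases:
        runner(dump_command(db, dump_root))
    runner(rsync_command(dump_root, remote))

# Dry run: collect the commands rather than executing them.
# All names below are hypothetical placeholders.
commands = []
run_backup(["site_one", "site_two"], "/var/backups/mysql",
           "backup@backup.example.org:mysql/", runner=commands.append)
for cmd in commands:
    print(" ".join(cmd))
```

Dropping the runner argument makes it execute the commands for real, at which point it could be run from a daily cron job.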

And that also means I can now use my backup as a place to pull copies of code and databases when I want to set up a development environment.


Which takes me almost to a new topic, but it's also about infrastructure, so here it is. I've been running little development servers for several years. My main one I actually found being thrown out (it was a Pentium II). They have served me well, but I've been rethinking my strategy, mainly because of power: I'm not happy that I have to use so much electricity for them (and as older servers, their power supplies aren't very efficient). And since one of them is actually in my office, it's fine in the winter when my office is cold, but really not good in the summer when I'm trying to stay cool.

And so the promise of virtualization lured me into believing I could run a little virtual server off my desktop. I tried Xen, but it broke my wireless card (because I have to run it using ndiswrapper), so I finally gave up and installed VMware (because it was in an Ubuntu-compatible repository), even though it's not really open source.

Does it work? Well, so far so good.