The Metablog

The blog about the blogs!

Improving WPMU

This page is still under revision and should be considered an incomplete draft.

This site is powered by WPMU, or WordPress Multi-User. I’m pretty much obsessive-compulsive, so I’ve done a lot of tweaking. There are two main areas where WordPress (both WPMU and mainstream WordPress) is sorely lacking: performance and spam handling. Fortunately, both are easy to fix. One thing I want to stress: it’s my opinion that WordPress, out of the box, is not production-ready!

Performance

Here’s why WordPress isn’t production-ready out of the box: the performance is abysmal. What’s worse is that it’ll seem to work okay until you get a bunch of visitors, at which point the server will fail very, very early.

Andrew first pointed out the problem, and it’s been verified across a couple of different installations. Using Apache’s ab2 tool to run 1,000 consecutive requests, we get the following:

Requests per second: 3.62 [#/sec] (mean)
Time per request: 276.348 [ms] (mean)
Time per request: 276.348 [ms] (mean, across all concurrent requests)
Transfer rate: 45.99 [Kbytes/sec] received

WordPress, out of the box, isn’t even able to handle 4 requests a second. (It falls apart even more if there are multiple concurrent requests.) By comparison, requesting a static image returns over 1,000 ‘pages’ per second with ab2. Four requests per second is horrible. Fortunately, things can be much better without too much effort.
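To make the arithmetic behind those numbers explicit, here is a small sketch of how the mean time per request relates to throughput when requests run one after another. The function name is mine, not something ab2 provides; the only input taken from the run above is the 276.348 ms figure.

```python
def requests_per_second(mean_ms_per_request: float, concurrency: int = 1) -> float:
    """Throughput implied by a mean per-request time (hypothetical helper)."""
    # One second is 1000 ms; with N requests in flight at once,
    # N requests complete in each mean_ms_per_request window.
    return concurrency * 1000.0 / mean_ms_per_request


rate = requests_per_second(276.348)        # figure from the ab2 run above
total_seconds = 1000 * 276.348 / 1000.0    # time to serve all 1,000 requests

print(f"{rate:.2f} requests/sec")
print(f"{total_seconds / 60:.1f} minutes for 1,000 requests")
```

In other words, the benchmark spent over four and a half minutes serving 1,000 page views from an otherwise idle server.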

WP Super Cache

There’s a great page-caching plugin called WP Super Cache.

The more common plugin is called WP-Cache, but it doesn’t work nearly as well as WP Super Cache. I have a page with some benchmark data here. Between WP Super Cache and PECL-APC (which caches the ‘compiled’ version of PHP pages), I can now handle almost 600 pages a second, up from fewer than 4. Best of all, WP Super Cache takes 5 minutes, at most, to get running, yet the results are dramatic.

MySQL Query Caching

Enabling query caching has not increased the performance of WordPress directly, and there’s some overhead in enabling it. Most of the queries WordPress issues seem to include the current date and time, which means that they’re not really cacheable. However, I do have query caching turned on here since it comes in handy for some other code I run. Here’s a page on how to enable it.

Spam

The other big problem I have is with spam. Once your blog gets picked up by the search engines, you’ll get flooded with it. I have a few different weapons here.

Block Comments with no Referrer

Watching the logs when spammers post, I noticed that they issue a POST with their spam without ever doing a GET for the form! This is a nonsensical condition for a real visitor. I contemplated trying to write a plugin to log every GET of a form and check whether each POST had a matching GET in the logs, but it’d be complicated and add a lot of overhead. It turns out there’s a simpler way.

The WordPress Codex site has a neat mod_rewrite rule. It intercepts requests for wp-comments-post.php and, if they don’t have an HTTP referrer from your domain, redirects the request to the requester’s own IP. Turning them on themselves isn’t strictly required, but it will probably throw them for a loop. The real point is that, by redirecting them to their own IP, you get them off of your site and prevent them from leaving spam!
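The real rule lives in Apache’s configuration, and I’m not reproducing the Codex text here. As a rough illustration of the decision it makes, here is a sketch in Python. The function name, its parameters, and the choice to accept subdomains of your own domain are all my assumptions for the example, not part of the Codex rule.

```python
from urllib.parse import urlparse


def is_suspicious_comment_post(method, path, referer, own_domain):
    """Return True if a request looks like a direct spam POST (illustrative only)."""
    # Only POSTs to the comment handler are of interest.
    if method.upper() != "POST":
        return False
    if not path.endswith("wp-comments-post.php"):
        return False
    # A missing or unparseable referrer yields an empty host.
    host = urlparse(referer or "").hostname or ""
    # Assumption: subdomains count as "your domain" (typical for WPMU blogs).
    from_own_site = host == own_domain or host.endswith("." + own_domain)
    return not from_own_site


# A POST with no referrer at all, as seen in the logs:
print(is_suspicious_comment_post("POST", "/wp-comments-post.php", None, "example.com"))
# A POST coming from a post page on the same site:
print(is_suspicious_comment_post(
    "POST", "/wp-comments-post.php", "http://example.com/some-post/", "example.com"))
```

Keep in mind that the referrer header is trivially forged, so this only stops the lazy bots. In my logs, that seems to be most of them.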

Akismet

There’s also a spam plugin for WordPress called Akismet. It’s extremely effective, but there’s one problem that we WPMU users face: you need an API key from WordPress.com, and that API key will only protect one blog on your site. While you can cheat in the source code, the Akismet developers have stated that this sticks out like a sore thumb in their logs and results in them banning you entirely for abusing the service. So I have my users register individually if they want to use the plugin. I protect some of the most spam-attracting blogs.

I also have some code to run a daily query on the database that selects all IPs that were flagged as having posted spam in the past 48 hours. The SQL looks something like this:

SELECT DISTINCT(comment_author_IP) FROM wp_1_comments WHERE comment_approved='spam' AND comment_date > DATE_SUB(NOW(), INTERVAL 2 DAY)

I then take these IPs and ban them using iptables. (I was formerly sticking them in /etc/hosts.deny, before realizing that Apache doesn’t ever use this file!)
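My actual script isn’t shown here, but the second step could be sketched roughly as follows: take the list of IPs the query returned and turn each one into an iptables command. The function name is made up for this example, and appending a DROP rule to the INPUT chain is an assumption about how you’d want to ban them. The sketch only builds the command strings; it doesn’t run anything.

```python
import ipaddress


def iptables_drop_commands(ips):
    """Build one DROP command per valid, unique IP (hypothetical helper)."""
    commands = []
    seen = set()
    for raw in ips:
        try:
            # Validate first: comment_author_IP is just a string column.
            addr = ipaddress.ip_address(raw.strip())
        except ValueError:
            continue  # skip anything that isn't a real IP address
        if addr in seen:
            continue
        seen.add(addr)
        # Assumption: IPv6 addresses go through ip6tables instead.
        tool = "iptables" if addr.version == 4 else "ip6tables"
        commands.append(f"{tool} -A INPUT -s {addr} -j DROP")
    return commands


# Example with documentation-range addresses, not real spammers:
for cmd in iptables_drop_commands(["203.0.113.5", "not-an-ip", "203.0.113.5"]):
    print(cmd)
```

Validating the addresses matters because anything pulled out of the database would otherwise end up on a root command line unchecked.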