WPMU and APC Error

One of the worst types of errors to track down, I think, is one that happens once in a blue moon and doesn’t seem to happen in response to anything in particular. Here, I’ve been trying to hunt down why, every month or two, WordPress starts serving nothing but blank pages, yet not logging any errors, and why restarting Apache fixes the problem.

The main page isn’t affected, since it doesn’t use WordPress or APC (it’s some custom code I wrote that goes right to the database), but every other page on the site comes up blank. Server-wise, everything is fine: no parameters at all are different. The load’s low, nothing’s been changed, memory usage is fine, and so on. The logs seem to suggest that pages are being served just fine.

I just made a small bit of progress: logging into the APC (Alternative PHP Cache, not the UPS company) console and flushing the opcode cache fixes the problem. I always had a hunch it involved a cache somewhere, but I looked more at WP Super Cache than at APC. I still haven’t solved the problem, but now I know where to look.

Checklists

I came across this article in the New Yorker’s Annals of Medicine column, and found the conclusion to be pretty amazing. The article reminds me a lot of a Malcolm Gladwell piece, in that it’s a bunch of fascinating statistics about a subject that most writers would struggle to make sound interesting, supported by a handful of equally-interesting stories.

It makes the case for checklists in hospitals. I initially assumed that “checklist” was like most medical terms, in that the name intuitively elicits imagery of something entirely unrelated to what the term actually describes. But that’s not the case here. It talks about having a five-item checklist for the steps to take while installing a “central line” on a patient, something that’s a cakewalk for trained surgeons. Many doctors protested, finding it demeaning. But the results?

The results were so dramatic that they weren’t sure whether to believe them: the ten-day line-infection rate went from eleven per cent to zero. So they followed patients for fifteen more months. Only two line infections occurred during the entire period. They calculated that, in this one hospital, the checklist had prevented forty-three infections and eight deaths, and saved two million dollars in costs.

I haven’t stolen the article’s thunder, either: it gets better. Maybe it’s just for people like me, who find boring things fascinating, but I can’t help but be struck by how something so ridiculously simple that doctors are offended by it can save so many lives and so many millions in costs.

By the way, this all comes from a silly Ask MetaFilter thread, What are innovative ideas for healthcare to save money, increase efficiency and improve outcomes?

Bad Blogs

When I started blogging many years ago, I started by posting mundane crap. “Today was boring and I had a quiz in Algebra.” I soon realized something, though: no one cares. Heck, even I didn’t care. Yet I still find blogs that are personal journals on the Web, making no effort to be interesting, or even explain who Sally and Freddy are when discussing the drama between them at lunch.

But what I find myself noticing increasingly often is people who share their opinions on everything. I haven’t been shy about my opinion on who was more qualified to govern us, or on what websites are worth visiting. But how about the lady who just had octuplets on top of already having six kids? Yeah, she sounds crazy. But who cares about my opinion about some lady having babies? If she can raise them, great. If not, well, shame on her, but that holds true for any person having any quantity of babies, anywhere. Does anyone care about my opinion on that? For that matter, isn’t it creepy for me to have an opinion about the parenting prowess of a woman I’ve never met, and whose name I don’t know, or even care to know?

STORIES BY DAVE SECRETARY

I’m really at a loss to describe the what or how, or the why they’re funny, but these stories by davesecretary are oddly hilarious. The all caps is annoying for a minute, but then you start to get used to it, and it becomes just another entertaining aspect of the stories. I happen to think the first few are so-so, but they quickly get better. I guess you have to be in the right mood to appreciate them.

Maybe not quite as innately hilarious as David Sedaris’ Six to Eight Black Men story, but still, an entertaining read if you’ve got some time to kill.

ISPs and Mirrors

Here’s something I’ve never understood… Why don’t ISPs run mirrors of popular things for their clients? I’m having Debian update its package info, and it’s taking a while because it’s seemingly using a crappy mirror. I can customize it–and will do so later–but I’m left wondering…

From my perspective, it’d be great, because it’d be closer to me and presumably faster. But from my ISP’s perspective, it’d be an even bigger win, because it would be bandwidth that never left their network, lowering their bandwidth bills. I’m sure that work with various Linux distros doesn’t account for that much of Comcast’s (or any other ISP’s) bandwidth, but I’m equally sure that I’m not the only Comcast user that ran an “apt-get update” today on Debian Lenny. (In fact, don’t Debian and Ubuntu desktops both go out daily to automatically update package listings?)

And it’s not like it’s huge overhead, either. Set up a single server with a few hundred gig of disks, and it’ll merrily keep everything up to date on its own via rsync. Put one in each of your major POPs and you’re done. Maybe $10,000 invested total.

As long as you’re at it, set them up with ntp. It’s always seemed like something an ISP should do. Those are much more accurate if they’re closer. (Northeast Comcast users should note that Comcast appears to peer with MIT; MIT’s public bonehed.lcs.mit.edu is about 8ms away from me.)

Cruft

I’m a pretty firm believer that, after several years, computers build up enough cruft that you need to start from scratch. With Windows the machine has gotten unbearably slow and there’s never enough disk space. With Linux, you’re running something really old, or just itching to try something new. With any OS, you’ve got a bunch of strange problems that have come up and you’ve just come to accept as normal.

These new installs tend to be great excuses for getting new hardware, too. Might as well hold off for a new hard drive if you’re short on disk space, and that way you don’t have to wipe anything. And you might as well upgrade to 4GB of RAM before you install a new OS.

I’m at a crossroads, though. I live in a UNIX world. I work on Linux at work all day. I come home and my computer runs Linux. I work on my website, running on a Linux machine. I love Linux, but it’s a little bit of a love-hate relationship. It has a few quirks that get under my skin. But going back to Windows would make no sense for me. I suppose I could maintain Linux boxes from a Windows machine, although it’s silly. But I’m way more comfortable in Linux these days anyway.

A lot of my coworkers run Macs, and it’s something I’ve been tempted by for a long time. It’s stable, slick, and it’s based on BSD. What’s not to like? Well, what’s not to like is the price. “OSx86” solves this (although it’s effectively software piracy), but sources say that it’s kind of like trying to install Linux 10 years ago: you’d better know every piece of hardware in your machine, and be comfortable finding and installing drivers for it.

My other battle is whether I want a laptop or a desktop. I love my Thinkpad, but I want a much bigger drive (RAID, really), and a 14.1″ LCD is comically small. And I’d like to start doing more with virtual machines, but this would require some more RAM.

I played with pricing. I can build a quad-core system with a few big SATA disks and onboard RAID (I think RAID 5 is even an option), 8 GB RAM, and a 22″ LCD for under a thousand dollars. That’s less than the cost of an entry-level Mac. (Excluding the Mini…)

But at the same time, I found a few potentially great upgrades to my laptop. A 128 GB SATA disk can be had for $200-300, and it’d cost about $50 to upgrade my Thinkpad to 4GB RAM. (From 2GB.) The jury’s still out on whether a Thinkpad T60 will see the full 4GB; some reports say that something in the BIOS or part of the chipset won’t go past 3, but others seem to suggest that the people saying that are just running OSs that can’t see 4GB on a 32-bit box.

So I’m more confused than ever about what I want. Should I upgrade my laptop or buy a new desktop? And, whatever I do, what is it going to run?

MySQL Replication Lessons Learned

A couple things I ran into today that I want to keep searchable here in case I run into them again, and that I figured might be useful to someone else someday:

Let’s say that you take down a MySQL server that’s a replicated slave to do a memory upgrade, and it takes a really long time to shut down, and then you find that the machine doesn’t like the “new” DIMMs, so you throw the old ones back in and power it up. Just hypothetically. You then restart MySQL and issue the START SLAVE command, but it dies with an error:

090127 14:53:17 [ERROR] Failed to open the relay log './mysqld-relay-bin.000023' (relay_log_pos 23726)
090127 14:53:17 [ERROR] Could not find target log during relay log initialization

The relay log and position were both wholly wrong. I poked around, and found a lot of people who ran into this; it seems to be a data corruption issue, but also happens occasionally on a reboot. There’s a bunch of suggested fixes out there that don’t actually work. One thing that does work, though, is deleting the relay logs on the slave. (Any time someone on the Internet tells you to delete a file, you should, of course, think “move to another directory” instead so you can undo it if need be.) Once I deleted the relay logs, it started right up.
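Here is a minimal sketch of the “move, don’t delete” approach. The file pattern is taken from the error message above; the directory paths in the usage comment are hypothetical, and the sketch assumes MySQL is stopped while the files are being moved.

```python
import glob
import os
import shutil


def move_relay_logs_aside(data_dir, backup_dir, pattern="mysqld-relay-bin.*"):
    """Move relay log files from data_dir into backup_dir instead of deleting them.

    Returns the sorted list of file names that were moved, so the move can be
    undone by hand if it turns out not to help.
    """
    os.makedirs(backup_dir, exist_ok=True)
    moved = []
    for path in sorted(glob.glob(os.path.join(data_dir, pattern))):
        name = os.path.basename(path)
        shutil.move(path, os.path.join(backup_dir, name))
        moved.append(name)
    return moved


# Hypothetical usage (both paths are assumptions, adjust for your machine):
# move_relay_logs_aside("/var/lib/mysql", "/root/relay-log-backup")
```

Anything that doesn’t match the pattern is left alone, so the actual table data is never touched.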

Lesson #2? Now you’re about 6,000 seconds behind the master, and the replication lag counter is going down at a rate of about 1 second per second. You can wait a couple hours. That seems pretty pathetic, though.
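The “couple hours” figure is just division. As a rough sketch, assuming the lag keeps draining at a constant rate (the function name is made up for illustration):

```python
def catch_up_seconds(lag_seconds, drain_per_second):
    """Wall-clock seconds until replication lag reaches zero.

    drain_per_second is how many seconds of lag disappear per wall-clock
    second; it is assumed to stay constant, which real replication won't
    guarantee.
    """
    if drain_per_second <= 0:
        raise ValueError("the slave is not gaining on the master")
    return lag_seconds / drain_per_second


minutes = catch_up_seconds(6000, 1.0) / 60
print(f"{minutes:.0f} minutes")  # 100 minutes
```

Doubling the drain rate halves the wait, which is why tuning the slave is worth a look.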

If your to-do list reads, “1.) Get slave running, and then 2.) Fine-tune my.cnf, currently stolen from another machine,” there’s a chance that you have sync_binlog=1 set. This is bad for two reasons. The first is simply what it’s designed to do: flush the binlog to disk on every write, which is very safe but also very slow. The second is that it’s apparently especially slow on ext3, so it’s doubly important not to use this option, at least not when write performance is important.

Mastering Technology

If you ever read a technical discussion board, you’ll quickly come to the realization that the breakdown of people is maybe 90% people who have kind of figured out how to use the technology, 7% people who are power users, and 3% people who are experts. It’s an arbitrary breakdown, but it seems about right intuitively.

Consider something like Excel. Most people can use it to keep tabular data, and most of those even know how to calculate sums. But very few might have a clue how to use Pivot Tables or formulas spanning multiple sheets, and even fewer will know how to extend it.

MySQL is definitely the same way. A lot of MySQL users can install it on their server and make phpBB use it. They might not understand what MyISAM and InnoDB are or how they’re different, much less the pros and cons of each. And even fewer could make a halfway decent DBA. But the good news with MySQL is that some of that elite 3% of experts are very, very vocal, and doing really, really neat things. Jeremy Zawodny is the first name that comes to mind, and check out his The New MySQL Landscape post. And don’t miss Percona’s announcement of their GPL’ed XtraDB, a replacement for InnoDB that’s supposed to be optimized for performance on more powerful machines. Seems like it’s very new and meant for MySQL 5.1, which some pretty smart people have said isn’t ready for prime-time. One of the MySQL guys at Google has a post about his patches to make MySQL better scale to ‘big iron’ type systems, too. And then there’s Our Delta (found on the Jeremy Zawodny blog) which distributes various patched versions of MySQL. Some are especially interesting to me, like Fast Master Promotion which is designed to allow a slave MySQL box to be promoted to master pretty much instantly, or the KILL IF IDLE command, allowing you to issue KILL statements to a connection and have them not affect non-idle connections. UserStats would be really helpful to run on a development machine to see what your code is impacting.

Indiana Jones

People who know me well will know that I’m not generally fond of movies. I can think of maybe a half-dozen movies I’ve seen lately. Borat is the only one I can think of that I’d recommend. Most movies just turn out to be a waste of my time. So I’m maybe not the normal movie-watcher.

That said, I’d like to review the latest Indiana Jones movie, which I watched last night. I’ll keep it nice and brief: F Minus. Made no sense.

I’m not sure how it started, but I had recently read something about greywater recycling, re-using water from things like your washing machine (as opposed to something gross, like your toilet) for other uses. It’s sometimes done in small homes, diverting the drain from your shower into your garden or the like, but it’s also done in big commercial places in the desert, where it’s filtered more heavily and reused.

At times, Indiana Jones got boring enough that I pulled out my iPhone and started reading some pages on greywater recycling. Did you know that you shouldn’t store greywater for more than 24 hours? As the action picked up a little more, I’d put it away, only to become seriously bored again and turn back to reading about recycling greywater on Wikipedia.

I don’t recall the last Indiana Jones movie I saw, but I seem to recall him as being a sort of macho, wild west hero who rides horses, kills bad guys with his six-shooter and a whip, and defends ancient historical sites. Good ol’ Americana that makes sense, albeit totally unrealistic. (He must have had about 10,000 bullets fired at him, and not a single one hit him: no special skill was involved, he was just running away and somehow machine gun fire from many guns never, ever made contact with him.)

But this one ended with a magnetized quartz (huh?) skull turning into an alien, which formed a giant UFO-vortex that sucked up the evil Russian lady, and turned what looked a lot like Machu Picchu into an ocean. And then, the end.

If anyone’s thinking of seeing it, I’d recommend you instead stay home. Here is the Wikipedia page on greywater, which includes some good links. Sure, I can think of much more interesting things to do than read up on greywater recycling. But watching this Indiana Jones movie isn’t one of them.

Sure, it had a few scenes that may have beat greywater recycling, but on average, it was slightly less interesting than reading about greywater recycling and how the various plumbing codes in the US regulate it. (Spoiler alert: some plumbing codes permit it, some do not, and most allow it with heavy regulation that usually makes it neither cost-effective nor environmentally friendly.) But besides being slightly more interesting, greywater has the advantage of making sense. Halfway through reading about greywater, it’s not suddenly going to become a magnetic skull made out of quartz, and it definitely won’t spontaneously turn into an alien, form a giant vortex, and flood Machu Picchu, sucking the evil Russians “into another dimension, the space in between spaces.”

The trouble with SPF

Unlike SAV (also known as challenge-response systems), SPF is generally a decent idea. Basically, you publish a DNS record for your domain that lists what IPs are allowed to send mail from your domain. This means that you can say that mail sent from the host ‘mail.yourdomain.com’ is valid, but if a spammer sends mail from a random hijacked box in Tijuana, it will be rejected via SPF. It doesn’t target spam directly, but rather, it targets spam that spoofs the domain. (Which is probably a very good percentage of spam.)

But I’ve recently noticed a problem I hadn’t considered before: forwarders. I can easily set up e-mail addresses on my n1zyy.com domain that will simply point elsewhere. So mail sent to helen@n1zyy.com (which is actually a spamtrap; don’t e-mail it) might just be automatically redirected to another e-mail address, say john.doe@example.com. The headers are rewritten so that the whole thing is transparent.

The problem is that, with SPF, the mailserver that redirects the mail is effectively “forging” the headers, which means that SPF will block it. If example@hotmail.com sends an e-mail to helen@n1zyy.com, and it gets redirected to john.doe@example.com, it will fail if Hotmail has an SPF record. This is because example.com gets mail saying it’s from hotmail.com, but it actually arrives from the n1zyy.com mailserver, whose IP isn’t listed in Hotmail’s SPF record.
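To make the failure concrete, here is a deliberately simplified sketch of the check a receiving server performs. It only understands ip4: entries (real SPF has many more mechanisms), and the record and IP addresses are made up from documentation ranges, not anyone’s real configuration.

```python
import ipaddress


def spf_allows(spf_record, sending_ip):
    """Return True if sending_ip matches an ip4: entry in spf_record.

    Simplified on purpose: only ip4: entries are understood, and anything
    that doesn't match counts as a failure (as if the record ended in -all).
    """
    ip = ipaddress.ip_address(sending_ip)
    for term in spf_record.split():
        if term.startswith("ip4:"):
            network = ipaddress.ip_network(term[len("ip4:"):], strict=False)
            if ip in network:
                return True
    return False


# Hypothetical values, standing in for the example in the text.
SENDER_SPF = "v=spf1 ip4:192.0.2.0/24 -all"  # what the sender's domain publishes
SENDER_SERVER = "192.0.2.10"                 # one of the sender's own mailservers
FORWARDER = "198.51.100.7"                   # stands in for the n1zyy.com mailserver

print(spf_allows(SENDER_SPF, SENDER_SERVER))  # True: delivered directly
print(spf_allows(SENDER_SPF, FORWARDER))      # False: the forwarded copy
```

The message is identical in both cases; the only thing that changed is which machine handed it over.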

There are a few workarounds, but most aren’t sustainable:

  • The person running the original domain could add an SPF record for the mailserver doing the forwarding. This is all well and good if you’re sending mail from n1zyy.com and wanted me to whitelist the Comcast mailserver or something, but consider the example I used, in which case you’d have to call up Hotmail and ask them to add mail.n1zyy.com’s IP to their SPF record. They’d laugh at you.
  • The recipient mailserver could override it. You could tell your example.com mailserver that, if the header says n1zyy.com, you shouldn’t check the SPF record. Again, good luck with this, unless you run the mailserver. Also, this is getting into “I could probably hack the Postfix source code to do that…” material.
  • The recipient mailserver could be configured so that SPF will check to see if any of the mailservers along the way are listed in the SPF record, and, if so, accept the mail. This sounds like a good idea to me, to be honest, but it’s deviating a bit from what SPF was meant to do.
  • The originating mailserver could stop using SPF, and this problem would go away. But then someone would send out a hundred million spam e-mails claiming to be from that domain, and they’d all go through.
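The third workaround, accepting the mail if any server along the way is listed, could be sketched roughly like this. It is not how SPF is defined; it only handles ip4: entries, the addresses are made up, and it takes the list of hops at face value, which a real server couldn’t safely do for hops it doesn’t control.

```python
import ipaddress


def ip_in_spf(spf_record, ip_text):
    """True if ip_text falls inside any ip4: entry of spf_record (simplified)."""
    ip = ipaddress.ip_address(ip_text)
    return any(
        ip in ipaddress.ip_network(term[len("ip4:"):], strict=False)
        for term in spf_record.split()
        if term.startswith("ip4:")
    )


def any_hop_allowed(spf_record, hop_ips):
    """Accept the mail if at least one server on its path is in the record."""
    return any(ip_in_spf(spf_record, hop) for hop in hop_ips)


# Hypothetical record and path: the forwarder handed over the mail, but the
# sender's own server appears earlier in the chain.
SENDER_SPF = "v=spf1 ip4:192.0.2.0/24 -all"
PATH = ["198.51.100.7", "192.0.2.10"]  # forwarder first, then the origin

print(any_hop_allowed(SENDER_SPF, PATH))      # True
print(any_hop_allowed(SENDER_SPF, PATH[:1]))  # False
```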

Clearly, this is the type of thing that everyone is thinking about on a Friday night.