Filesystems

On my continuing obsession with squeezing every bit of performance out of this system… They say that Linux filesystems don’t get fragmented. I never understood this. It’s apparently smarter about where files are placed. But still, frag-proof? If it was that easy, other filesystems would have figured it out long ago too. I figured that the explanation was just over my head. In reality, the “explanation” is that it’s a myth.

oxygen bin # fragck.pl /home
2.19458018658374% non contiguous files, 1.03385162150155 average fragments.
oxygen bin # fragck.pl /var/log
56.3218390804598% non contiguous files, 28.9425287356322 average fragments.
oxygen bin # fragck.pl /var/www/
1.45061443222766% non contiguous files, 1.05527580153377 average fragments.
oxygen bin # fragck.pl /etc
2.18023255813953% non contiguous files, 1.05450581395349 average fragments.
oxygen bin # fragck.pl /var/lib/mysql/
16.5424739195231% non contiguous files, 2.93740685543964 average fragments.

The results kind of make sense: /var/log is full of files that constantly get a line or two appended, so it only stands to reason that, if the filesystem isn’t very careful, fragmentation would build up. The other one is /var/lib/mysql, where MySQL stores its data. It’s the same deal as /var/log, really, in that it’s continually adding to existing files.

/var/log/messages, the system log file, is in 75 pieces. Its backup, messages.1.gz, was in 68.
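For a second opinion on per-file counts like these, e2fsprogs ships a filefrag tool that reports how many extents a file occupies. Here is a rough sketch of ranking files by that number. The helper name rank_by_extents is made up for illustration, and it assumes filefrag prints lines of the form "path: N extents found".

```shell
# Hypothetical helper: read `filefrag` output on stdin and print
# "<extents> <path>" lines, most fragmented first.
# Assumes each input line looks like "path: N extents found"
# (or "path: 1 extent found"); anything else is skipped.
rank_by_extents() {
    sed -n 's/^\(.*\): \([0-9][0-9]*\) extents\{0,1\} found.*$/\2 \1/p' | sort -rn
}

# Example (probably needs root to read everything under /var/log):
#   find /var/log -type f -exec filefrag {} + 2>/dev/null | rank_by_extents | head
```

An extent count isn’t necessarily the same thing fragck.pl calls a fragment, so treat the two sets of numbers as comparable only in spirit.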

Realistically the performance hit is negligible. It’s not like a core system file is in hundreds of pieces. (Like, say, the paging file!) /bin has very low fragmentation. Log files can be fragmented and not impact anything. (Except my OCD.) Although I am concerned about MySQL’s data stores building up fragmentation. In theory I can bring the database down and shuffle the files around, but it’s probably best left alone right now.

Fortunately, there’s hope… By moving a file to another partition, you force it to a new physical location. Something like mv messages /tmp/ramdisk && mv /tmp/ramdisk/messages . will cause the file to be rewritten. (Granted, this particular command was an awful idea: syslog-ng keeps /var/log/messages open, and doesn’t like it when the file randomly disappears. The fact that it was only gone for a split-second doesn’t change the fact that the file’s location has changed.) Although don’t get too excited about this: for some reason, fragmentation sometimes ends up worse! access_log was in 60 pieces. Now it’s in 76.
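If the goal is just to get the file rewritten, a copy within the same directory followed by a rename should do it without needing a second partition. This is only a sketch (rewrite_file is a made-up name), and it has the same catch as the mv trick: a daemon that holds the file open keeps writing to the old copy until it’s restarted or told to reopen its logs. There’s also no guarantee the new copy ends up in fewer pieces.

```shell
# Hypothetical helper: rewrite a file by copying it to a temporary file in
# the same directory, then renaming the copy over the original.
# The copy gets freshly allocated blocks; the rename stays on one filesystem.
rewrite_file() {
    src=$1
    tmp=$(mktemp "$(dirname "$src")/.rewrite.XXXXXX") || return 1
    # -p keeps mode and timestamps (and ownership, when run as root)
    if cp -p "$src" "$tmp"; then
        mv "$tmp" "$src"
    else
        rm -f "$tmp"
        return 1
    fi
}

# Example (stop whatever is writing to the file first):
#   rewrite_file /var/log/apache2/access_log   # path is an assumption
```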

I’ve also heard it said that some fragmentation isn’t necessarily a bad thing: a few files close together on the disk with light fragmentation is better than frag-free files on opposite ends of the disk. But that doesn’t satisfy my OCD. I guess the moral of the story is to not muck around too much with things. Or, “if it ain’t broke, don’t fix it!”

M/S Explorer Crashes Again

M/S Explorer has crashed.

For added irony, they were in penguin territory at the time.

(One wonders about the view out the Windows now that the ship is on its side–they’re most likely blue screens! No Word on whether that is the case, of course, but I will say that the ship’s Outlook isn’t so good. Fortunately, because the rescuers Excel at what they do, passengers were able to Exchange their rooms for ones on a stable ship. Because there were no fatalities, this was not FrontPage news, except on Digg.)

Speeding up MySQL with tmpfs?

I’m still getting a decent percentage of temporary tables being created on disk for queries, even though my tmp_table_size is an astonishing 128MB. (The whole blogs database uses about 6MB of disk.)

The problem is described here: temporary tables with TEXT or BLOB columns apparently can’t be held in memory, so they get written to disk no matter how big tmp_table_size is. This page explains it further.

The problem is that… These are blogs. Aside from some trivial cross-table type stuff, every single query touches TEXT columns. Interestingly, the solution everyone proposes is using a ramdisk. I was somewhat concerned about using a ramdisk, though: for one, the procedure for creating it looked somewhat arcane, and one person writing about it mentioned that his 16MB of ramdisk was almost as big as his 20MB hard drive. I think of my old 20GB hard drive as ridiculously old. The other reason, though, is that a ramdisk is scary: it’s a finite size. I’d love something like a 1GB ramdisk for /tmp, but I don’t even have a gig of RAM, much less a gig to allocate for file storage.

Enter tmpfs. In a nutshell, it’s like a ramdisk, but the size can be dynamic, and it can swap, which means I don’t have to worry about my 16MB tmpfs partition trying to store 17MB of data and blowing up. Creation was eerily easy:

# Make a directory to use as a mountpoint, and give it a misleading name
mkdir /tmp/ramdisk

# Mount it as type tmpfs
mount -t tmpfs tmpfs /tmp/ramdisk -o size=16M,mode=1777

In my.cnf, it’s as easy as changing tmpdir=/tmp/ to tmpdir=/tmp/ramdisk/.
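Since a manual mount like this doesn’t survive a reboot, it seems worth checking that the directory really is tmpfs before trusting MySQL’s temp files to it. A minimal sketch, assuming a Linux-style /proc/mounts; is_tmpfs is a made-up name, and the optional second argument exists only so the function can be pointed at a sample mounts table:

```shell
# Hypothetical helper: succeed (exit 0) if $1 is mounted as tmpfs.
# $1 = mount point, written without a trailing slash, as it appears in the table
# $2 = mounts table to read; defaults to /proc/mounts
is_tmpfs() {
    awk -v mp="$1" '$2 == mp && $3 == "tmpfs" { found = 1 } END { exit !found }' \
        "${2:-/proc/mounts}"
}

# Example:
#   is_tmpfs /tmp/ramdisk || echo "/tmp/ramdisk is not tmpfs" >&2
```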

And now, we let it run for a while and see how performance feels.

DB Stats

I’ve been playing with phpMyAdmin and doing a bit of optimization of it. A few stats:

  • Since I upgraded the kernel, MySQL has been up for a little under 3 days and 11 hours.
  • The DB server has moved 841 MiB of traffic. This is 10 MiB an hour.
  • It’s processed 131,048 queries. This is about 1,580 an hour.
  • 132,000 inserted rows.
  • 96K queries served out of MySQL’s query cache.
  • 1,393 temporary tables created on disk to handle queries. This seems like a bottleneck, although it is only a tiny percentage.
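The hourly figures are just the totals divided by uptime. A quick sketch of the arithmetic, assuming the uptime is rounded to a flat 83 hours (3 days and 11 hours); per_hour is a made-up helper name:

```shell
# Hypothetical helper: divide a total by an uptime in hours, one decimal place.
per_hour() {
    awk -v total="$1" -v hours="$2" 'BEGIN { printf "%.1f\n", total / hours }'
}

hours=$((3 * 24 + 11))     # "3 days and 11 hours" = 83 hours
per_hour 841 "$hours"      # MiB of traffic per hour
per_hour 131048 "$hours"   # queries per hour
```

That comes out to roughly 10.1 MiB and 1578.9 queries an hour, which lines up with the rounded numbers in the list.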

I’ve just restarted MySQL to apply some configuration changes. (Actually, I could have changed them on the fly now that I think about it…) I tweaked the settings a bit: MySQL allows you to set limits on how much RAM it can use for various operations, and I tend to be very frugal. But I think I was shooting myself in the foot there: it was relying on disk a bit too much. It’s not like I’m running a load average of 25 and am moving gigs of traffic a day, where tuning is really vital, but it still bothers me that it’s not as efficient as it could be.

No, No You Don’t

I periodically peruse access_log and error_log, Apache’s logfiles. There’s always weird crap. Here’s today’s (odd linebreaks to make it fit):

[Thu Nov 22 07:59:15 2007] [error] [client 80.32.3.251] File does not exist: /var/www/ardentdawn.org/htdocs/drupal, referer: http://72.36.178.236/drupal/?_menu[callbacks][1][callback]=drupal_eval &_menu[items][][type]=-1&-312030023=1
&q=1/<?passthru(%22echo%20IROCKTHEWORLD%22);

There’s also some inept script kiddie later on who tried requesting the same non-existent page 112 times in a row… o_O

Holidays

  • One of my classmates is from England. He was talking the other day about how, incredibly often, people here ask him if they celebrate the 4th of July in England.
  • Cinco de Mayo is not Mexico’s day of independence. It commemorates the date of a battle. More importantly, it’s just a minor regional holiday in Mexico. It’s nothing like our 4th of July. And Mexicans don’t make a big deal out of it! It’s arguably celebrated primarily in the United States.
  • Another foreign student at school mentioned that her home country (I think it’s Portugal) celebrates Thanksgiving. That’s really pretty strange?

Undoing bad tar files

Proper ‘etiquette’ for packaging a tar file is to package it so that it extracts to a directory. But sometimes, the files are packaged by an idiot, and, when extracted, they land right in whatever directory you’re in. (Which is fine, if you expect it.)

tar takes a “t” option (“t” for “list”–get it? I don’t…) to list (list?) the files in an archive. You can use it two ways:

  • Pre-emptively: tar ft file.tar will show you how it’d extract.
  • Retroactively: rm `tar ft file.tar` will list the files, and pass them as an argument to rm, deleting the mess it just made.
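The backtick version trips over filenames with spaces in them, so here is a slightly more careful sketch of both uses. The function names are made up, and neither copes with filenames containing newlines.

```shell
# Hypothetical helper: count distinct top-level entries in an archive.
# 1 usually means it unpacks into a single directory (entries that start
# with "./" all count as "."); more than 1 means it will spray files around.
tar_top_level_count() {
    tar tf "$1" | cut -d/ -f1 | sort -u | wc -l | tr -d ' '
}

# Hypothetical helper: delete the regular files an archive just extracted
# into the current directory. Directories are left alone, and any file the
# extraction overwrote is deleted too, exactly as with the rm one-liner.
tar_undo() {
    tar tf "$1" | while IFS= read -r name; do
        if [ -f "$name" ]; then
            rm -- "$name"
        fi
    done
}

# Example:
#   tar_top_level_count file.tar   # check before extracting
#   tar_undo file.tar              # clean up after a messy extract
```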

Focus

The other day my camera was in its “AF Hunt” mode, where it couldn’t seem to lock on focus. It’d focus past where it should be, and then turn around and focus back the other way, and just keep going. When you use the flash, it’s worse, because it’ll do a strobe flash to try to aid in focus, but it doesn’t help at all.

After a couple times of doing this, I finally got it focused, and just slid the switch on the lens from “AF” to “M,” disengaging automatic focus. It’d hold the focus that way, so it wouldn’t have to focus every time. (I was stationary, photographing something stationary, so there was no need to refocus every time.)

And then I put the camera down and bumped the lens, so the focus was off. So I just turned the focusing ring. And for the rest of the night, I left the camera in manual focus mode. I’ve found that I can do it just as quickly as the camera can focus the lens, and that’s when it works right: I don’t spin the focus back and forth ten times in a vain attempt to focus something.

Leaving it in manual focus also speeds up the shot: you press the shutter and it takes the picture instantly. There’s no waiting as it focuses.

So almost accidentally, I’ve become a fan of manual focus. Sometimes I’m lazy and want the camera to do it for me, but more often than not, I’m finding that I’d just as soon do it myself.

Geekery

One of my weird OCD concerns is that some of the scripts I host place a heavy load on the server. I want to make sure that, in busy times, they don’t weigh down things further. Here’s a neat little bit of PHP I wrote to simply have PHP abort the page load if the 1-minute load average is over 2.00:

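// Check the 1-minute load average first (the first field of /proc/loadavg)
$fh = fopen('/proc/loadavg', 'r');
$load = (float) fread($fh, 4); // first 4 bytes, e.g. "0.42"
fclose($fh);

// Bail out early if the server is already busy
if ($load > 2) {
    die("Sorry, we're too busy.");
}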

Rather than die(), you might throw a redirect to a cache or something else. And I should point out that, of course, running this code does take some CPU time… And that this script doesn’t always make sense: you’re basically forcing a failure before the server itself forces the failure. The time it makes sense is the way I’m using it: when some unimportant, tangential project requires inordinate resources and you want to make sure it doesn’t slow the server down excessively, at the expense of the more important projects (e.g., the blogs).
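The same guard works outside PHP, say at the top of a cron job. A minimal sketch in shell, assuming a Linux-style /proc/loadavg whose first field is the 1-minute load average; load_over is a made-up name, and the optional second argument exists only so the function can be pointed at a sample file:

```shell
# Hypothetical helper: succeed (exit 0) if the 1-minute load average exceeds $1.
# $1 = threshold
# $2 = file to read; defaults to /proc/loadavg
load_over() {
    awk -v max="$1" 'NR == 1 { over = ($1 > max) } END { exit !over }' \
        "${2:-/proc/loadavg}"
}

# Example: skip a heavy job when the box is busy.
#   if load_over 2; then echo "Sorry, we're too busy." >&2; exit 1; fi
```

If the file can’t be read, the function reports “not over”, so the job runs anyway; whether that is the right default depends on the job.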