Terrible Software

Two different things that boggled my mind today:

  • CCleaner offered to clean up Symantec’s log files. All 5 gig of them. (?!?!)
  • Team Fortress 2 just crashed after spending about ten minutes “loading.” It complained that there wasn’t enough memory and that I probably had the paging file disabled. The latter is true: I never recreated it after disabling it since it was in 600 pieces. But RAM? I’ve got 2 GB of it. If you can’t write code to fit in that, you deserve to be stuck in a lift. A burning lift. With a corpse.

Seriously, 2 GB RAM isn’t enough to load the game? And you need 5 GB of log files?

MiniAjax — An Awesome Site

Web developers, check it out. My one complaint is that this is an awkward assortment of things ranging from little JavaScript snippets to free (GPL) apps to proprietary, expensive applications. But there are some very cool ones in there. (Psst! Heatmap is running on this site! It’s going to take a while to build up enough data worth sharing, but I’ll let you know when the time comes.) Some of the other ones are going to make their way into some projects I’m working on.

Malus Fide

I’ve always liked the idea of rewarding douchebaggery with more douchebaggery. And one bit of douchebaggery that really bugs me is that my webserver is always getting requests for pages that have never existed. What’s going on is that people are probing for common vulnerabilities. I don’t have a /phpmyadmin, but I get multiple requests a day for it. (I do have PHPMyAdmin, but it’s up to date, secure, and at an obscure URL.) Same goes for awstats.

What I’ve always wanted to do is respond to these requests with complete garbage. Unending garbage. My long-time dream was to link a page to /dev/random, a “file” in Linux that’s just, well, random. (It’s actually a device, a software random number generator.) The problem is that linking it is full of problems, and, when you finally get it working, you’ll realize that the web server is smart enough to treat it as a device and not a file.

So I took the lazy route and just created a 500MB file. You use dd to copy data from a disk, with /dev/urandom as the input and a file with a .html extension as output. I had it read 500 “blocks” of 1MB. Granted, this is a total waste of disk space, but right now I have some spare space.
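For the curious, here is a minimal sketch of that dd step. The function name make_garbage and the output filename are made up for illustration, and iflag=fullblock assumes GNU dd; the post only specifies /dev/urandom as input, a .html file as output, and 500 blocks of 1MB.

```shell
#!/bin/sh
# make_garbage FILE COUNT: fill FILE with COUNT blocks of 1 MiB of random bytes.
# (make_garbage is a hypothetical helper name, not from the original post.)
make_garbage() {
    # iflag=fullblock (a GNU dd option) keeps a short read from shrinking a block.
    dd if=/dev/urandom of="$1" bs=1M count="$2" iflag=fullblock 2>/dev/null
}
```

Calling make_garbage garbage.html 500 would produce the 500MB file described above; a smaller count is a cheap way to try it out first.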

Of course, I was left with a few resources I was concerned about: namely, RAM, CPU time, and network activity. I use thttpd for this idiotic venture, which lets me throttle network activity. I’ve got it at 16 KB/sec right now. (Which is an effective 128 kbps.) This ensures that if it gets hit a lot it won’t put me over my (1,000 GB!) bandwidth allocation.
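The arithmetic behind those numbers can be sketched in a few lines of shell. The variable names are made up, and the sketch assumes 1 MB = 1024 KB and that the throttle holds steady at exactly 16 KB/sec.

```shell
#!/bin/sh
rate_kb=16     # throttle from the post, in KB per second
size_mb=500    # size of the garbage file, in MB (assuming 1 MB = 1024 KB)

kbps=$((rate_kb * 8))                    # bytes to bits
seconds=$((size_mb * 1024 / rate_kb))    # time to fetch the whole file

echo "${kbps} kbps"
echo "full download: $((seconds / 3600))h $((seconds % 3600 / 60))m"
echo "after 30 minutes: $((rate_kb * 60 * 30 / 1024)) MB"
```

In other words, a client that never gives up would be tied up for most of a working day on a single request.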

Apparently, though, this throttling solves the problem: at first glance, it looks like it’s smart enough to just read 16KB chunks of the file and send them out, as opposed to trying to read it into memory, which would kill me on CPU time and RAM. So the net result is relatively minimal resource utilization.

Currently, it’s just sitting there at an obscure URL. But my eventual plan is to set up a /awstats and a /phpmyadmin and a /admin and a /drupal and have them all throw a redirect to this file.

The other bonus is that, at 16KB/sec, if a human gets there, they can just hit “stop” in their browser long before a crash is imminent. But, if it works as intended, infected systems looking to spread their worms/viruses won’t be smart enough to think, “This is complete gibberish and I’ve been downloading it for 30 minutes now” and will derail their attempts at propagating.

It’s not in motion yet, though… But I’ll keep you posted.

Filesystems

On my continuing obsession with squeezing every bit of performance out of this system… They say that Linux filesystems don’t get fragmented. I never understood this. It’s apparently smarter about where files are placed. But still, frag-proof? If it was that easy, other filesystems would have figured it out long ago too. I figured that the explanation was just over my head. In reality, the “explanation” is that it’s a myth.

oxygen bin # fragck.pl /home
2.19458018658374% non contiguous files, 1.03385162150155 average fragments.
oxygen bin # fragck.pl /var/log
56.3218390804598% non contiguous files, 28.9425287356322 average fragments.
oxygen bin # fragck.pl /var/www/
1.45061443222766% non contiguous files, 1.05527580153377 average fragments.
oxygen bin # fragck.pl /etc
2.18023255813953% non contiguous files, 1.05450581395349 average fragments.
oxygen bin # fragck.pl /var/lib/mysql/
16.5424739195231% non contiguous files, 2.93740685543964 average fragments.

The results kind of make sense: /var/log is full of files where you’re constantly appending a line or two to various files, so it only stands to reason that, if the filesystem isn’t very careful, fragmentation would build up. The other one is /var/lib/mysql, where MySQL stores its data. It’s the same deal as /var/log, really, in that it’s continually adding to files.
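If you don’t have fragck.pl handy, a rough equivalent can be sketched around filefrag (from e2fsprogs). This is not how fragck.pl itself works, as far as I know; it is just a way to get the same two numbers. The function name frag_summary is made up, and it assumes filefrag’s usual “NAME: N extents found” output.

```shell
#!/bin/sh
# frag_summary: read `filefrag` output on stdin and print the percentage of
# files with more than one extent plus the average extent count.
# (frag_summary is a hypothetical helper name.)
frag_summary() {
    awk '
        # Lines are expected to look like "/var/log/messages: 75 extents found".
        match($0, /: [0-9]+ extents? found$/) {
            n = substr($0, RSTART + 2) + 0   # leading number of "75 extents found"
            files++
            total += n
            if (n > 1) fragmented++
        }
        END {
            if (files == 0) { print "no files"; exit }
            printf "%.2f%% non contiguous files, %.2f average fragments.\n",
                   100 * fragmented / files, total / files
        }
    '
}

# Usage (filefrag may need root on some filesystems):
#   find /var/log -xdev -type f -exec filefrag {} + | frag_summary
```

Empty files show up with 0 extents and pull the average down a little, so treat the result as a ballpark figure.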

/var/log/messages, the system log file, is in 75 pieces. Its backup, messages.1.gz, was in 68.

Realistically the performance hit is negligible. It’s not like a core system file is in hundreds of pieces. (Like, say, the paging file!) /bin has very low fragmentation. Log files can be fragmented and not impact anything. (Except my OCD.) Although I am concerned about MySQL’s data stores building up fragmentation. In theory I can bring the database down and shuffle the files around, but it’s probably best left alone right now.

Fortunately, there’s hope… By moving a file to another partition, you cause it to move physical locations. Something like mv messages /tmp/ramdisk && mv /tmp/ramdisk/messages . will cause the file to be rewritten. (Granted, this particular command was an awful idea: syslog-ng keeps /var/log/messages open, and doesn’t like it when the file randomly disappears. The fact that it was only gone for a split-second doesn’t change the fact that the file’s location has changed.) Although don’t get too excited about this: for some reason, fragmentation sometimes ends up worse! access_log was in 60 pieces. Now it’s in 76.
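A slightly tidier variant of the same trick is sketched below. It is an assumption on my part that a copy on the same filesystem is enough to get fresh blocks allocated (on copy-on-write filesystems cp may just share the old blocks), and rewrite_file is a made-up name. It still replaces the file with a new one, so it has the same problem with daemons that hold the file open.

```shell
#!/bin/sh
# rewrite_file FILE: copy FILE to a temporary file beside it, then rename the
# copy over the original. (rewrite_file is a hypothetical helper name.)
rewrite_file() {
    tmp="$1.rewrite.$$"
    # -p preserves mode, ownership and timestamps; mv only runs if cp succeeded.
    cp -p -- "$1" "$tmp" && mv -- "$tmp" "$1"
}
```

As with the mv version, there is no guarantee the new copy ends up in fewer pieces than the old one.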

I’ve also heard it said that some fragmentation isn’t necessarily a bad thing: a few files close together on the disk with light fragmentation is better than frag-free files on opposite ends of the disk. But that doesn’t satisfy my OCD. I guess the moral of the story is to not muck around too much with things. Or, “if it ain’t broke, don’t fix it!”

Speeding up MySQL with tmpfs?

I’m still seeing a decent percentage of temporary tables being created on disk by queries, even though my tmp_table_size is an astonishing 128MB. (The whole blogs database uses about 6MB of disk.)

The problem is described here: TEXT and BLOB queries apparently don’t like being in memory. This page explains it further.

The problem is that… These are blogs. Aside from some trivial cross-table type stuff, every single query touches TEXT columns. Interestingly, the solution everyone proposes is using a ramdisk. I was somewhat concerned about using a ramdisk, though: for one, the procedure for creating it looked somewhat arcane, and one place talking about it mentioned that his 16MB of ramdisk was almost as big as his 20MB hard drive. I think of my old 20GB hard drive as ridiculously old. The other reason, though, is that ramdisk is scary: it’s a finite size. I’d love something like a 1GB ramdisk for /tmp, but I don’t even have a gig of RAM, much less a gig to allocate for file storage.

Enter tmpfs. In a nutshell, it’s like a ramdisk, but the size can be dynamic, and it can swap, which means I don’t have to worry about my 16MB tmpfs partition trying to store 17MB of data and blowing up. Creation was eerily easy:

# Make a directory to use as a mountpoint, and give it a misleading name
mkdir /tmp/ramdisk

# Mount it as type tmpfs
mount -t tmpfs tmpfs /tmp/ramdisk -o size=16M,mode=1777

In my.cnf, it’s as easy as changing tmpdir=/tmp/ to tmpdir=/tmp/ramdisk/.

And now, we let it run for a while and see how performance feels.

DB Stats

I’ve been playing with phpMyAdmin and doing a bit of optimization of it. A few stats:

  • Since I upgraded the kernel, MySQL has been up for a little under 3 days and 11 hours.
  • The DB server has moved 841 MiB of traffic. This is 10 MiB an hour.
  • It’s processed 131,048 queries. This is about 1,580 an hour.
  • 132,000 inserted rows.
  • 96K queries served out of MySQL’s query cache.
  • 1,393 temporary tables created on disk to handle queries. This seems like a bottleneck, although it is only a tiny percentage.
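The hourly figures above can be double-checked with a quick sketch. The 83 hours is an assumption (my reading of “a little under 3 days and 11 hours”), and the last line compares on-disk temporary tables against total queries only because that is the figure listed here.

```shell
#!/bin/sh
hours=83   # assumption: "a little under 3 days and 11 hours"

# Counters below are copied from the list above.
summary=$(awk -v h="$hours" 'BEGIN {
    printf "%.1f MiB/hour\n", 841 / h
    printf "%.0f queries/hour\n", 131048 / h
    printf "%.2f%% of queries used an on-disk temporary table\n", 100 * 1393 / 131048
}')
echo "$summary"
```

So the on-disk temporary tables work out to roughly one query in a hundred.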

I’ve just restarted MySQL to apply some configuration changes. (Actually, I could have changed them on the fly now that I think about it…) I tweaked the settings a bit: MySQL allows you to set limits on how much RAM it can use for various operations, and I tend to be very frugal. But I think I was shooting myself in the foot there: it was relying on disk a bit too much. It’s not like I’m running a load average of 25 and am moving gigs of traffic a day, where tuning is really vital, but it still bothers me that it’s not as efficient as it could be.

Focus

The other day my camera was in its “AF Hunt” mode, where it couldn’t seem to lock on focus. It’d focus past where it should be, and then turn around and focus back the other way, and just keep going. When you use the flash, it’s worse, because it’ll do a strobe flash to try to aid in focus, but it doesn’t help at all.

After a couple times of doing this, I finally got it focused, and just slid the switch on the lens from “AF” to “M,” disengaging automatic focus. It’d hold the focus that way, so it wouldn’t have to focus every time. (I was stationary, photographing something stationary, so there was no need to refocus every time.)

And then I put the camera down and bumped the lens, so the focus was off. So I just turned the focusing ring. And for the rest of the night, I left the camera in manual focus mode. I’ve found that I can do it just as quickly as the camera can focus the lens, and that’s when it works right: I don’t spin the focus back and forth ten times in a vain attempt to focus something.

Leaving it in manual focus also speeds up the shot: you press the shutter and it takes the picture instantly. There’s no waiting as it focuses.

So almost accidentally, I’ve become a fan of manual focus. Sometimes I’m lazy and want the camera to do it for me, but more often than not, I’m finding that I’d just as soon do it myself.

Geekery

One of my weird OCD concerns is that some of the scripts I host place a heavy load on the server. I want to make sure that, in busy times, they don’t weigh down things further. Here’s a neat little bit of PHP I wrote to simply have PHP abort the page load if the 1-minute load average is over 2.00:

// Check the 1-minute load average first
$fh = fopen('/proc/loadavg', 'r');
$load = (float) fread($fh, 4); // first field of the file, e.g. "0.42"
fclose($fh);

if ($load > 2.00) {
    die("Sorry, we're too busy.");
}

Rather than die(), you might throw a redirect to a cache or something else. And I should point out that, of course, running this code does take some CPU time… And that this script doesn’t always make sense: you’re basically forcing a failure before the server itself forces the failure. The time it makes sense is in the way I’m using it — when some unimportant, tangential project requires inordinate resources and you want to make sure it doesn’t slow the server down too excessively, at the expense of the more important projects (e.g., the blogs).
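The same idea carries over to shell, say for a cron job that shouldn’t start when the box is already busy. This is only a sketch: load_over is a made-up function name, and the optional second argument exists purely so the check can be pointed at a test file instead of /proc/loadavg.

```shell
#!/bin/sh
# load_over LIMIT [FILE]: succeed if the 1-minute load average exceeds LIMIT.
# (load_over is a hypothetical helper name.)
load_over() {
    limit=$1
    file=${2:-/proc/loadavg}
    # The first field of /proc/loadavg is the 1-minute average.
    awk -v limit="$limit" '
        NR == 1 { over = ($1 + 0 > limit + 0) }
        END     { exit !over }
    ' "$file"
}

# Usage:
#   if load_over 2; then echo "Sorry, we are too busy."; exit 1; fi
```

The same caveat applies as with the PHP version: it only makes sense for jobs you are happy to skip.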