Xbox

One thing I find interesting about technology is that a seemingly trivial technical detail can make a huge difference to the end user.

I've been playing Grand Theft Auto a bit in my spare time, on the Xbox 360. After rearranging some things, I've run into a strange problem: when I power the console up, it loads older save files, not the newest. I know exactly what's wrong, but it's kind of like the "roger tone" to "FRS" leap: intuitively understanding what's wrong here borders on savantism.

When I rearranged things, I didn’t bother to plug the Xbox back into my switch. I think the cable it was using is out in another room right now. So the Xbox has no Ethernet connection. Are you seeing why my game loads really old saved data yet? Hint: the game doesn’t use the network in any way, shape, or form.

The Xbox, when it's connected to the Internet, will grab the correct time over the network. (I've wondered about this, actually: is it using NTP? Is it syncing to time.windows.com? I've been tempted to try packet sniffing, but that would basically require ARP poisoning, which I'm reluctant to do right now: both the Xbox and my laptop are essentially on the school's network, so there's no easy way to do it "safely.")
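
Out of curiosity, here's roughly what such a time lookup looks like on the wire. This is a bare-bones SNTP query in PHP, purely illustrative; whether the Xbox really talks to time.windows.com, or uses NTP at all, is just my speculation above.

```php
<?php
// Illustrative only: a minimal SNTP query against a public time server.
$sock = fsockopen('udp://time.windows.com', 123, $errno, $errstr, 3);
if (!$sock) {
    die("connect failed: $errstr\n");
}
stream_set_timeout($sock, 3);
fwrite($sock, "\x1B" . str_repeat("\0", 47)); // LI=0, VN=3, Mode=3 (client)
$resp = fread($sock, 48);
fclose($sock);
if (strlen($resp) === 48) {
    // Transmit timestamp: seconds since 1900, big-endian, at byte offset 40.
    $secs = unpack('N', substr($resp, 40, 4));
    echo date('r', $secs[1] - 2208988800), "\n"; // NTP epoch -> Unix epoch
}
```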

For some reason, though, when shut down, the Xbox never does the equivalent of Linux's "hwclock --systohc": the system clock, which was just synchronized and is quite accurate, is never written back to the hardware clock. So two weeks ago, when I booted my Xbox, it was March 14, 2008. I saved a game, shut down the console, and went to bed. Then I rearranged stuff, realized there was no reason for my Xbox to be online, and moved the cable to the common room.

So the Xbox, now booting with no Internet connection, thinks it's November 2006, since the software clock never got committed to hardware. And the game, not anticipating anything this bizarre, automatically loads the save with the newest timestamp. As far as it's concerned, the game I saved two weeks ago is a year and a half "newer" than the one I saved earlier today.
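
A toy illustration of the failure mode, with made-up file names and dates: if "load the newest save" just means "largest timestamp," a clock that travels backwards silently inverts the ordering.

```php
<?php
// Made-up saves: the timestamps are what the console stamped on them.
$saves = array(
    'save_two_weeks_ago.dat' => strtotime('2008-03-14'), // clock was synced
    'save_today.dat'         => strtotime('2006-11-01'), // clock reset itself
);
arsort($saves);                      // "newest" first
echo 'Loading: ', key($saves), "\n"; // => save_two_weeks_ago.dat
```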

So there you have it: whether I have an Ethernet cable hooked up or not changes the year on my Xbox, which causes it to load old saves. And it's all because the Xbox, for reasons I can't understand, never writes the time to the hardware clock. (To me, this is a bug, and one that a single line of code would fix.) And it shows something neat (or scary, depending on your perspective) about programming: trivial details, like whether you sync the hardware clock to the software clock at shutdown, manifest in entirely unexpected ways, like which save file my video game opens.

Captchas

For those not aware, "captcha" is the name for those little images of distorted text. The premise is that a human can figure out what they say but a computerized "bot" cannot, so they're used to keep people from scripting sign-ups for hundreds of accounts, or to prevent spammers from leaving comments. (Incidentally, there are some clever ways to defeat captchas. The most creative was a group that apparently set up a "free" porn site where users only had to complete a captcha to sign up. Except the captcha actually came from another site: they were essentially getting hundreds of porn-starved people to help them bulk-register accounts elsewhere!)

Anyway, besides causing major problems for the visually impaired, there's another problem with captchas… Consider the one I got the first time I tried to sign up for Hulu:

[captcha image]

BitTorrent is Cool

Having recently pulled down some updates via BitTorrent, I discovered some neat things about the protocol. Obviously, it's basically a peer-to-peer file-sharing tool, but it has some clever mechanisms that keep it working well. Files are split into many pieces, and each piece can be downloaded from anyone. (Each piece is also checksummed against a hash shipped in the .torrent file, which guards against people injecting garbage.)

The first neat thing is the concept of "choking" selfish peers. As I download pieces, my torrent client automatically starts sharing the completed ones. If my client sees that you're downloading pieces from me but not uploading the pieces you have, you get "choked": I stop sharing with you. (Periodically, an "optimistic unchoke" kicks in, giving you another chance.) This greatly increases your incentive to share: otherwise everyone would want to download only, and very few people would have the file.
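
Here's a rough sketch of the tit-for-tat idea, not how any real client implements it; the data structure and the four-slot count are illustrative.

```php
<?php
// $peers: peerId => array('sentToUs' => bytes they uploaded to us lately,
//                         'choked'   => bool)
function updateChokes(array $peers)
{
    // Unchoke the few peers who upload to us the fastest...
    uasort($peers, function ($a, $b) {
        return $b['sentToUs'] - $a['sentToUs'];
    });
    $slot = 0;
    foreach ($peers as $id => $p) {
        $peers[$id]['choked'] = ($slot++ >= 4); // keep ~4 upload slots open
    }
    // ...then give one random choked peer an "optimistic unchoke" so a
    // newcomer gets the chance to prove it will reciprocate.
    $chokedIds = array();
    foreach ($peers as $id => $p) {
        if ($p['choked']) {
            $chokedIds[] = $id;
        }
    }
    if ($chokedIds) {
        $peers[$chokedIds[array_rand($chokedIds)]]['choked'] = false;
    }
    return $peers;
}
```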

The obvious problem is that the file is useless if even one piece is missing. If you take a random 1 MB chunk out of the middle of Microsoft Office, the whole program will fail to work. (Not that I condone downloading MS Office via BitTorrent. After all, it's free from school!) So it's important that no piece become unavailable, and most clients implement a neat algorithm called "rarest first." The name sums it up pretty well: since clients advertise which pieces they have, mine can go grab the least-available piece first. After I finish that piece (and, by necessity, begin advertising it to peers), I go get the next-rarest one. Since the whole is useless without all the parts, it doesn't matter what order I acquire them in, so each client might as well fetch pieces in the order that raises availability the most.
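
A minimal sketch of rarest-first selection, again with invented data structures: count how many peers advertise each piece, ignore what you already have, and fetch the least-replicated piece.

```php
<?php
function pickRarestPiece(array $peerBitfields, array $havePieces)
{
    $counts = array(); // piece index => number of peers advertising it
    foreach ($peerBitfields as $pieces) {
        foreach ($pieces as $i) {
            $counts[$i] = isset($counts[$i]) ? $counts[$i] + 1 : 1;
        }
    }
    foreach ($havePieces as $i) {
        unset($counts[$i]); // already downloaded
    }
    if (!$counts) {
        return null; // nothing left to fetch
    }
    asort($counts);      // sort by availability, ascending
    return key($counts); // the rarest piece we still need
}

// Example: peer A has pieces 0 and 1, peer B has piece 0; we have nothing.
// Piece 1 is rarer, so it gets fetched first.
var_dump(pickRarestPiece(array('A' => array(0, 1), 'B' => array(0)), array()));
```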

Overall, the more I read about the inner workings, the more impressed I am.

Public Safety

For those of you who don't monitor police scanners regularly, let me share a fairly scary fact: their computer systems go down all the time.

It usually comes up when they try to run a license plate or a person, or to query NCIC or similar. The officer calls it in and waits a few minutes before the dispatcher calls back that the (remote) system is down. When you're monitoring multiple neighboring towns, you'll often notice that they all lose it at once: the backend servers are going down.

This drives me nuts. Usually it's not a huge deal, but imagine you're the officer, and the guy you've pulled over, but can't run through the system, actually has a warrant out for his arrest. For murdering a police officer. You'd have no clue, because the system is down. That's an extreme case, of course, but it's often said that traffic stops are among the most dangerous and unpredictable things an officer does: they never know whether they're approaching a nice old lady or someone with an outstanding warrant. A decent number of arrests come from pulling people over for traffic violations and then finding something more serious, like cocaine, guns, or a warrant.

My webserver sits in Texas on what’s basically an old desktop system. And it seems to have better uptime than these systems. As biased as I am in favor of my blogs, even I will admit that police databases are more important. Further, if my blogs were routinely unreachable, I’d be furious with my hosting company. Why is it tolerated when this happens?

Databases are fairly easy to replicate. Put a "cluster" of database nodes in a datacenter and you're protected against a hardware failure. Of course, the datacenter is still a single point of failure, so put another database node in a separate datacenter. That alone is probably all you'll ever need, but you can keep turning up more nodes in more locations as budget permits. (I suspect budget is the limiting reactant.)
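
On the client side, even something as dumb as "try each node until one answers" goes a long way. A hypothetical PHP sketch, with invented hostnames and credentials:

```php
<?php
// Try each database node in turn until one accepts a connection.
function connectAny(array $dsns, $user, $pass)
{
    foreach ($dsns as $dsn) {
        try {
            return new PDO($dsn, $user, $pass, array(
                PDO::ATTR_TIMEOUT => 2, // don't hang on a dead node
                PDO::ATTR_ERRMODE => PDO::ERRMODE_EXCEPTION,
            ));
        } catch (PDOException $e) {
            // Node down; fall through and try the next one.
        }
    }
    throw new RuntimeException('All database nodes unreachable');
}

$db = connectAny(array(
    'mysql:host=db1.example.gov;dbname=records', // primary datacenter
    'mysql:host=db2.example.gov;dbname=records', // second datacenter
), 'reader', 'secret');
```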

But you can take it one step further. Set up another database node, not in a lonely datacenter, but in a large dispatch facility. (The MA State Police apparently run a very large 911 answering center.) That node doesn't answer public queries, but it receives updates from the other database servers, and in the event of some catastrophic failure, remote dispatchers can call that facility and ask for something to be run.

I’m just really bothered that people seem to find it acceptable that, probably at least once a week, the system is unreachable for quite some time.

Digital Photo Recovery

I just discovered PhotoRec, a tool for recovering digital camera images.

For the non-geeks, a quick background… When you save a file, the data is written to various blocks on the disk, and an entry is made in the File Allocation Table pointing to where on the disk the file lives. When you delete a file, only that entry is removed from the File Allocation Table. That's really all that happens. The data is still there; there's just nothing pointing to it anymore. This has two implications. The first is that, with appropriate tools and a little luck, you can still retrieve a file you've deleted. (Whether this is comforting or distressing depends on your perspective…) The second is that, with no entry in the File Allocation Table, those blocks are seen as free space, so new files saved to the disk may well overwrite them. It's technically possible to recover data even after it's been partially overwritten, but at that point it's much more complex and much more luck is involved.
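
This also explains how a tool like PhotoRec can work at all: rather than trusting the File Allocation Table, it scans the raw disk for file signatures. Here's a toy "carver" in that spirit, not PhotoRec's actual algorithm; the real tool is far smarter (this naive version is fooled by end-of-image bytes that happen to occur inside compressed data, among other things).

```php
<?php
// Scan a raw dump of the card for JPEG start-of-image markers and copy
// everything up to the matching end-of-image marker.
$img    = file_get_contents('card.dd'); // hypothetical raw dump of the card
$offset = 0;
$found  = 0;
while (($start = strpos($img, "\xFF\xD8\xFF", $offset)) !== false) {
    $end = strpos($img, "\xFF\xD9", $start); // JPEG EOI marker
    if ($end === false) {
        break; // truncated or overwritten tail; nothing more to carve
    }
    file_put_contents(sprintf('recovered_%03d.jpg', $found++),
                      substr($img, $start, $end - $start + 2));
    $offset = $end + 2;
}
echo "$found candidate JPEGs carved\n";
```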

Last night we went out to dinner… We took lots of photos, but some were deleted. I figured PhotoRec might recover them, so I gave it a try.

The filesystem shows 163 photos. After running PhotoRec, I have 246. What's odd is which photos I got back: not the ones from last night, but shots scattered across various events, several from almost two months ago.

This does leave us with an important tip, though: if you delete an essential photo, stop. Everything you subsequently do to the disk increases the odds of something overwriting it. In a camera, just turn it off. Taking more photos seriously jeopardizes your ability to recover anything.

In my case, I didn't have anything really important… I just wondered how it would work, and I got strange results for the recovered files. (Which has me wondering a lot about how the camera writes its files out to disk, actually.) But it's good knowledge for the future. (By the way, PhotoRec runs not just under Linux but also, apparently, Windows and most any other OS you can imagine.)

Dork

Yesterday was, for all intents and purposes, a snow day. They closed the school down at 1. Of course, I had no classes anyway, just some work that could be done anywhere. But this was a snow day. You don’t do work. At least, not the work you’re supposed to.

Kyle, always curious about the hardware side of things, sent me a link to the RoomWizard downloads page after fishing the hardware specs out elsewhere. Two things interested me: one was that you could download a firmware image; the other was a PDF explaining how to use their API.

Wait… API? That means… it’d be trivial to write an interface to these things!

The problem is that the manual never mentions the actual address of the API, which is just accessed over HTTP and returns XML. They give a few examples; /rwconnector is used most often. But alas, /rwconnector on these units throws a 404.

Somewhat discouraged, I started poking around the firmware image. It's a .tar.gz, and it extracts into… a (fairly) normal Linux filesystem. Besides some juicy stuff that I hope admins are instructed to change (there are several privileged user accounts), I found some neat things. For one, it's based on SuSE, but a very trimmed-down version. And it's basically a fully functioning Linux machine, including an SMTP server, Apache Tomcat, etc.

But then I hit gold: a configuration file for Tomcat mentions a URL of /Connector. So I fired up a browser and tried it on one of the systems. Bingo!

So then I read a bit more of the API manual. It's actually very simple: you can retrieve, edit, and delete bookings. (Editing and booking don't let you do anything you can't already do via the web interface, by the way, lest anyone think this is a security flaw.) You get an XML document back with the results.

So then I had to figure out how to get PHP to parse XML. It turns out PHP has several ways to do it, including SimpleXML and DOM. I spent a while learning it, and by the end of the day I had a prototype that would fetch the next 24 hours of reservations and parse out the information. (Small tip: don't try to "escape" colons when dealing with XML. They denote a namespace. When you see rb:name, for example, the tag name is just name, in the rb namespace. Knowing this a little sooner would have saved me about half an hour of "This code is so simple! Why doesn't it work?!")
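
For the curious, here's the shape of what I ended up with. The element names are hypothetical (the real schema is in the API manual); the point is the namespace handling. rb:name is a "name" element in the "rb" namespace, so you ask SimpleXML for children by prefix instead of trying to escape the colon.

```php
<?php
// Fetch and parse a RoomWizard-style XML feed (URL and elements invented).
$doc = simplexml_load_file('http://roomwizard.example.edu/Connector');
if ($doc === false) {
    die("couldn't fetch or parse the XML\n");
}
foreach ($doc->children('rb', true)->reservation as $res) {
    echo (string) $res->name, ': ',
         (string) $res->start, ' to ', (string) $res->end, "\n";
}
```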

The next step is to insert all of this into an SQL database, and then write a nice viewer for it. And also to experiment with adding bookings, although that should just require changing a line of code.

I haven't actually written timing code, but it feels like 1-2 seconds for the XML to come back, which suggests the bottleneck is the unit's little database. Short-term, I want to write a little interface that parses all the data, caches it, and gives me something faster. Long-term, I want to see if I can get the library to adopt this as the booking mechanism: store bookings in a local database, and have a background process use the API to push reservations out to the respective RoomWizards so they continue to function normally. When people view the page, everything comes straight from the local database, so the whole "get the listings via API" step is no longer necessary. (Unless you want to rebuild the database after a disk failure!)
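
A rough cut of the short-term caching idea, with invented table, column, and host names: a cron job polls each unit's /Connector and mirrors the bookings into MySQL, and the viewer page then reads only from MySQL.

```php
<?php
$db  = new PDO('mysql:host=localhost;dbname=rooms', 'user', 'pass');
$ins = $db->prepare(
    'REPLACE INTO bookings (room, start_time, end_time, booked_by)
     VALUES (?, ?, ?, ?)' // assumes a unique key on (room, start_time)
);
$roomUrls = array(
    'library-204' => 'http://rw204.example.edu',
    'library-205' => 'http://rw205.example.edu',
);
foreach ($roomUrls as $room => $base) {
    $doc = @simplexml_load_file($base . '/Connector');
    if ($doc === false) {
        continue; // unreachable unit; keep whatever we cached last time
    }
    foreach ($doc->children('rb', true)->reservation as $res) {
        $ins->execute(array($room, (string) $res->start,
                            (string) $res->end, (string) $res->name));
    }
}
```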

Criticizing Web Apps

As long as I'm posting lengthy diatribes about how awful the library's room-booking web interface is, there are two more systems that drive me nuts.

We have a way of putting in work orders for maintenance. Last semester I tried to open one of our windows and it just fell out. This semester, we had three different light fixtures burn out in two days' time. So you go online and put in a work order. This is a great thing to have web-based. Except they picked an insane system that opens multiple browser windows, resizes your browser (?!), and uses copious JavaScript that requires you to double-click on links… And it only works in IE. Oh, and there are irritating things that could be fixed with one line of code. You log in with your student ID, which is eight digits inexplicably prefixed with an @ sign. So they have a big note explaining that you cannot use the at sign, you must enter only your eight-digit number. One line of code could just strip it out if it were included.
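
For the record, here's the one line in question, assuming a hypothetical form field name: accept the ID with or without the prefix and just strip it off.

```php
<?php
// Strip a leading @ from the submitted student ID, if present.
$studentId = ltrim($_POST['student_id'], '@');
```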

Much like booking library rooms, submitting help tickets is a Programming 101 exercise. In fact, it's easier than the library interface, because you don't have to do time calculations. You have an employees table, a clients table, and a work table. Tasks get entered into work by the client, and the staff assigns an employee. When it's done, you set work.status to "Complete," a simple ENUM field. This is like 45 minutes of coding, although I'd probably spend more time prettying up the interface.
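
A back-of-the-envelope version of that schema, with invented names, just to show how little is involved:

```php
<?php
$db = new PDO('mysql:host=localhost;dbname=workorders', 'user', 'pass');
$db->exec('CREATE TABLE clients (
    id   INT AUTO_INCREMENT PRIMARY KEY,
    name VARCHAR(100) NOT NULL
)');
$db->exec('CREATE TABLE employees (
    id   INT AUTO_INCREMENT PRIMARY KEY,
    name VARCHAR(100) NOT NULL
)');
$db->exec("CREATE TABLE work (
    id          INT AUTO_INCREMENT PRIMARY KEY,
    client_id   INT NOT NULL,
    employee_id INT NULL,  -- assigned later by staff
    description TEXT NOT NULL,
    status      ENUM('Open', 'Assigned', 'Complete') NOT NULL DEFAULT 'Open'
)");
```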

Then there's the computer help desk, another web app. For one thing, all the links to it point to an http:// URL, but if you actually use them, it barfs up an error that you must view the site over a secure channel. Being a web dork, I just tack an "s" onto the end of "http" and life is golden. Someone who's not so good with computers, and who's already at wits' end with their machine, is probably going to break down and cry, because even the help desk webpage doesn't work for them.

This, too, only works in IE. In this case there wasn't copious bizarre crap (like requiring double-clicks on links), so I set Firefox to pretend it was IE. The page loads okay but looks terrible, with nothing lining up right. IE and, well, the rest of the world have differing views on how lots of things should be rendered, but requiring IE really isn't the best solution. Oh, and as an added bonus, they override your mouse cursor, preventing it from indicating links in any manner. Someone took time to write code that does nothing but decrease usability.

But worst of all, even if you use IE like they demand, clicking on any ticket to view it takes you to a random system with a long canonical hostname, which just throws "HTTP 400 – Bad Request."

So last night, I submitted a help desk ticket reporting that the help desk is broken. Because, frankly, it doesn't work. All of its internal links take you to the wrong server (or, seemingly, the right server under the wrong hostname), and that's assuming you got in at all by decoding the error about needing HTTPS rather than HTTP.

Most of these things are sold as turnkey devices, it seems. Maybe I should start a company making them. Apparently, no technical expertise is required to do so.

RoomWizard

Even though I go to a business school and am a management major, my real passion is working on websites.

We just built a new library here, for millions and millions of dollars. For booking rooms, we use a tool called RoomWizard, which gives us a web-based interface to reserve library rooms. This is a great idea. Unfortunately, it's so fraught with bugs that it borders on unusable.

The main "bug" is that it's so slow it's nearly unusable. I tried viewing the source, and it's got a HUGE block of JavaScript that's a pain to read. Most of the page is generated on the fly with JavaScript. There are times when that's the best way to do something. This is not one of them.

My current understanding (I may be wrong, since I'm still trying to make sense of this) is that each of the touch-screen units on the wall is a webserver, responsible for storing all of its own reservations. So when you view the main page, JavaScript sends your browser out to each of the 20+ rooms to request its status. The problem is that this takes forever, probably at least 15 seconds. By the time the page has finished drawing, it's about time for the 60-second refresh to kick in.

I spent a bit of time viewing headers. The main page runs on ASP.net, but each individual room controller (probably something like a 300 MHz embedded chip?) runs Apache Tomcat. Someone did a quick port scan and found that the devices have a lot of open ports: FTP, SSH, telnet (!), HTTP, and port 6000, which nmap guessed was X11. So I have a pretty good feeling these things run embedded Linux.

Another problem is that there are always one or two devices that, for whatever reason, are unreachable. So you get errors for those rooms.

Booking conference rooms is a Web Programming 101 exercise: a basic introduction to SQL databases plus a little interface. You could run this on an old 1 GHz PC with 128 MB of RAM and have pages load in fractions of a second, especially if you really knew how to configure a webserver. (Turn on APC and MySQL query caching, in this case, and you're golden.) I cannot fathom why they thought it was a good idea to have one page open connections to 25 different little wall-mounted touchscreens. It places a big load on what have got to be underpowered little units, and it's a nightmare any way you look at it. I really see no benefit to what they're doing.

Furthermore, this breaks off-campus connections, since you can’t connect to these units remotely.

The fix seems obvious: convert the wall-mounted RoomWizards from embedded webservers into little web-browser clients, and have them just pull their room's data down from the main server.

With a traditional, single database, it would also be easy to write a little search tool: "I need a room on Friday from 3:00 to 5:00." That's a fairly simple SQL query. It is not a simple question to ask 25 wall-mounted touchscreens.
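
For example, against a hypothetical bookings table, the whole search is one query: a room is free if no reservation overlaps the requested window.

```php
<?php
$db   = new PDO('mysql:host=localhost;dbname=rooms', 'user', 'pass');
$stmt = $db->prepare('
    SELECT r.room
    FROM rooms r
    WHERE NOT EXISTS (
        SELECT 1
        FROM bookings b
        WHERE b.room = r.room
          AND b.start_time < :want_end
          AND b.end_time   > :want_start
    )');
$stmt->execute(array(
    ':want_start' => '2008-04-04 15:00:00', // Friday 3:00
    ':want_end'   => '2008-04-04 17:00:00', // Friday 5:00
));
foreach ($stmt as $row) {
    echo $row['room'], " is free\n";
}
```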

I’m tempted to write a little PHP script to go out, retrieve the data, and cache it. Essentially a hacked-together proxy…