BitTorrent

A few tips, in the hopes that it’ll help someone else. (Aside: don’t download illegal stuff with BitTorrent. Do download the many awesome, legal things on BitTorrent, such as Ubuntu torrents.)

  • You can encrypt your BitTorrent traffic, which is meant to circumvent ISPs that feel like being pains and blocking traffic. However, “Enabled” isn’t the value you want; you want “Forced.” In uTorrent, this is under Preferences -> BitTorrent.
  • If you don’t upload at all, other nodes will “choke” you by refusing to talk to you. It doesn’t seem like it has to be entirely equitable, though; I’ve capped my upload at a pretty small number, but am downloading around 100 kB/sec (800 kbps).
  • You’ll have a port number for incoming connections. If that port isn’t reachable from the outside (such as if your firewall has a “default-deny” policy), things will work, but they’ll be unbearably slow. As an aside, if you’re behind an OpenBSD firewall (using pf), have a local IP of 192.168.1.79, and use the randomly-selected port 26689 as your local port for BitTorrent, the redirect rule looks like rdr on $ext_if proto tcp from any to any port 26689 -> 192.168.1.79 port 26689. Then reload the ruleset with pfctl -f /etc/pf.conf; loading a new ruleset replaces the old one, so the separate flush (pfctl -F rules) isn’t actually necessary. A slightly fuller sketch is below.
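
For reference, here’s a minimal sketch of the relevant /etc/pf.conf lines, using the example IP and port above; the interface name is made up, and the pass rule is the piece that matters under a default-deny policy:

    ext_if = "fxp0"            # external interface (yours will differ)
    bt_host = "192.168.1.79"   # the machine running the BitTorrent client
    bt_port = "26689"          # the client's listening port

    # redirect incoming BitTorrent connections to the internal machine
    rdr on $ext_if proto tcp from any to any port $bt_port -> $bt_host port $bt_port

    # with a default-deny policy, also let the redirected traffic in
    pass in on $ext_if proto tcp from any to $bt_host port $bt_port

    # then reload everything:  pfctl -f /etc/pf.conf

You can verify it took with pfctl -s nat, which lists the active translation rules.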

With these three principles in mind, my (legitimate) download went from 0.8 kB/sec to 145 kB/sec.

Huh, a neat tip… If you grab a torrent from one site, but other sites are tracking identical content, you can add those additional trackers to your existing download, which will give you more peers!

Oh, another tip: don’t arbitrarily set a download limit! My downloads wouldn’t break 145 kB/sec or so, until I realized that I’d set a limit of 150 kB/sec. I removed the limit and am suddenly at 400 kB/sec. (Incidentally, our available bandwidth has suddenly plunged to nothing…)

One final note: Peer Guardian is good, but don’t run it unnecessarily, since it blocks a lot of legitimate traffic. That oddly includes Steam’s servers (for games like Counter-Strike and TF2), apparently because Steam uses Limelight’s CDN and Peer Guardian has decided Limelight is bad.

Public Safety

For those of you who don’t monitor police scanners regularly, I’d like to share a fairly scary fact: their computer systems go down all the time.

Where it usually comes up is when they try to run a license plate or a person, or to query NCIC or similar. The officer calls it in and waits a few minutes, before the dispatcher calls back that the (remote) system is down. When you’re monitoring multiple neighboring towns, you’ll often notice that they all lose it at once. The backend servers are going down.

This drives me nuts. It’s usually not a huge deal, but now just imagine that you’re the police officer, and the guy you pull over, but can’t run through the system, actually has a warrant out for his arrest. For murdering a police officer. But you have no clue, because the system is down. Of course this is extreme, but it’s always been said that traffic stops are among the most dangerous and unpredictable things an officer does. They never know whether they’ve stopped a nice old lady or someone with a warrant out for their arrest. A decent number of arrests come from pulling people over for traffic violations and then finding something more serious, like cocaine, guns, or an outstanding warrant.

My webserver sits in Texas on what’s basically an old desktop system. And it seems to have better uptime than these systems. As biased as I am in favor of my blogs, even I will admit that police databases are more important. Further, if my blogs were routinely unreachable, I’d be furious with my hosting company. Why is it tolerated when this happens?

Databases are fairly easy to replicate. Put a “cluster” of database nodes in a datacenter and you’re protected against a hardware failure. Of course, the datacenter is still a single point of failure, so put another database node in a separate datacenter. That alone is probably all you’ll ever need, but you can keep adding database nodes in different locations as budget permits. (I suspect budget is the limiting reactant.)
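
I obviously have no idea what these systems actually run on, but just to show how little work replication is with commodity software, here’s roughly what adding a read-only replica looks like in MySQL; every hostname and credential below is invented:

    -- On the primary (which needs log-bin and a unique server-id in my.cnf):
    CREATE USER 'repl'@'replica.example.gov' IDENTIFIED BY 'secret';
    GRANT REPLICATION SLAVE ON *.* TO 'repl'@'replica.example.gov';

    -- On the replica, using the file/position shown by SHOW MASTER STATUS:
    CHANGE MASTER TO
        MASTER_HOST = 'primary.example.gov',
        MASTER_USER = 'repl',
        MASTER_PASSWORD = 'secret',
        MASTER_LOG_FILE = 'mysql-bin.000001',
        MASTER_LOG_POS = 98;
    START SLAVE;

    -- Then check that it's running and caught up:
    SHOW SLAVE STATUS\G

Point a second replica at the same primary and you’ve got your node in the dispatch center, too.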

But you can take it one step further. Set up another database node, not in a lonely datacenter, but in a large dispatch facility. (The MA State Police apparently run a very large 911 answering center.) So they get a database node there that doesn’t answer public queries, but that receives updates from the other database servers. And, in the event of some sort of catastrophic failure, remote dispatchers can call up and request that something be run.

I’m just really bothered that people seem to find it acceptable that, probably at least once a week, the system is unreachable for quite some time.

This Is My Hobby

I want to start a “meta ISP.”

When you sign up with your ISP, you’re paying for transit. They carry your data from one network to the other.

But now let’s say that I’m a mediocre residential ISP. I buy connectivity from a couple different upstream providers, and use BGP to make sure your data takes the fastest route. This is what most people do. It works.

Let’s further say that you run an extremely popular site, maybe one of the top 100 sites out there. You have a mediocre IT team. You have enormous bandwidth, coming in from three different carriers. You, too, use BGP to make sure that your outgoing traffic takes the quickest route.

So everything works. Traffic flows between the two networks. What’s the problem?

Well, it turns out that you, Mr. Big Site, have some of your core routers in a major data center out this way. And I, Mr. Big ISP, also have a few core routers in that building. This is really pretty common–there’s a (very aptly-named) network effect with transit. When several big guys move into a building, all of a sudden, more people want to be there too. So you get sites like One Wilshire, a thirty-story building in LA full of networking equipment. They’re very secretive about their tenants, but “word on the street” is that every network you’ve heard of, and plenty you haven’t, is in there. (When viewing that picture, by the way, it’s worth noting that these wires don’t go to some secretary’s PC. Each is probably carrying between 100 Mbps and 10 Gbps of traffic between various ISPs and major networks… Another interesting note about the photo: they supposedly keep an elaborate database and label each wire, so this huge rat’s nest is actually quite organized.)

Since we’re both huge companies, we’re each paying six figures a month for Internet transit. But when one of my customers views your site, the traffic goes through a few different ISPs, and across multiple states, before it arrives on your network. It’s asinine, but that’s how the networks work.

So we wise up to this. I call you up, and we run a Gigabit Ethernet line between our racks. And all of a sudden, life is peachy. Data traveling over that line–my customers viewing your site–is free. My bandwidth bills drop, and speeds improve, too. This is the world of peering. And, strangely, this mutually beneficial arrangement is fairly rare.
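
Just to give a sense of how small the technical side is, a peering session is a few lines of router configuration; here’s a rough sketch in the style of Quagga’s bgpd, with every AS number and address made up:

    ! my side of the session; the peer runs the mirror image
    router bgp 64500
     ! announce my own prefixes to the peer
     network 198.51.100.0/24
     neighbor 203.0.113.2 remote-as 64501
     neighbor 203.0.113.2 description cross-connect-to-big-site

The hard parts are the cross-connect and the business agreement, not the configuration.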

I think there’s a market for a big middleman here. The last mile (that would be a good book title, if a telecom magnate wanted to write his memoirs) is difficult–running lines to consumers’ homes. Similarly, it’s hardly trivial to become a Tier 1 ISP, a sort of ‘core backbone’ of the Internet. But an intermediary broker? Easy enough to do.

So you’d get space in the major exchanges, and peer with popular sites. Google, Yahoo, MSN, Youtube, Facebook, eBay, Myspace, Amazon, Akamai, etc.

Dork

Yesterday was, for all intents and purposes, a snow day. They closed the school down at 1. Of course, I had no classes anyway, just some work that could be done anywhere. But this was a snow day. You don’t do work. At least, not the work you’re supposed to.

Kyle, always being curious about the hardware side of things, sent me a link to the RoomWizard downloads page after fishing out the hardware specs elsewhere. There were two things that interested me–one was that you could download a firmware image. The other was that they had a PDF of how to use their API.

Wait… API? That means… it’d be trivial to write an interface to these things!

The problem is that the manual never mentions the actual address of the API, which is just accessed over HTTP and returns XML. They give a few examples–/rwconnector is used most often. But alas, /rwconnector on these throws a 404.

Somewhat discouraged, I started poking around the firmware image. It’s a .tar.gz, and extracts to… a (fairly) normal Linux filesystem. Besides some juicy things that I hope admins are instructed to change (there are several privileged user accounts), I also found some neat stuff. For one, it’s based on SuSE, but a very trimmed-down version. And it’s basically a fully functioning Linux machine, including an SMTP server, Apache Tomcat, etc.

But then I hit gold. There’s a configuration file for Tomcat, which mentions a URL of /Connector. So I tried it on one of the systems. Bingo!

So then I read a bit more of the API manual. It’s actually very simple–you can retrieve, edit, and delete bookings. (Editing and deleting bookings doesn’t let you do anything you can’t do via the web interface, by the way, lest anyone think this is a security flaw.) You get an XML document back with the results.

So then I had to figure out how to get PHP to parse XML. It turns out that PHP actually has several ways to do it, including SimpleXML and DOM objects. I spent a while learning it and by the end of the day, I had a prototype working that would get reservations for the next 24 hours and parse out the information. (Small tip–don’t try to “escape” colons when dealing with XML. They denote a namespace. When you get rb:name, for example, the tag name is just name, in the rb namespace. Knowing this a little sooner would have saved me about half an hour of, “This code is so simple! Why doesn’t it work?!”)
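
To make the namespace point concrete, here’s a stripped-down sketch of the sort of thing I mean; the hostname and query string are made up, and only the /Connector path and the rb: prefix are real:

    <?php
    // Fetch one RoomWizard's reservation feed and look at the rb:-namespaced
    // elements. The hostname and query string here are placeholders.
    $url = 'http://roomwizard.example.edu/Connector?action=get_bookings';
    $xml = simplexml_load_file($url);
    if ($xml === false) {
        die("Couldn't fetch or parse the XML\n");
    }

    // rb:name is an element called "name" in the "rb" namespace, not an
    // element literally called "rb:name" -- so select children by namespace
    // prefix instead of trying to escape the colon.
    $rb = $xml->children('rb', true);
    echo (string) $rb->name, "\n";

    // Or walk everything in that namespace:
    foreach ($rb as $tag => $value) {
        echo "$tag => $value\n";
    }
    ?>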

The next step is to insert all of this into an SQL database, and then write a nice viewer for it. And also to experiment with adding bookings, although that should just require changing a line of code.

I haven’t actually written code to do timing, but it feels like it takes 1-2 seconds to get the XML data back, which suggests that the bottleneck is in the unit’s little database. Short-term, I want to write a little tool that will fetch all the data, parse it, and cache it, giving me a faster interface. Long-term, I want to see if I can get the library to adopt this and have it be the booking mechanism: store bookings in a local database, and have a background process use the API to push reservations out to the respective RoomWizards, so that they continue to function normally. But when people view the page, it would all come from the local database, meaning that the whole “get the listings via API” step is no longer necessary. (Unless you want to rebuild the database in case of a disk failure!)
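
The short-term caching script would be something like this sketch; the table layout, hostnames, and XML field names are all invented here, so treat it as the shape of the idea:

    <?php
    // Poll each RoomWizard, stash its bookings in a local MySQL table, and
    // let the viewer page read only from that table. Credentials, table
    // layout, and element names below are illustrative.
    $db = new PDO('mysql:host=localhost;dbname=rooms', 'user', 'pass');
    $db->exec('TRUNCATE TABLE bookings');   // crude, but fine for a cache
    $insert = $db->prepare(
        'INSERT INTO bookings (room, title, start_time, end_time)
         VALUES (?, ?, ?, ?)'
    );

    $rooms = array('room101.example.edu', 'room102.example.edu');
    foreach ($rooms as $room) {
        $xml = @simplexml_load_file("http://$room/Connector?action=get_bookings");
        if ($xml === false) {
            continue;   // one or two units are always unreachable; skip them
        }
        foreach ($xml->children('rb', true) as $booking) {
            $fields = $booking->children('rb', true);   // name, start, end, ...
            $insert->execute(array(
                $room,
                (string) $fields->name,
                (string) $fields->start,
                (string) $fields->end,
            ));
        }
    }
    ?>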

RoomWizard

Even though I go to a business school and am a management major, my real passion is working on websites.

We just built a new library here, for millions and millions of dollars. We use a tool called RoomWizard for booking rooms: it gives us a web-based interface for reserving library rooms. This is a great idea. Unfortunately, it’s so fraught with bugs that it borders on unusable.

The main “bug” is that it’s basically so slow that it’s unusable. I tried viewing the source, and it’s got a HUGE block of JavaScript that’s a pain to read. Most of the page is being generated on the fly with JavaScript. There are times when this is the best way to do something. This is not one of them.

My current understanding–I may be wrong, since I’m still trying to make sense of this–is that each of the touch-screen units on the wall is a webserver, responsible for storing all of its own reservations. So when you view the main page, the JavaScript has your browser going out to each of the 20+ rooms and requesting their status. The problem is that this takes forever, probably at least 15 seconds. By the time the page has finished drawing, it’s about time for the 60-second refresh to kick in.

I did a bit of poking at HTTP headers. The main page is running on ASP.net, but each individual room controller (probably something like a 300 MHz embedded chip?) is running Apache Tomcat. Someone did a quick port scan and found that the devices have a lot of open ports–ftp, ssh, telnet (!), HTTP, and port 6000, which nmap guessed was X11. So I have a pretty good feeling these things are running embedded Linux.

Another problem is that there are always one or two devices that, for whatever reason, are unreachable, so you get errors for those.

Booking conference rooms is a Web Programming 101 exercise: you get a basic introduction to SQL databases and write a little interface. You could run this on an old 1 GHz PC with 128 MB of RAM and have pages load in fractions of a second, especially if you really knew how to configure a webserver. (Turn on APC and MySQL query caching, in this case, and you’re golden.) I cannot fathom why they thought it was a good idea to have a page make connections to 25 different little wall-mounted touchscreens. This places a big load on what have got to be underpowered little units, and is just a nightmare any way you look at it. I really see no benefit to what they’re doing.

Furthermore, this breaks off-campus access, since you can’t connect to the individual units from outside the campus network.

The fix seems obvious: convert the wall-mounted RoomWizards from embedded webservers into little web-browser clients, and have them just pull their data down from the main server.

With a traditional, single database, it would also be easy to write a little search tool–“I need a room on Friday from 3:00 to 5:00.” This is a fairly simple SQL query. This is not a fairly simple question to ask 25 wall-mounted touchscreen things.
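
For example, with a single bookings table (schema invented here), the “free room on Friday afternoon” question is one query; a sketch:

    <?php
    // Which rooms are free Friday from 3:00 to 5:00? A room qualifies if no
    // booking overlaps the window. Table and column names are invented.
    $windowStart = '2008-02-15 15:00:00';
    $windowEnd   = '2008-02-15 17:00:00';

    $db = new PDO('mysql:host=localhost;dbname=rooms', 'user', 'pass');
    $stmt = $db->prepare(
        'SELECT r.name
           FROM rooms r
          WHERE NOT EXISTS (
                SELECT 1
                  FROM bookings b
                 WHERE b.room_id = r.id
                   AND b.start_time < ?
                   AND b.end_time > ?
          )'
    );
    // A booking overlaps if it starts before our window ends
    // and ends after our window starts.
    $stmt->execute(array($windowEnd, $windowStart));

    foreach ($stmt->fetchAll(PDO::FETCH_COLUMN) as $room) {
        echo "$room is free\n";
    }
    ?>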

I’m tempted to write a little PHP script to go out, retrieve the data, and cache it. Essentially a hacked-together proxy…

It’s a Game, Sam

Tonight we toured Waltham’s 911 center. They told us to take the elevator up, so all 14-ish of us climbed in. The doors shut behind us, and then… Nothing happened. At all. We started joking about how funny it would be if we had to call 911 to tell them we were stuck in an elevator… in their office. But as the time passed, the joking gave way to a fearful realization that nothing was happening.

A minute later the doors randomly opened and we decided it would be best to take the stairs.

They showed us their dispatch interface… It looks like it’s Java-based, although it didn’t have the stereotypical ugly Swing GUI. The interface was modeled a lot like a mail application: a “tree” on the left, and two panes on the right. The left tree had three categories: Unassigned calls, Active calls, and Closed calls. It’s kind of neat, though: they have a dedicated calltaker, and a couple dispatchers (who answer calls if the primary calltaker is busy). So as he talked to us about how the system worked, we’d watch stuff pop up in the Unassigned category, and a timer would start. With any luck, it’d get moved into the active category in a matter of seconds, denoting that units had been dispatched. They had about half a dozen actives at any given time. Some lasted a few minutes: a traffic stop would pop up and close a few minutes later. Others were much longer-lasting. A call to check the well-being came in as he was talking, so he clicked on it to show us how things worked.

It opened the details up in the main two panes. It showed the address, written directions, and other various stuff. Below that was a scroll list with all sorts of entries, essentially notes each person entered. A few lines from the call-taker: “[Name] hasn’t been seen since Friday” and stuff of that nature. Some of the notes are automatically added. One looks up gun permits at the address. This one was an apartment complex, and we saw the classic message, “Too many gun permits to list.” (But clicking on the address pops up a web browser page listing every single one.) Another note adds, “3 prior calls at this address” or something of the sort.

Below that was a list of every officer dispatched, color-coded to show their current status. We visited an ‘older’ call to show more, and it showed an ambulance and fire engine which had cleared, and a couple officers still on scene. We didn’t go into it, but the buttons suggest that the system will permit the dispatcher to automatically determine which units to dispatch. (There’s also a “Roster” menu item he showed, which lists every single officer, the sector they’re patrolling, and their current status.)

I’m also impressed at how advanced some of their other stuff is. It’s nothing new for every call to be logged, along with all radio traffic. But what is new, at least to me, is for it all to be stored on a computer with a little GUI. The other day I was at Campus Police researching police log entries, and the dispatcher took a (very low-priority) call. An officer was asking about some specific detail, so he just clicked a few things on the computer and played back the call. On the radio side of things, in addition to displaying unit IDs, everything gets logged to disk, too.

He also talked about the psychological aspect of the job, which was actually quite interesting. He had some training material which consisted of past calls (not sure whether they’re from the department or not?). In one, he plays back a female caller who’s screaming and wailing. You hear a passing allusion to a gun, and then get an insanely detailed description of a car, and then more screaming. Thirty seconds into the call, he paused it. “So what’s this call about?” We collectively shrugged our shoulders. He kept playing, and the dispatcher finally asks if someone is hurt. We get a no, and more information about the car. Two minutes go by, and we’re still not clear what’s going on. He stops the clip at that point, and talks about how one of the most important things they do is taking charge of the call.

Then he switches to another one. It starts off the same way–screaming. But the dispatcher here is much better. “Calm down, I need you to tell me what happened.” We get that someone was stabbed, and more screaming. “Just send the police! Send the police!” “Ma’am, they’re already on the way. Who is stabbed? Who else is there?” The victim is named (not that the name is what they needed right then, but I digress). The dispatcher prods a little more about the woman’s condition, and then adds, “Can you go check? Is anyone else there?” “Yes, her husband. He’s screaming.” “Are you going to be in danger if you go check on her?” “Yes. He has a knife!” It’s a neat example, because it really changes things. It starts off sounding like a simple medical call, and the caller utterly fails to mention the guy running around with a knife until the dispatcher prods her for enough details. The first officers arrived about 90 seconds after the call came in, and they showed up already knowing exactly what they were facing.

I’m left thinking that some of these skills could probably be applied elsewhere. All too often we rely on what other people say and do, when it’d really be better to take control of the situation. As a mundane example, this type of thing happened all the time at work, when people would come up to me with rambling stories and questions. Rather than directly answering their questions, you take control of the conversation. “Do you have a waiting list?” led me to ask about the size of the group. From there, I’d either tell them we had no list and make a mental note that we had a group coming in, or I’d tell them we did have a list, but the fact that I’d already asked about the size of their group somehow made them more receptive to me putting them on the waiting list, as opposed to just not coming in.

The whole dispatching thing is vaguely reminiscent of games, though. Rather than deciding where to place a teleporter and sentry gun in TF2, they’re deciding what police cars and fire trucks to send to a given location. (He described it as something vaguely like chess.) Rather than being a “shoot ’em up” game, it’s a strategy game.

Tweaking SQL

I was thinking last night about solid-state drives. In their current form, they’re really not that much faster in terms of throughput: a decent number are actually even slower than ATA disks if you measure them in MB/sec. Where they shine (100 times faster, at least) is seek time. So what they’re ideally suited for in a server environment right now is anything with lots of random reads, where you find yourself jumping all over the disk. For example, a setup with lots and lots of small files scattered across the disk.

Many database workloads would be similar. Something like the database for this blog sees a lot of nearly-sequential reads: you’re usually retrieving the most recent entries, so the reads tend to be fairly close together. But there are lots of ways to slice the data that don’t involve reading neighboring rows or walking the table in order. (And what really matters is how it’s stored on disk, not how it’s stored in MySQL, but I’m assuming they’re one and the same.) Say I view my “Computers” category. That’s going to read from all over the table, and a solid-state disk might give you a nifty boost there. So I think it’d be fun to buy a solid-state disk and use it in an SQL server. I wager you’d see a fairly notable boost in performance, especially in situations where you’re not just reading sequential rows.
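
Loosely, and with the blog’s table and column names invented for the sake of the example (and still assuming rows land on disk in roughly insertion order), the difference is between these two kinds of query:

    -- Mostly sequential: the newest rows, which sit near each other on disk.
    SELECT * FROM posts ORDER BY post_date DESC LIMIT 10;

    -- Scattered: rows matching one category, spread across the whole table,
    -- so the drive seeks all over the place.
    SELECT * FROM posts WHERE category = 'Computers' ORDER BY post_date DESC;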

But here’s the cool link of this post. I’m not sure exactly what goes on here in a technical sense, but they use solid-state drives, getting the instant seek time, but they also get incredible throughput: 1.5GB/sec is the slowest product they offer. I think there may be striping going on, but even then, with drives at 30MB/sec throughput, that’d be 50 drives. The lower-end ones look to just be machines with enormous RAM (16-128 GB), plus some provisions to make memory non-volatile. But they’ve got some bigger servers, which can handle multiple terabytes of storage on Flash, and still pull 2GB/sec of throughput, which they pretty clearly state isn’t counting stuff cached in RAM (which should be even faster).

I want one.

Do you have the time?

I’ve been running an NTP server on this host for quite some time now. But as of yesterday, I’m a member of the pool.ntp.org group. pool.ntp.org is a round-robin-ish DNS service where queries for pool.ntp.org are answered with IPs drawn from a huge list of volunteer time servers, balancing the load across a pool of about 1,500 NTP servers around the world. The official “entry” for this server is my IP (72.36.178.234), but ntpd is actually listening on all IPs right now, so using blogs.n1zyy.com or ttwagner.com will work.

I’m currently synced to Stratum 2 servers, but I think that, after I finish up some open tasks (“real work,” versus playing with time servers), I’m going to look at requesting permission to sync to Stratum 1 servers. Stratums, err, strata, are basically tiers. “Stratum 1” refers to a server directly connected to a reference clock, something like a GPS receiver (which obtains extremely accurate time: having the correct time is an important part of how GPS works, so the satellites broadcast time from atomic clocks) or a WWV receiver (time signals transmitted over HF radio). Stratum 2 servers get their time from Stratum 1 servers, and so on. Since I sync to a set of stratum 2 clocks, I’m a stratum 3 server. Moving up a stratum generally implies more accurate time, as there are fewer intermediaries to skew the results (although we’re talking milliseconds of difference). There aren’t an awful lot of stratum 2 servers, so syncing to a stratum 1 server would help to round out the stratum 2 list. (It would be fun to become a stratum 1 server, but as one stratum 2 operator says of his data center, “they’re not going to let me drill a hole in the ceiling to run an antenna [for the GPS] to the roof.”)

For those of you with UNIX systems, take advantage of this! You can sync to me directly (72.36.178.234), or indirectly (the pool.ntp.org cluster). (Windows can sync to an NTP server as well; its built-in time sync is just more bare-bones.)
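
If you want to do it, the client side is a few lines of /etc/ntp.conf; the driftfile path varies by distribution, but something like this:

    # where ntpd records the clock's measured drift (path varies by distro)
    driftfile /var/lib/ntp/ntp.drift

    # a few pool servers; "iburst" just speeds up the initial sync
    server 0.pool.ntp.org iburst
    server 1.pool.ntp.org iburst
    server 2.pool.ntp.org iburst

    # ...or point one entry directly at this server
    # server 72.36.178.234 iburst

Restart ntpd and check ntpq -p after a few minutes to see who you’re synced to.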

Web Design

I’ve redone ttwagner.com. It’s no longer a random integer between 0 and 255, but instead, a decent-looking site. I’ve integrated some of the cool things I’m hosting there as well. I came across a few interesting things I wanted to point out.

The world DNS page is incredibly resource-intensive, and, since its content doesn’t change, there’s no sense in “generating” it each time. So I used the command wget http://localhost/blah/index.php -O index.html to “download” the output and save it as index.html in the web directory. Voila, it serves the HTML file rather than executing the script.

But the HTML output was frankly hideous. The page was written as a “You know, I bet I could do…” type thing, written to fill some spare time (once upon a time, I had lots of it), so I’d given no attention to outputting ‘readable’ HTML. It was valid code and all; it just didn’t have linebreaks or anything of the sort, which made it a nightmare to read. But I really didn’t want to rewrite my script to clean up its output just so that I could download it again…

So I installed tidy (which sometimes goes by “htmltidy,” including in the name of the Gentoo package). The -m flag tells it to “modify” the file in place (as opposed to writing the result to standard output). The code looks much cleaner; it’s not indented, but I can live with that!
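
So the whole two-step dance boils down to:

    # generate a static copy of the page, then clean it up in place
    wget http://localhost/blah/index.php -O index.html
    tidy -m index.html

(tidy also has an -i flag to indent the output, if I ever decide I care.)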

I also found that mod_rewrite is useful in ways I hadn’t envisioned using it before. I developed everything in a subdirectory (/newmain), and then just used an htaccess override to make it “look” like the main page (at ttwagner.com/ ). This simplifies things greatly, since moving everything into the document root would have complicated my existing directory structure. (It’s imperfect: you “end up” in /newmain anyway, but my goal isn’t to “hide” that directory, just to make the main page not blank.)
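
I won’t swear this is verbatim what’s in my .htaccess, but the rule in question is roughly this:

    # in the document root's .htaccess: send requests for the bare front
    # page over to /newmain; everything else is left alone
    RewriteEngine On
    RewriteRule ^$ /newmain/ [R,L]

(The [R] is why you “end up” in /newmain: it’s an external redirect, so the browser’s address bar changes.)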

I’ve also found I Like Jack Daniel’s. (Potential future employers: note the missing “that” in that sentence, which changes the meaning completely!) The site is a brilliant compendium of useful information, focusing on, well, Apache, PHP, MySQL, and gzip, generally. The “world DNS” page was quite large, so I decided to start using gzip compression, and he lists a quick, simple, and surefire way to get it working. (The one downside, which seems to be more about how the compressed page gets buffered and sent than about compression itself, is that the browser gets nothing to draw until the whole transfer is complete. This has an interesting effect as you wait for the page to load: it just sits there not doing much of anything, and then, in an instant, displays the whole page.) It may be possible to flush the buffer more often, resulting in “progressive” page loading, but this would be complicated, introduce overhead, and, if done enough to be noticeable, also defeat the point of compression. (Extreme example: imagine taking a text file, splitting it into lots and lots of one-byte files, and then compressing each of them individually. Net compression: 0. Net overhead: massive!)
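
I don’t know exactly which recipe he recommends, but one common way to gzip a PHP page is a single output-buffering call at the very top of the script:

    <?php
    // Compress this script's output when the browser advertises gzip support;
    // ob_gzhandler checks the Accept-Encoding header itself and falls back to
    // sending plain text when it has to.
    ob_start('ob_gzhandler');
    ?>

(The same thing can be done globally with zlib.output_compression in php.ini, or at the server level with Apache’s mod_deflate.)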