DNS Dork

The real geeks in the room already know what the root zone file is, but for those of you with lives… DNS (Domain Name System) is the service that transforms names (blogs.n1zyy.com) into IPs (72.36.178.234). DNS is hierarchical: as an analogy, think of there being a folder called “.com,” with entries for things like “amazon” and “n1zyy” (for amazon.com and n1zyy.com, two sites of very comparable importance). Within the amazon ‘folder’ is a “www,” and within “n1zyy” is a “blogs,” for example. A domain name is really ‘backwards,’ then: if it were a folder on your hard drive, it would be something like C:\.com\n1zyy\blogs.
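To make the ‘backwards’ point concrete, here is a minimal Python sketch that flips a name into folder order. The function name and the use of “/” as a separator are my own choices for illustration, not anything DNS itself defines.

```python
def domain_to_path(name: str) -> str:
    """Reverse the labels of a domain name so the most general part comes first."""
    # Drop a trailing root dot ("n1zyy.com.") and ignore empty labels.
    labels = [label for label in name.strip(".").split(".") if label]
    return "/".join(reversed(labels))


if __name__ == "__main__":
    print(domain_to_path("blogs.n1zyy.com"))
```

Reading the result left to right now matches the order in which the lookups happen: top-level domain first, then the domain, then the host.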

Of course, this is all spread out amongst many servers across the world. When you go to connect to blogs.n1zyy.com, you first need to find out how to query the .com nameservers. The root servers are the ones that give you this answer: they contain a mapping of what nameservers are responsible for each top-level domain (TLD), like .com, .org, and .uk.

So you get your answer for what nameservers process .com requests, and go to one of them, asking what nameserver is responsible for n1zyy.com. You get your answer and ask that nameserver who’s responsible for blogs.n1zyy.com, and finally get the IP your computer needs to connect to. And, for good measure, it probably gets cached, so that the next time you visit the site, you don’t have to go through the overhead of doing all those lookups again. (Of course, this all happens in the blink of an eye, behind the scenes.)
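The walk described above can be sketched as a toy Python resolver. Nothing here touches the network: the server names and the delegation tables are invented for the example, and only the final address is the one quoted earlier. Real resolvers also deal with TTLs, multiple nameservers, and failures, all of which this skips.

```python
# Toy delegation data. Every server name below is made up for illustration.
ROOT = {"com": "tld-server.example"}
SERVERS = {
    "tld-server.example": {"n1zyy.com": "ns.n1zyy.example"},
    "ns.n1zyy.example": {"blogs.n1zyy.com": "72.36.178.234"},
}

cache = {}


def resolve(name):
    """Walk root -> TLD server -> domain server, then cache the answer.

    Returns (address, number_of_lookups_performed).
    """
    if name in cache:
        return cache[name], 0  # answered from cache, no lookups needed

    labels = name.split(".")
    tld = labels[-1]
    domain = ".".join(labels[-2:])

    tld_server = ROOT[tld]                       # lookup 1: ask the root
    domain_server = SERVERS[tld_server][domain]  # lookup 2: ask the TLD server
    address = SERVERS[domain_server][name]       # lookup 3: ask the domain's server

    cache[name] = address
    return address, 3
```

Calling resolve("blogs.n1zyy.com") twice shows the point of the cache: the first call does three lookups, the second does none.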

Anyway! The root zone file is the file that the root servers have, which spells out which nameservers handle which top-level domains.

Yours truly found the root zone file (it’s no big secret) and wrote a page displaying its contents, and a flag denoting the country of each of the nameservers. The one thing I don’t do is map each of the top-level domains to their respective country, since, in many cases, I don’t have the foggiest clue.
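For the curious, pulling the nameservers out of a zone file takes only a few lines. This is a rough Python sketch, not the code behind my page. It assumes each record sits on one line with all five fields (owner, TTL, class, type, target) present, and the sample lines are illustrative rather than copied from the real file.

```python
from collections import defaultdict

# Illustrative zone-file-style lines; hostnames and the address are placeholders.
SAMPLE_ZONE = """\
; comment lines start with a semicolon
com.    172800  IN  NS  a.gtld-servers.net.
com.    172800  IN  NS  b.gtld-servers.net.
fj.     172800  IN  NS  ns1.example.edu.
a.gtld-servers.net.  172800  IN  A  192.0.2.1
"""


def nameservers_by_tld(zone_text):
    """Collect the NS targets listed for each owner name."""
    result = defaultdict(list)
    for line in zone_text.splitlines():
        line = line.split(";", 1)[0].strip()  # drop comments and whitespace
        fields = line.split()
        # Keep only complete "owner TTL IN NS target" records.
        if len(fields) == 5 and fields[2].upper() == "IN" and fields[3].upper() == "NS":
            tld = fields[0].rstrip(".").lower()
            result[tld].append(fields[4].rstrip(".").lower())
    return dict(result)


if __name__ == "__main__":
    for tld, servers in nameservers_by_tld(SAMPLE_ZONE).items():
        print(tld, servers)
```

Mapping each nameserver to a country flag would then be a matter of resolving the hostname and running the address through an IP-to-country lookup.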

What’s interesting to note is that a lot of the data is just downright bizarre. Cuba has six nameservers for .cu. One is in Cuba, one in the Netherlands, and four are in the US. Fiji (.fj) has its first two nameservers… at berkeley.edu. American universities hosting foreign countries’ nameservers, however bizarre, isn’t new. .co (Colombia) has its first nameservers in Colombia (at a university there), but also has NYU and Columbia University (I think they did that just for the humor of Columbia hosting Colombia).

In other news, it turns out that there’s a list of country-to-ccTLD (Country-Code Top Level Domain) mappings. I’m going to work on incorporating this data… Maybe I can even pair it up with my IPGeo page with IP allocations per country…

Inexcusable

Culled from recent news, here are some things that have occurred that I can find absolutely no excuse for having happened:

  • Hackers infiltrated computer systems, turning off power to several (foreign) cities. I guess it makes sense that the power grid would now be controlled by computers, but it’s sheer idiocy to have such a system, in any way, connected to the Internet. (And one has to suspect it was, in some manner, an inside job: I can’t imagine there’s a spiffy web GUI with a “Turn off power to Washington, DC” button, but rather some inscrutable interface.)
  • This is actually old news, but it was dug up recently: Mike Huckabee’s son was arrested for trying to bring a gun on an airplane. I’ll buy that it probably wasn’t his intention to hijack the plane, but how you “accidentally” carry a gun into an airport escapes me. Most of us are paranoid about whether our tiny bottle of shampoo is pushing the envelope and whether it’ll result in a cavity search. And yet people keep waltzing in with guns. Furthermore, anyone who doesn’t know where their guns are shouldn’t be allowed to carry them in the first place. (Despite what some have said, this doesn’t change my opinion of Huckabee himself… His statements like, “And that’s what we need to do — to amend the Constitution so it’s in God’s standards…” are what influence my views of him.)
  • Another case of a laptop with private data on more than half a million people going missing.

Get a (Virtual) Life

Amid wrestling with getting Xen working (its kernel doesn’t play nicely with my video drivers… oh how I hate closed-source drivers), I downloaded VMware Player. It’s free.

I first downloaded a VMware image of Mailserver by Allard Consulting. Quick review: I’ve never used it in a ‘real’ environment to send or receive e-mail (and I screwed up VMware’s networking, making things worse), but it seems extremely impressive. The one thing I have realized is that my much-raved-about spamd is very irritating if you try to telnet to port 25 to ‘test’ the mailserver. If I had a colocated server hosting multiple VPSs (cough), I think I’d buy the ‘real deal’ from them and use this as my mailserver.

But I think I’m going to get entirely distracted with virtual machines tonight. I’m running the latest and greatest version of Ubuntu, 7.10, code-named “Gutsy Gibbon.” But 8.04, code-named “Hardy Heron,” is in early testing, and you can grab an image of it. (You can also run it on your desktop; it’s in no way ‘proprietary,’ but a lot of us aren’t hardcore enough to want to run bleeding-edge alpha code as our main OS.)

I’ve mentioned before that I was somewhat interested in the $300 PCs that Walmart was selling. They came with Linux, apparently something Google partnered with them on, dubbing the desktop environment “gOS.” (The machine also draws insanely low power.) Lo and behold, it’s out there as a VMware image. (I was also able to play around with the One Laptop per Child (OLPC) image in VMware.)

Oh, and Solaris anyone?

Geolocation

The concept of matching an IP to a country is known as IP geolocation, often just “IPGeo” or “GeoIP.” There are lots of reasons for using IP geolocation, ranging from the mundane (identifying countries in your webserver logfiles) to the questionable (banning countries from your server to cut down on spam) to the neat (doing it at firewall/router level and redirecting a user to the closest data center).

Most of the work is just done on a country level. You take an IP (72.36.178.234, my server) and look it up in a database, and get “UNITED STATES” as an answer. There do exist databases on finer levels, down to the city, but they’re expensive and often wrong. (I keep getting ads to find hot singles in Mashpee, more than 100 miles away and in a different state… Or maybe it’s Mattapan. Whatever the case, they’re not even close.)

It turns out that you can download a free database of IP-country mappings. It’s not infallible, but they say it’s 98% accurate. The database won’t do you any good as it comes, though: it’s a compressed CSV (comma-separated values) file.

In the comments section here, there’s a snippet of PHP code to take the CSV and convert it to a huge series of SQL inserts, which you input into a database… (Hint: for whatever reason, his preg_match is imperfect and leaves a few instances of the word “error” in the middle of the file. It’s probably a bad idea, but I just commented out the “echo error” line.) I end up with a 5.7MB SQL query. You can also just download the thing directly here (warning: 5.7 MB SQL file). Note that, per the license terms, I disclose in the comments that it’s a derivative work of their CSV file.
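If you’d rather not rely on that snippet, the conversion is simple enough to sketch yourself. The Python below is not the PHP from the comments. It assumes (and this is only an assumption about the file layout) that each data row has five fields in the order start, end, 2-letter code, 3-letter code, country name, and that comment lines begin with “#”. Rows that don’t fit are skipped rather than printed as “error.”

```python
import csv
import io


def sql_quote(value):
    """Quote a string for SQL by doubling any single quotes."""
    return "'" + value.replace("'", "''") + "'"


def csv_to_inserts(csv_text, table="ip2c"):
    """Turn CSV rows into INSERT statements, skipping malformed rows."""
    data_lines = (line for line in io.StringIO(csv_text) if not line.startswith("#"))
    statements = []
    for row in csv.reader(data_lines):
        if len(row) != 5:
            continue  # blank line or unexpected layout
        start, end, a2, a3, country = (field.strip() for field in row)
        if not (start.isdigit() and end.isdigit()):
            continue  # range bounds must be plain integers
        statements.append(
            f"INSERT INTO {table} (start, end, a2, a3, country) "
            f"VALUES ({start}, {end}, {sql_quote(a2)}, {sql_quote(a3)}, {sql_quote(country)});"
        )
    return statements
```

Check the column order against the header comments of the file you actually download before trusting the output.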

The other important catch is that IPs are stored as long integers, not ‘normal’ IPs. You’ll presumably want to use PHP + MySQL to get the country associated with an IP, so I’ll provide pseudocode in a minute. PHP provides an ip2long() function, but it only takes you halfway and leaves you with sign problems. (Argh!) It’s an easy fix, though, and you want something like the following:

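// ip2long() can return a negative number on 32-bit PHP; "%u" turns it into the unsigned value.
$long = sprintf("%u", ip2long($ip));
// Find the row whose [start, end] range contains the address.
$query = "SELECT a2,a3,country FROM ip2c WHERE start <= $long AND end >= $long";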

You then, of course, run $query and parse through it… You get 2- and 3-letter country codes, as well as the full country name. I use it, with good results, in seeing what country comment spam is coming from. (Most of it comes from the US.)
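Here is the same idea end to end as a Python sketch, in case PHP isn’t your thing. It swaps in SQLite (in memory) for MySQL so it runs with no setup, and the single row it inserts is a made-up range that happens to cover my server’s IP, not real data from the database. The table and column names follow the query above; the function names are mine.

```python
import ipaddress
import sqlite3


def ip_to_long(ip):
    """Return a dotted-quad IPv4 address as an unsigned 32-bit integer."""
    return int(ipaddress.IPv4Address(ip))


def demo_connection():
    """Build an in-memory table shaped like ip2c, with one made-up row."""
    conn = sqlite3.connect(":memory:")
    # "start" and "end" are quoted because END is an SQL keyword.
    conn.execute(
        'CREATE TABLE ip2c ("start" INTEGER, "end" INTEGER, a2 TEXT, a3 TEXT, country TEXT)'
    )
    conn.execute(
        "INSERT INTO ip2c VALUES (1210318848, 1210384383, 'US', 'USA', 'UNITED STATES')"
    )
    return conn


def lookup_country(conn, ip):
    """Return (a2, a3, country) for the range containing ip, or None."""
    value = ip_to_long(ip)
    return conn.execute(
        'SELECT a2, a3, country FROM ip2c WHERE "start" <= ? AND "end" >= ?',
        (value, value),
    ).fetchone()


if __name__ == "__main__":
    print(lookup_country(demo_connection(), "72.36.178.234"))
```

Python integers don’t have the sign problem, so there is no equivalent of the sprintf("%u") step here. The query also uses placeholders instead of pasting the number into the string, which is a good habit even when the value is known to be numeric.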

A MySQL query isn’t the proper way to do this: there exist binary files with the same data that result in faster lookups. But this is the simplest way to start doing IP geolocation in ten minutes’ time, and, with the query cache enabled, there’s not a ton of overhead.
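To give a feel for why a purpose-built lookup can be faster, here is a small Python sketch of a binary search over sorted ranges held in memory. It is only an illustration of the general technique, not the format any particular binary database uses, and the two ranges (including the placeholder “ZZ” code) are made up.

```python
import bisect

# Made-up sample ranges, sorted by start: (start, end, country code).
RANGES = [
    (167772160, 184549375, "ZZ"),    # 10.0.0.0 - 10.255.255.255, placeholder code
    (1210318848, 1210384383, "US"),  # 72.36.0.0 - 72.36.255.255
]
STARTS = [start for start, _end, _code in RANGES]


def find_country(ip_long):
    """Binary-search for the range containing ip_long; None if there is no match."""
    # Index of the last range whose start is <= ip_long.
    i = bisect.bisect_right(STARTS, ip_long) - 1
    if i >= 0 and RANGES[i][0] <= ip_long <= RANGES[i][1]:
        return RANGES[i][2]
    return None
```

With the ranges sorted and non-overlapping, each lookup takes a number of steps proportional to the logarithm of the table size, with no SQL parsing or network round trip.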

I’m tempted to write some scripts to allow people to ‘browse’ the database, either looking up an IP, or to view it by country.

Update: Weird Silence has a binary implementation of this same database that’s supposedly much faster. The main page is here, the PHP one is here, and the C one is (t)here. (I’m wondering if it makes sense to write a PHP script to call the C version, and what the performance implications would be?)

Update 2: Get your country flags here.

Amazon S3

I really didn’t pay it that much attention, or think about its full potential, at the time it was released. But Amazon’s Simple Storage Service (hence the “S3”) is really pretty neat. In a nutshell, it’s file hosting on Amazon’s proven network infrastructure. (When have you ever seen Amazon offline?) They provide HTTP and BitTorrent access to files.

Their charges do add up — it might cost a few hundred dollars a month to move a terabyte of data and store 80GB of content. But then again, the reliability (and scalability!) is probably much greater than what I can handle, and it’s apparently much cheaper than it would be to host it with a ‘real’ CDN service.
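The arithmetic behind that estimate is just usage times rate. In the Python sketch below, the per-gigabyte rates are placeholder values I picked for illustration, not Amazon’s price list, so plug in the current numbers before drawing any conclusions.

```python
def monthly_cost(transfer_gb, stored_gb, transfer_rate, storage_rate):
    """Monthly bill: data transferred plus data stored, each at a per-GB rate."""
    return transfer_gb * transfer_rate + stored_gb * storage_rate


if __name__ == "__main__":
    # Hypothetical rates in dollars per GB, for illustration only.
    TRANSFER_RATE = 0.18
    STORAGE_RATE = 0.15
    # Roughly a terabyte moved and 80GB stored, as in the example above.
    print(monthly_cost(1000, 80, TRANSFER_RATE, STORAGE_RATE))
```

At rates like these, transfer dwarfs storage, which is why a busy mirror gets expensive even when the files themselves are small.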

Sadly, I can’t think of a good use for this service. I suppose the average person really doesn’t need to hire a company to provide mirrors of their files for download. (It would make an awesome mirror for Linux/BSD distributions, but I think the typical mirror is someone with a lot of spare bandwidth and an extra server, not someone paying hundreds a month to mirror files for other people… I wonder if there’s a market for a ‘premium’ mirror service? I doubt it, since the existing ones seem to work fine?)

Right Down through the Wire

It’s time! I’m going to go grab some lunch, but then I’m going out to cast my vote, run a couple errands, and then spend the rest of the day on Get Out The Vote activities. When the polls close at 8, I’ll breathe a sigh of relief that I can sit down, but I think my nerves will be shot, too, as I go somewhere with my fellow supporters to watch the results come in.

New Hampshire residents, don’t forget to vote!

Fundraising

For whatever reason, we’ve been getting a lot of calls asking us to donate money to various causes all of a sudden. My mom did some research and unearthed some interesting information. Most of the calls come from “paid fundraising” companies. They take a percentage of what you donate–usually around 40%, it seems. We had the same person call us today on behalf of two separate charities. Both from the same company.

Should you find yourself in the same position, don’t fall for the irritating, “Can the {starving children, disabled veterans, cute kittens, abused children} count on you for support?” line. Respond by asking where they’re calling from, if it’s a paid fundraiser, and how much they get. If you’re feeling charitable when they call, thank them, and tell them you’ll make a donation directly to the charity.

You could make an argument that it’s simple economics, and that there’s even “good” being done–most charities don’t cold-call people, so they may be bringing in incremental donations. But, in my mind, it’s extremely sleazy to not fully disclose your own financial interests when taking donations.

Polls

There was a whole round of new polls yesterday. Notice anything different? Polls are notoriously inaccurate, but Obama, just a week ago 10 points behind Hillary, is suddenly on top. As is pointed out on the site, we can’t rely too heavily on polls. But if a candidate is trailing pretty far in the polls and, in a week’s time, ends up as the front-runner, it’s a promising sign.

As an aside, they don’t show Richardson in the polls, but I’d be very interested to see how he’s done in the past week. He did great in last night’s debate: if I were undecided, I might well have latched onto him.

Busted

What’s remarkable about this election is that it seems that a lot of people are booing attack ads. It seems like I’m far from the only one who much prefers candidates to talk about how they can work together, not to take perpetual jabs at each other. Not only does it not move us forward, but it’s frankly irritating.

In tonight’s debate, Hillary seemed to be in attack overdrive mode. After one particularly pointed remark, John Edwards made a comment about how, before she finished third in Iowa, she didn’t seem to be so focused on the negative politics. Bravo, John.

Anyone who read my (admittedly lengthy and sometimes meandering) commentary on the 100 Club dinner last night–or anyone who went there–remembers one thing that seemed odd: Hillary fans were assigned to tables right by the stage, and right in front of the cameras. The Obama tables were cast into a corner, perpendicular to all the cameras, at a distance. I was somewhat peeved by this, but didn’t think too much of it.

I can’t believe I’m linking to Fox, but it turns out that a Fox reporter picked up on this, with surprisingly good insight. (For brevity, feel free to scroll about a third of the way down and start with the sentence beginning, “Never was that on display more clearly than at the 100 Club Dinner here Friday night.”)

I think I speak for almost everyone, not just fellow Obama supporters, when I say that this type of sneaky campaigning isn’t welcome here. When the Republicans tried phone-jamming our get-out-the-vote (GOTV) efforts years ago, we sent them to jail. We don’t like people who play dirty in New Hampshire, and any politician who thinks they can come into our state and pull the wool over our eyes is in for a surprise. Except it’s really no surprise, but rather, common sense: we like an honest, clean fight in which the best candidate wins, and the voters will speak on Tuesday.