Google for Geeks

One thing that consistently amazes me is how technology-related things are always at the top on Google, even when they probably shouldn’t.

For example, I just Googled “SPF,” thinking I should set up the Sender Policy Framework to help cut down on spam. Right after hitting “Search,” I suddenly realized that I’d get a lot of results about sunscreen. Nope! The first link is what I was looking for.

Similarly, “squid” has a first match of the proxy server, and the first page of a Google search for “Apache” won’t turn up a single thing about Native Americans. Searching for “word” doesn’t turn up anything about 90’s cliche phrases, nor about nouns, adverbs, and the like. “Excel” isn’t about self-improvement, but spreadsheets.

It usually works out quite nicely for me, but I always feel bad for the people trying to learn more about sunscreen or Native Americans, who end up arriving at boring technical sites.

Posting Jobs

There are many books written on how to write a good resume and an effective cover letter. Use lots of action words. Don’t say “I” too much…

There needs to be a book written on how to post job listings that actually discuss the responsibilities of the job. I would love to work with your energetic team at your fast-paced company, where I can plan resources and meet deadlines while demonstrating multidisciplinary knowledge and using my effective communication skills to interact with coworkers. But what, exactly, is the job?

Another pet peeve is the use of terms or acronyms that only have meaning within your company. I would probably love to work with the XYZ Department as part of my job, but you never explain what they do, and Google is no help.

Of course, it doesn’t make sense for me to be discriminating: I’m not going to refuse to apply to a place because their job posting was poorly written. But really, it’s surprising how many of them are bad.

Somebody set up us the bomb.

Google Translate is pretty neat. It has more languages than Babelfish.

It seems to be fairly accurate. But then again, it’s all Greek to me.

Edit: I fixed Unicode here! The key is to make sure you use UTF-8 for your connection. Although MySQL and PHP both support Unicode, the connection I opened to the database was defaulting to a Latin encoding, and so the text got butchered. This situation has been corrected.
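
For the curious, here is a minimal Python sketch of the kind of mangling described in the edit above. It is not the PHP/MySQL code from this site (which isn't shown here); it only assumes that UTF-8 bytes were being read as Latin-1 somewhere along the connection, and the variable names are made up for illustration.

```python
# Illustration only: what happens when UTF-8 bytes are read as Latin-1.
# Assumption: the misconfigured connection behaved roughly like this.

original = "Någon"

# The text leaves one side encoded as UTF-8...
utf8_bytes = original.encode("utf-8")

# ...and the other side wrongly treats every byte as one Latin-1 character.
garbled = utf8_bytes.decode("latin-1")

# Decoding with the encoding that was actually used restores the text.
restored = utf8_bytes.decode("utf-8")

print(garbled)   # the two-byte "å" shows up as two separate characters
print(restored)
```

In PHP and MySQL the usual fix is to set the connection character set explicitly, for example with `SET NAMES utf8` or mysqli's `set_charset`; which of these was used here isn't stated.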

Today’s Poll Shows…

I like this. It’s quite obvious that both candidates would want to go, but I think it says a lot about them both that they made the visit together.

While his comments back in January were surely partisan, I think what happened today demonstrates that they both agree on what Obama said back in his NH concession speech: that his campaign “will never use 9/11 as a way to scare up votes, because it is not a tactic to win an election. It is a challenge that should unite America and the world against the common threats of the 21st century…”

Even though they’re surely going to spend the next two months tearing each other apart, I think it’s endearing that they came together today.

Iraq

President Bush has announced that he’s going to start withdrawing (some) troops from Iraq.

It’s not the point I’m trying to make here, but I want to begin by noting that I think the linked article is very poorly written. My small amount of time writing for a college newspaper made me acutely aware of word choice, so when they throw quotes around “success” and call Bush’s announcement a “boast,” I can’t help but think that they’re amateurs. But enough about that… I’m writing about Bush’s announcement to start a phased withdrawal of troops.

McCain and Palin have been seizing on Iraq, painting Obama’s plan to slowly remove troops from Iraq as surrender. And now President Bush is basically doing what the Democrats have been suggesting for a long time. Sure, it’s to a lesser extent, but still…

In more poor reporting, they mention, “Bush chose the measured drawdown in Iraq because he did not want to jeopardize recent security gains” right where I’d expect an explanation of why Bush was pulling 8,000 troops out of Iraq. That sentence makes no sense unless the emphasis was on “the measured drawdown” (as opposed to “the sweeping and drastic drawdown”), and that contrast was never mentioned. President Bush explained that it was because of progress made in Iraq, and that we’re redeploying them to Afghanistan, which does make much more sense.

I just think it’s an awkward political move for the Republicans right now, as it’s pretty much doing what John McCain has maintained we shouldn’t do. But, of course, it’s welcome news in any event…

Left on Red

Has anyone else noticed that some intersections with traffic lights have a sign warning that taking a left on red is illegal? I’ve always wondered about those… Who, precisely, needed these signs? Did people just assume that, since right-on-red is usually okay, they could apply the same logic to left-on-red and merrily cut across multiple lanes of oncoming traffic?

More on Context

McCain has a new ad out, saying that Obama’s only accomplishment on education was a bill to teach sex ed in kindergarten. This sounds preposterous.

Well, it is. The law Obama supported (in the Illinois state senate) was to teach kids “age appropriate” information about avoiding sexual abuse and sexual predators. It’s not teaching kindergartners how to have sex, or even what it is—it was a bill to teach kindergartners that it was okay for them to stand up to people trying to grope them.

The Obama campaign responded quickly, saying, “It is shameful and downright perverse for the McCain campaign to use a bill that was written to protect young children from sexual predators as a recycled and discredited political attack against a father of two young girls…”

They got in a jab of their own, too: “Last week, John McCain told Time magazine he couldn’t define what honor was. Now we know why,” said Obama campaign spokesman Bill Burton. It’s referencing McCain’s bizarre interview with Time magazine, which Time describes as “prickly.” (That might be putting it lightly.)

Vote for who you will, but for the love of God, don’t fall for the “Obama wants to teach sex to kindergartners” line.

EDIT: FactCheck has it debunked, too. It seems the ad was even more full of fail than I assumed… Obama didn’t even cosponsor the bill (which didn’t even institute “sex ed” in kindergarten); he merely voted for it. Obama did sponsor/cosponsor bills to actually improve schools, but McCain’s ad doesn’t mention those. And the quotes bashing Obama? All three are ‘technically accurate’ quotes, but two are from articles that actually painted Obama more positively than McCain.

The third, the Chicago Tribune editorial? It apparently wasn’t a Chicago Tribune editorial, and was written by a conservative journalist. And what does the conservative columnist think of his piece being used in the ad? Much like Paris Hilton and Heart and Jackson Browne, he’s not terribly happy about it: “But really, I don’t mind at all when McCain cites something I wrote praising him… But the ad itself… insults our intelligence by expecting us to believe that Obama thinks kindergarteners should be taught how to use condoms before they’re taught to read…”

And he ends, as I will, by adding, “This commercial doesn’t tell us much about Obama. But it sure provides an education about McCain.”

Simple Spam Prevention

Alright, it’s a busy day moving furniture and whatnot, so I’ve just been popping in periodically when I need a break. But now it’s lunchtime, which means I can justify to myself taking some time for things here.

I’ve been running a mailserver on ttwagner.com / n1zyy.com for a few days now. (Quick note: those of you who have e-mail addresses for me there, don’t use them yet! More on this in a minute.)

I posted a really technical post the other day talking about it, but wanted to post something in English today, and I have updated stats, too. 🙂 In a strange way, I’m lucky to have a whole bunch of e-mail addresses that have never existed, yet get a lot of spam, because they allow me to test anti-spam measures with no risk.

I’m still tweaking things, and haven’t had time to do any detailed stats anyway, but I’d like to give a quick mention to three different techniques (at the mailserver level) that have been extremely effective for me.

  • Greylisting essentially has my mailserver throw a “temporary failure” at any connecting mailserver it hasn’t seen before. This exploits the fact that most spammers don’t use “real” mailservers, but instead use programs designed to connect directly and pump out spam. Real mailservers have been designed, for decades, so that these temporary problems are no big deal, and will just try again in 15 minutes. Most spam programs, though, don’t. My stats are probably skewed, but out of over 500 instances of servers being greylisted, one spam message has slipped through. That’s what, 99.8% accuracy? (The downside is that legitimate mail is delayed, since when it’s delivered the first time, we say, “Sorry, we’re busy. Try again later,” and speed of delivery is at the mercy of the sending mailserver… Policyd has a neat feature, though, allowing greylisting to be toggled per-user, so it needn’t be a hassle.)
  • Mailservers talk to each other using the SMTP protocol. (Err, the “P” stands for “Protocol,” but “The SMTP” or “The SMT Protocol” sound absurd.) The mailserver starting the connection begins by saying “HELO” and giving its hostname. This is actually fairly pointless, as you can see the IP/hostname that the connection is coming from at the TCP level, so you’d be a fool to trust the HELO string. But it happens that this is a useful thing to look at in spam. For one, a handful of spam programs apparently don’t even include the HELO string. They’re instantly classified as spam. (Though the server lets them send some headers for logging.) A few other spammers will try saying “HELO” with the name of the server they’re connecting to. Insta-spam. But where I catch the most people is by requiring an FQDN in the HELO string. (Which is slightly risky, as a legitimate mailserver could send a bad HELO string.) An “FQDN” is a fully-qualified domain name: think “complete name.” www.example.com is complete; you can type it and get there. “www” is not. Since the log file reset Sunday morning, my mailserver has refused 268 connections because of a bad HELO string. I try to sift through them and check; every single one is egregious spam.
  • Spamtraps are my new favorite thing. I have a handful of addresses that do not exist, and have never existed, yet get a lot of spam. Periodically I trawl the logfiles to extract a list of non-existent addresses that people tried to send mail to, and list the most common ones as spamtraps. Sending mail to a spamtrap gets your mailserver blacklisted. Note that I’m currently using a handful of addresses that were once good, but that haven’t received legitimate mail in years… So don’t go sending me mail at years-old addresses just because you read that the n1zyy.com mailserver is online… It may end badly. 😉 (Actually, this leads to an interesting observation: spamtraps could be exploited to get “good” mailservers blacklisted! If you know that xyz@example.com is a “spamtrap,” you could send it mail from, say, GMail and Hotmail, and get both blacklisted…)
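
To make the greylisting idea from the first bullet concrete, here is a rough Python sketch. It is not how Policyd or my server actually implements it. The class name, the five-minute delay, the reply strings, and the choice to key on the (client IP, sender, recipient) triplet are all assumptions for illustration.

```python
import time


class Greylist:
    """Toy greylist: defer the first attempt, accept a later retry."""

    def __init__(self, min_delay=300, now=time.time):
        self.min_delay = min_delay  # seconds a new triplet must wait (assumed value)
        self.now = now              # injectable clock, so the demo needn't sleep
        self.first_seen = {}        # triplet -> time of its first attempt

    def check(self, client_ip, sender, recipient):
        """Return an SMTP-style reply line for one delivery attempt."""
        triplet = (client_ip, sender.lower(), recipient.lower())
        current = self.now()
        # Remember when this triplet was first seen (unchanged on retries).
        first = self.first_seen.setdefault(triplet, current)
        if current - first < self.min_delay:
            # Too soon: ask the sender to come back later.
            return "450 Greylisted, please try again later"
        return "250 OK"


# Small demo with a fake clock.
fake_time = [0.0]
greylist = Greylist(min_delay=300, now=lambda: fake_time[0])

# First attempt from an unknown sender is deferred.
print(greylist.check("192.0.2.10", "someone@example.org", "me@example.com"))

# A real mailserver retries, here 15 minutes later, and gets through.
fake_time[0] += 15 * 60
print(greylist.check("192.0.2.10", "someone@example.org", "me@example.com"))
```

A spam program that never retries only ever sees the first reply, which is the whole trick.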

What really interests me, though, is that most of these filters are almost absurdly simple and easy to defeat. Spammers would need only 30 seconds to set up an “FQDN” to use. (I don’t even check that it actually matches your IP, only that it makes the least bit of sense.) Greylisting has been in use for a few years, so there’s no reason that spammers couldn’t have adapted with more intelligent software that will attempt to re-deliver spam which gets a temporary failure.
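
To show just how simple the HELO checks are, here is a Python sketch of the three tests described earlier: a missing HELO, a HELO that claims to be my own server, and a HELO that isn't an FQDN. This is an illustration, not my actual mailserver configuration. The function names, the result strings, and the exact definition of "makes the least bit of sense" are assumptions.

```python
import re

# One DNS label: letters, digits, hyphens, no leading or trailing hyphen.
_LABEL = re.compile(r"(?!-)[A-Za-z0-9-]{1,63}(?<!-)")


def looks_like_fqdn(name):
    """Syntax check only: at least two valid labels, last one not numeric."""
    labels = name.rstrip(".").split(".")
    if len(labels) < 2:
        return False                # "www" on its own is not complete
    if labels[-1].isdigit():
        return False                # rules out bare IP addresses
    return all(_LABEL.fullmatch(label) for label in labels)


def check_helo(helo, my_hostname):
    """Classify a HELO string; my_hostname is the receiving server's name."""
    if helo is None or not helo.strip():
        return "reject: missing HELO"
    helo = helo.strip()
    if helo.lower().rstrip(".") == my_hostname.lower().rstrip("."):
        return "reject: HELO claims to be this server"
    if not looks_like_fqdn(helo):
        return "reject: HELO is not an FQDN"
    return "ok"


for example in ["", "mail.example.com", "www", "www.example.com"]:
    print(repr(example), "->", check_helo(example, "mail.example.com"))
```

Like my real filter, this never looks anything up in DNS, so any made-up but well-formed name passes. It would also turn away address literals such as `[192.0.2.1]`, which is part of the risk of refusing legitimate mail mentioned above.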

And yet, as the stats show, I’ve had very good luck so far: way, way more than 99% of spam is rejected before it ever reaches the “complex” anti-spam filters that actually look at the message body.