Simple Spam Prevention

Alright, it’s a busy day moving furniture and whatnot, so I’ve just been popping in periodically when I need a break. But now it’s lunchtime, which means I can justify to myself taking some time for things here.

I’ve been running a mailserver on ttwagner.com / n1zyy.com for a few days now. (Quick note: those of you who have e-mail addresses for me there, don’t use them yet! More on this in a minute.)

I posted a really technical post the other day talking about it, but wanted to post something in English today, and I have updated stats, too. 🙂 In a strange way, I’m lucky to have a whole bunch of e-mail addresses that have never existed, yet get a lot of spam, because they allow me to test anti-spam measures with no risk.

I’m still tweaking things, and haven’t had time to do any detailed stats anyway, but I’d like to give a quick mention to three different techniques (at the mailserver level) that have been extremely effective for me.

Greylisting essentially has my mailserver throw a “temporary failure” at any connecting mailserver it hasn’t seen before. This exploits the fact that most spammers don’t use “real” mailservers, but instead use programs designed to connect directly and pump out spam. Real mailservers have been designed, for decades, so that these temporary problems are no big deal: they just queue the message and try again a bit later, often within 15 minutes or so. Most spam programs, though, don’t. My stats are probably skewed, but out of over 500 instances of servers being greylisted, one spam message has slipped through. That’s what, 99.8% accuracy? (The downside is that legitimate mail is delayed, since the first time it’s delivered, we say, “Sorry, we’re busy. Try again later,” and the speed of delivery is at the mercy of the sending mailserver… Policyd has a neat feature, though, allowing greylisting to be toggled per-user, so it needn’t be a hassle.)

Mailservers talk to each other using the SMTP protocol. (Err, the “P” stands for “Protocol,” but “The SMTP” or “The SMT Protocol” sound absurd.) The mailserver starting the connection begins by saying “HELO” and giving its hostname. This is actually fairly pointless, as you can already see the IP address the connection is coming from at the TCP level (and look up its hostname), so you’d be a fool to trust the HELO string. But it happens that this is a useful thing to look at in spam. For one, a handful of spam programs apparently don’t even include the HELO string. They’re instantly classified as spam. (Though the server lets them send some headers for logging.) A few other spammers will try saying “HELO” with the name of the server they’re connecting to. Insta-spam. But where I catch the most people is by requiring an FQDN in the HELO string. (Which is slightly risky, as a legitimate but misconfigured mailserver could send a bad HELO string.) An “FQDN” is a fully-qualified domain name: think “complete name.” www.example.com is complete; you can type it and get there. “www” is not. Since the log file reset Sunday morning, my mailserver has refused 268 connections because of a bad HELO string. I’ve been sifting through them to check; so far, every single one has been egregious spam.

Spamtraps are my new favorite thing. I have a handful of addresses that do not exist, and have never existed, yet get a lot of spam. Periodically I trawl the logfiles to extract a list of non-existent addresses that people tried to send mail to, and list the most common ones as spamtraps. Sending mail to a spamtrap gets your mailserver blacklisted. Note that I’m currently also using a handful of addresses that were once good, but that haven’t received legitimate mail in years… So don’t go sending me mail at years-old addresses just because you read that the n1zyy.com mailserver is online… It may end badly. 😉 (Actually, this leads to an interesting observation: spamtraps could be exploited to get “good” mailservers blacklisted! If you know that xyz@example.com is a “spamtrap,” you could send it mail from, say, Gmail and Hotmail, and get both blacklisted…)
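
For the curious, the greylisting idea from the first item above boils down to very little logic. What follows is a rough Python sketch, not what Policyd actually does: the Greylist class, the (client IP, sender, recipient) key and the 5-minute minimum delay are all assumptions made up for illustration.

```python
import time


class Greylist:
    """Toy greylist: defer the first attempt, accept a sufficiently late retry."""

    def __init__(self, min_delay=300, now=time.time):
        self.min_delay = min_delay   # seconds a sender must wait before retrying (assumed value)
        self.now = now               # injectable clock, so the logic can be tested
        self.first_seen = {}         # (ip, sender, recipient) -> time of first attempt

    def check(self, client_ip, sender, recipient):
        """Return the SMTP-style reply this delivery attempt would get."""
        key = (client_ip, sender.lower(), recipient.lower())
        t = self.now()
        # Remember when this combination first showed up.
        first = self.first_seen.setdefault(key, t)
        if t - first >= self.min_delay:
            return "250 OK"
        # 4xx means "temporary failure": a real mailserver queues and retries.
        return "450 4.7.1 Greylisted, please try again later"
```

A real mailserver gets the 450 on its first attempt, comes back later, and is let through. A fire-and-forget spam program never comes back, so its message is never accepted.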

What really interests me, though, is that most of these filters are almost absurdly simple and easy to defeat. Spammers would need only 30 seconds to set up an “FQDN” to use. (I don’t even check that it actually matches your IP, only that it makes the least bit of sense.) Greylisting has been in use for a few years, so there’s no reason that spammers couldn’t have adapted with more intelligent software that attempts to re-deliver spam that gets a temporary failure.
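
To show just how simple the HELO checks described earlier are, here is a rough Python sketch of the three tests: missing HELO, HELO claiming to be the receiving server, and HELO that isn't an FQDN. It is not the actual mailserver configuration. The helo_verdict function, its return strings and the rule that an all-numeric last label (a bare IP address) doesn't count as a name are assumptions for illustration, and like the real thing it makes no attempt to check the name against the connecting IP.

```python
import re

# One DNS label: letters, digits and hyphens, 1-63 chars, no leading/trailing hyphen.
_LABEL = re.compile(r"^(?!-)[A-Za-z0-9-]{1,63}(?<!-)$")


def helo_verdict(helo, my_hostname):
    """Classify a HELO string using only cheap syntactic checks."""
    name = (helo or "").strip().rstrip(".").lower()
    if not name:
        return "reject: missing HELO"
    if name == my_hostname.strip().rstrip(".").lower():
        return "reject: HELO claims to be this server"
    labels = name.split(".")
    # An FQDN needs at least two well-formed labels, e.g. "mail.example.com".
    if len(labels) < 2 or not all(_LABEL.match(label) for label in labels):
        return "reject: HELO is not a fully-qualified domain name"
    # Assumption: a bare IP address such as 192.0.2.1 is not a hostname.
    if labels[-1].isdigit():
        return "reject: HELO is not a fully-qualified domain name"
    return "ok"
```

For example, helo_verdict("www", "mail.example.com") is rejected as a non-FQDN, while helo_verdict("mx1.example.org", "mail.example.com") passes, which is exactly why a spammer who bothers to make up a plausible name sails through.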

And yet, as the stats show, I’ve had very good luck so far: way, way more than 99% of spam is rejected before it ever reaches the “complex” anti-spam filters that actually look at the message body.

Matt's Blog

My virtual /home on the web
