p0f for spam detection?

I posted a while ago about p0f, a neat tool that passively looks at packet structure to guess the operating system of the machine talking to you from a given IP. (It seems like the tool hasn’t been updated in a while, which is a shame.)

I’ve been running it for a while, logging p0f strings for every incoming connection to port 25, i.e. every mailserver trying to connect to n1zyy.com or ttwagner.com. You can see the results on the 100 blacklisted IPs page here, which shows the IP, country, and p0f string for each connection. (I have it configured not to log ‘guesses,’ which explains why some are blank.)

I’ve noticed that the vast majority of entries report “Windows 2000 SP4, XP SP1+” as the operating system. (This is the OS of the machine that’s connecting to my mailserver, i.e. the sending mailserver’s operating system. It has nothing to do with people using an ordinary mail client on Windows 2000 or Windows XP.) This doesn’t surprise me much: most spam is sent from virus-infected desktop computers these days, and people running old versions of Windows are much more likely to get infected than someone who keeps up to date with security updates. (I will caution that p0f isn’t 100% accurate, especially as it hasn’t had a definition file released since pre-Vista.) The other aspect of this is that very few professionals would run a mail server on Windows XP, though there could be legacy systems running on Windows 2000. (Although, man, they’re behind the times!) So if we see an incoming connection with this OS string, we have a fairly good idea that it’s either someone’s desktop or a mailserver run by someone who never, ever upgrades anything.

I’ve posted before about how I’ve found that blacklists are usually very good at blocking spam, but they seem to get better hours or days after the spam has been sent. So what I like to look at in evaluating a DNSBL is how it fares for mail as it’s being delivered to me, not how it looks hours later in a test. The most recent connection, for example, is a “Windows 2000 SP4, XP SP1+” machine in China, but it only shows up in one fairly obscure blacklist. It would have gotten through if I had relied on DNSBLs alone. (Well, except that it e-mailed a spam trap and/or used my IP address as a HELO string, so it got auto-banned…)

What I’ve been interested in for a while is whipping up a Postfix policy plugin that would do scoring based on multiple factors. This would let me ensure that certain patterns increase a message’s spamminess score, but that no single factor could tip the scale on its own. I never liked the idea of banning foreign countries, even if most spam comes from China. (I suspect something is wrong with that chart, actually…) But for someone who doesn’t interact with anyone from China, it’s more probable that mail from China is spam. So we can score it a little higher. And based on what I’m seeing in the mail logs, we would get very good results if we did the same for connecting hosts that run desktop Windows versions.
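
To make the scoring idea concrete, here is a rough sketch in Python. It is not a finished plugin: the weights, the threshold, the set of "unusual" countries, and the function names are all made up for illustration, and the p0f string, country, and DNSBL hit count are assumed to be looked up elsewhere. The only part taken from Postfix itself is the policy delegation format (name=value lines ending with a blank line, answered with an action= line).

```python
# Hypothetical weights and threshold, chosen only for illustration.
# Every individual weight is below the threshold, so no single factor
# can cause a rejection by itself.
WEIGHTS = {
    "desktop_windows": 2.0,
    "unusual_country": 1.0,
    "dnsbl_listed": 3.0,
}
REJECT_THRESHOLD = 4.0

# p0f strings treated as "desktop Windows" (only the one seen in the logs).
DESKTOP_OS_MARKERS = ("Windows 2000 SP4, XP SP1+",)


def parse_request(lines):
    """Parse one Postfix policy request (name=value lines) into a dict.

    Parsing stops at the first empty line, which ends a request.
    """
    attrs = {}
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            break
        name, sep, value = line.partition("=")
        if sep:
            attrs[name] = value
    return attrs


def score(p0f_string, country, dnsbl_hits, unusual_countries=frozenset({"CN"})):
    """Add up the weights of every factor that applies to this connection."""
    total = 0.0
    if any(marker in p0f_string for marker in DESKTOP_OS_MARKERS):
        total += WEIGHTS["desktop_windows"]
    if country in unusual_countries:
        total += WEIGHTS["unusual_country"]
    if dnsbl_hits > 0:
        total += WEIGHTS["dnsbl_listed"]
    return total


def decide(total):
    """Turn a score into a policy action; DUNNO leaves the decision to later checks."""
    return "action=REJECT" if total >= REJECT_THRESHOLD else "action=DUNNO"
```

With these invented numbers, a desktop Windows host in China scores 3.0 and still gets through, while the same host listed on even one blacklist scores 6.0 and is rejected. A real plugin would read requests from Postfix in a loop and write each action followed by a blank line.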

Of course, I’m not yet ready to pronounce this a bulletproof idea. For one thing, I haven’t studied how p0f treats connections from legitimate Exchange servers. It doesn’t seem to show connections from Vista properly, for example, so I worry that such a block might inadvertently snare legitimate Windows server OSs. Plus, the only way I’m noticing this right now is by looking at mail that’s already getting caught as spam; mail that gets accepted doesn’t get listed. More directly, “Most spam is sent from Windows XP and Windows 2000” doesn’t necessarily mean, “Only spammers use Windows XP and Windows 2000 on their outgoing mail servers.”
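
The gap between those two statements is easy to see with a couple of ratios. The sketch below is only an illustration: the function names are mine, and the counts in the usage note are invented, since I don't have numbers for accepted mail yet.

```python
def os_share_of_spam(spam_with_os, spam_total):
    """Fraction of all spam connections that carried a given p0f string."""
    if spam_total == 0:
        return None
    return spam_with_os / spam_total


def spam_fraction_given_os(spam_with_os, ham_with_os):
    """Fraction of connections with a given p0f string that were spam.

    This needs counts from accepted (legitimate) mail too, which is exactly
    what a list built only from caught spam cannot provide.
    """
    total = spam_with_os + ham_with_os
    if total == 0:
        return None
    return spam_with_os / total
```

With made-up counts of 90 out of 100 spam connections showing the desktop Windows string, plus 30 legitimate connections showing it too, os_share_of_spam(90, 100) is 0.9 but spam_fraction_given_os(90, 30) is only 0.75. The first number is what the blacklist page suggests; the second is the one that matters for deciding how much weight the OS string deserves.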

Matt's Blog
