Bans for Fun & Profit

The way I use this server gives me a luxury that bigger sites don’t have: my visitors come from a select range, and I don’t have to worry much about blocking people erroneously. Therefore I can be quite aggressive in blocking IPs. /etc/hosts.deny is my new favorite file.

When I moderate a comment here, I have a few choices: I can approve it, delete it, or mark it as spam. I never got what marking it as spam did… Apparently it doesn’t do much beyond setting a ‘spam’ bit. (I’d hoped it trained Bayesian filters or something, but no such luck.) What it does do is make it super-simple to construct an SQL query that pulls out all the IPs that have posted spam. Add one more condition and you get just the IPs that had posts flagged as spam this month. Then you drop them in /etc/hosts.deny.
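
The post doesn’t show the query or name the blog software, so here is only a rough sketch. It assumes a stock WordPress-style schema (a wp_comments table with comment_author_IP, comment_approved and comment_date columns, and spam stored as comment_approved = 'spam'). The helper name spam_ip_query is made up for illustration.

```shell
# Sketch only: table and column names assume a stock WordPress schema,
# which the post does not confirm. spam_ip_query is a hypothetical helper.
spam_ip_query() {
  cat <<'SQL'
SELECT DISTINCT comment_author_IP
FROM wp_comments
WHERE comment_approved = 'spam'
  AND comment_date >= DATE_FORMAT(NOW(), '%Y-%m-01');
SQL
}
```

Piping this into the mysql client, something like spam_ip_query | mysql -N -B yourdb (where yourdb is a placeholder), should print one IP per line with no header row. The last condition is the “this month” part; drop it to get every IP ever flagged.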

But then I was watching the system log file, and noticed lots of spam mail coming in. I’m not running much of a mail server, so most of the recipient addresses bounce, especially since the spammers are mailing addresses that have never existed.

This is good news, though, for the IP-banhappy out there. Here’s my latest concoction:

grep NOQUEUE /var/log/messages | awk '{print $10}' | \
sed 's/\[/ /g' | sed 's/]/ /g' | awk '{print "ALL", $2}' | \
sort | uniq -c | sort -n | tail

In a nutshell: we look for “NOQUEUE” in the log file, pull out the 10th column (which holds the IP), split off the junk so that only the numeric IP is left, and sort the result. uniq weeds out the dupes, and the -c flag has it count the number of times each line occurs. Then we sort again, so that the list is ordered by the number of bounces. Sort defaults to ascending order, so the top of the list is all the people who’ve only e-mailed one invalid address, and the ‘juicy’ part is at the end. So we pipe it to tail, which, by default, shows the last ten lines. The output looks like:

      5 ALL
      5 ALL
      6 ALL
      7 ALL
      8 ALL
      8 ALL
     10 ALL
     15 ALL
     17 ALL
     17 ALL
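
For anyone who wants to try the counting step on something other than the live log, here is a minimal sketch of the same idea as a function that reads log text on stdin. It assumes, as the description above does, that the client shows up in the 10th field in a host[ip]: shape; check a line of your own log before trusting that. The name count_bounces is made up.

```shell
# Sketch: count NOQUEUE lines per client IP, smallest count first.
# Assumes field 10 looks like host[ip]: (verify against your own log).
count_bounces() {
  grep NOQUEUE |
    awk '{print $10}' |            # keep only the host[ip]: field
    sed 's/.*\[\(.*\)].*/\1/' |    # keep only what sits between [ and ]
    sort | uniq -c |               # group identical IPs and count them
    sort -n                        # order numerically by that count
}
```

Something like count_bounces < /var/log/messages | tail would then show the ten busiest offenders, each as a count followed by the bare IP.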

You could use a little more magic to automatically add the second and third columns to /etc/hosts.deny, but I prefer to do it this way… The reason is that sometimes (not in this example) you’ll see posts from a range of similar IPs. It’s more of a judgment call where you draw the line, so I like to give it the once-over.
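
If you did want a bit of that magic, here is a hedged sketch. It reads lines shaped like the pipeline’s output (count, ALL, IP) and prints candidate entries only for IPs at or above a threshold. The hosts_access(5) man page gives the rule format as a daemon list, a colon, then a client list, so the sketch emits ALL: followed by the IP. The function name deny_lines and the default threshold of 5 are arbitrary choices for illustration.

```shell
# Hypothetical helper: turn "count ALL ip" lines into hosts.deny candidates.
# The first argument is the minimum count; 5 is an arbitrary default.
deny_lines() {
  min=${1:-5}
  # Skip lines without an IP in the third column; compare counts numerically.
  awk -v min="$min" '$3 != "" && $1 + 0 >= min + 0 { print "ALL: " $3 }'
}
```

The output still deserves the once-over before anything is appended to /etc/hosts.deny. Per the same man page, a client pattern ending in a dot (for example 192.0.2.) matches every address that starts with those fields, which is one way to cover a range by hand.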

One thought on “Bans for Fun & Profit”

  1. Point of clarification: I’m not sure that NOQUEUE *necessarily* indicates spam, or even that the mail was rejected. The better way would be to search for a longer string. But being a pragmatist, 100% of the NOQUEUE entries in my log are from non-existent addresses.
