A Rant about Ads

I’ve posted in the past about my ambivalence about blocking ads online. On one hand, I installed AdBlock Plus because internet advertising has gotten really obnoxious, with sites seemingly competing to see who can make the most irritating ads and who can place the most ads in places that obscure content. The Internet got a lot cleaner when I started blocking ads.

On the other hand, I recognize the importance of advertising on the Internet. Free sites with ads have an implicit contract with the viewer: view the ads on the page, maybe click on them if you’re interested, and we’ll let you use our site for free. An individual can block ads, and it’s somewhat like stealing cable — on your own, it’s insignificant and almost a victimless crime. But in the long run, it’s happening a lot and it endangers those sites’ abilities to support themselves with ads. (I work for an ad-supported site, so I’m biased.)

I went a few months with AdBlock Plus disabled entirely. It went pretty smoothly, but most of the sites I visit frequently are like my own company: we don’t accept invasive ads. Ads that make sound, cover content, or (gasp) open popups are never allowed. (We also reserve the right to reject “sleazy” ads.) Recently, I’ve found a few sites with popups. (I’m not sure how “Block popups” got disabled in Firefox. I just fixed that.)

It may be possible to offer fine-grained control of blocking ads, but I really can’t be bothered. So here’s my new system: I browse with AdBlock Plus disabled. But if I go to a site that I know has invasive ads (popups, noisy ads, content-obscuring ads, etc.), I turn AdBlock Plus on, which blocks all ads. I also have AdBlock Plus set not to run on the sites I frequent, so that even if it’s on, I still view their ads.

A quick note, by the way – “I don’t click ads, so they’re not losing any revenue anyway” is incorrect. A lot of sites get paid for impressions (“CPM”), not clicks (“CPC”), so whether the ads are clicked on or not is irrelevant. If you block ads, you are depriving the site of revenue. Hence my compromise: I’ll view your ads in return for the free content you provide, unless your ads are obnoxious.
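To put rough numbers on the CPM-versus-CPC point, here’s a tiny sketch of the two payment models. The rates and traffic figures are invented purely for illustration; they aren’t from any real site.

```python
def cpm_revenue(impressions: int, cpm_rate: float) -> float:
    """Revenue when the site is paid per thousand ad impressions (CPM)."""
    return impressions / 1000 * cpm_rate


def cpc_revenue(clicks: int, cpc_rate: float) -> float:
    """Revenue when the site is paid per click (CPC)."""
    return clicks * cpc_rate


# Hypothetical reader: 500 page views in a month, never clicks an ad.
views, clicks = 500, 0

# Under CPM the views alone earn the site money; under CPC they earn nothing.
print(cpm_revenue(views, cpm_rate=2.00))
print(cpc_revenue(clicks, cpc_rate=0.25))
```

Block the ads on a CPM site and those 500 impressions simply never happen, clicks or no clicks.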

Quick ‘n Dirty Spam Rejection with policyd-weight

I’ve blogged about DNSBLs before. They’re DNS-based blacklists of spammer IPs. (To see if 1.2.3.4 was listed in blacklist.example.com, you’d do a DNS lookup for 4.3.2.1.blacklist.example.com. If you get an IP, usually 127.0.0.2, back, it’s listed. If you get an NXDOMAIN, it’s not listed.) Some lists are abysmal, but I found some that are very accurate. I never loved DNSBLs, mainly because you cede way too much control to DNSBL operators — if they list an IP, your mailserver will refuse mail from them. Sometimes people running DNSBLs are vindictive, and other times they’re clueless, so it’s not at all unheard of for legitimate IPs to wind up in blacklists.
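The lookup described in that parenthetical is easy to sketch. This is a minimal illustration, not how any particular mailserver does it: the helper names are mine, blacklist.example.com is the placeholder zone from above, and is_listed() lazily treats any resolution failure as “not listed” rather than checking specifically for NXDOMAIN.

```python
import socket


def dnsbl_query_name(ip: str, zone: str) -> str:
    """Reverse the IPv4 octets and append the DNSBL zone."""
    octets = ip.split(".")
    if len(octets) != 4:
        raise ValueError(f"not an IPv4 address: {ip!r}")
    return ".".join(reversed(octets)) + "." + zone


def is_listed(ip: str, zone: str) -> bool:
    """True if the zone answers with an address for this IP.

    Simplification: any lookup failure (not only NXDOMAIN) counts as unlisted.
    """
    try:
        socket.gethostbyname(dnsbl_query_name(ip, zone))
        return True
    except socket.gaierror:
        return False


print(dnsbl_query_name("1.2.3.4", "blacklist.example.com"))
# 4.3.2.1.blacklist.example.com
```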

I set up policyd-weight on my new mailserver a little bit ago. The reason I’m so crazy about policyd-weight is that it queries multiple DNSBLs and computes a score based on the results. I have it configured so that someone needs to be listed in multiple blacklists before anything happens, so one erroneous listing won’t do any harm.

Over time, I’ve been logging IPs of people emailing my spamtraps, and looking them up in various DNSBLs to see which ones listed them. (Whenever I poked around there, I’d also look up the IPs of mailservers that recently sent me desired mail, and check those; any blacklist listing any non-spam server was summarily removed.) So I set up policyd-weight with this configuration file:

   @dnsbl_score = (
#    HOST,                    HIT SCORE,  MISS SCORE,  LOG NAME
    'pbl.spamhaus.org',       3.25,          0,        'DYN_PBL_SPAMHAUS',
    'sbl-xbl.spamhaus.org',   4.35,          0,        'SBL_XBL_SPAMHAUS',
    'bl.spamcop.net',         3.75,          0,        'SPAMCOP',
    'dnsbl.njabl.org',        3.25,          0,        'BL_NJABL',
    'ix.dnsbl.manitu.net',    4.35,          0,        'IX_MANITU',
    'psbl.surriel.com',       4.25,          0,        'PSBL_SURRIEL',
    'list.dnswl.org',         -100,          0,        'DNSWL_PASS',
    'ubl.unsubscore.com',     3.50,          0,        'UNSUBSCORE',
    'dnsbl-2.uceprotect.net', 2.00,          0,        'UCEPROTECT_2',
    'b.barracudacentral.org', 4.00,          0,        'BARRACUDA',
    'dnsbl.sorbs.net',        2.00,          0,        'SORBS',
    'dyna.spamrats.com',      2.00,          0,        'SPAMRATS_DYNA',
    'bl.spameatingmonkey.net',2.00,          0,        'SEM_BL',
    'bl.mailspike.net',       3.00,          0,        'MAILSPIKE-BLACK',
    'wl.mailspike.net',       -100,          0,        'MAILSPIKE-WHITE'
);

   $MAXDNSBLHITS  = 5;  # If Client IP is listed in MORE
                        # DNSBLS than this var, it gets
                        # REJECTed immediately -- set high due to whitelists on list too

   $MAXDNSBLSCORE = 9;  # alternatively, if the score of
                        # DNSBLs is ABOVE this
                        # level, reject immediately

   $MAXDNSBLMSG   = '550 Your MTA is listed in too many DNSBLs';

It’s worth mentioning that this isn’t even a good configuration. For one, the whitelist (-100 points if you’re listed) should be up top, because policyd-weight seems to stop processing DNSBLs once the threshold (a score of 9, or listing in 5 blacklists) is hit. That would also argue that you’d put your fastest / most accurate blacklists up front. Spamhaus, SpamCop, Manitu, Surriel, and Barracuda Central are all first-rate; I’d move them to the top, right after the whitelist check.
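Here’s a toy model of why the ordering matters. It assumes the behavior I described (lists checked in order, processing stops as soon as a limit is exceeded); it is not policyd-weight’s actual code, and the sender is hypothetical.

```python
# Subset of the DNSBL table above, in its original order: (host, hit score).
ORIGINAL_ORDER = [
    ("sbl-xbl.spamhaus.org", 4.35),
    ("bl.spamcop.net", 3.75),
    ("ix.dnsbl.manitu.net", 4.35),
    ("list.dnswl.org", -100),  # whitelist, checked after the blacklists
]
# Same entries with the whitelist moved to the front.
WHITELIST_FIRST = [ORIGINAL_ORDER[-1]] + ORIGINAL_ORDER[:-1]

MAX_SCORE = 9  # mirrors $MAXDNSBLSCORE
MAX_HITS = 5   # mirrors $MAXDNSBLHITS


def evaluate(lists, listed_in):
    """Assumed model: check lists in order, stop once a limit is exceeded."""
    score, hits = 0.0, 0
    for host, hit_score in lists:
        if host in listed_in:
            score += hit_score
            hits += 1
        if score > MAX_SCORE or hits > MAX_HITS:
            return "REJECT"
    return "ACCEPT"


# Hypothetical sender: whitelisted, but also on three blacklists.
sender = {host for host, _ in ORIGINAL_ORDER}

print(evaluate(ORIGINAL_ORDER, sender))   # REJECT: limit hit before the whitelist is seen
print(evaluate(WHITELIST_FIRST, sender))  # ACCEPT: the -100 is applied first
```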

You need more than 9 points to be rejected. I thought this was conservative, and might match maybe a quarter of my spam. Even if you hit only the highest-scoring DNSBLs, you’d still need to be in three of them before your mail was rejected: the threshold is 9 points and the highest single score is 4.35. You’ll also note that, towards the end, I threw in some 2-pointers. These are lists that can be a little too aggressive, but at 2 points apiece they’re safe.
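The arithmetic is quick to check. This sketch takes the blacklist hit scores from the table above (whitelists left out) and assumes, per the config comments, that the total has to be strictly above the threshold; the function name is my own.

```python
from itertools import accumulate

# Hit scores of the blacklists in the table above, in table order.
HIT_SCORES = [3.25, 4.35, 3.75, 3.25, 4.35, 4.25, 3.50,
              2.00, 4.00, 2.00, 2.00, 2.00, 3.00]
MAX_SCORE = 9  # mirrors $MAXDNSBLSCORE


def min_hits_to_reject(scores, threshold):
    """Fewest listings whose combined score is ABOVE the threshold, or None."""
    running_totals = accumulate(sorted(scores, reverse=True))
    for count, total in enumerate(running_totals, start=1):
        if total > threshold:
            return count
    return None


# Best case for a spammer-catcher: 4.35 + 4.35 = 8.70, still not above 9.
print(min_hits_to_reject(HIT_SCORES, MAX_SCORE))  # 3

# The four 2-point lists together only reach 8, so they can't reject alone.
two_pointers = [s for s in HIT_SCORES if s == 2.00]
print(min_hits_to_reject(two_pointers, MAX_SCORE))  # None
```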

I pointed a couple of my less-used domains’ MX records to this setup. They’re ones that get tons of spam but are either used not at all, or ones that have mailboxes that, in practice, don’t get much mail, and that could afford to lose a few messages to a bad configuration. The results?

I’ve rejected mail from 150 different IPs today alone. And here’s the interesting thing: 100% of spam has been rejected, with zero false positives. This is much better than I expected. I made mailboxes for some spamtraps I have, and not a single one has any mail. I sent myself email from every legitimate service I can think of, and it went right through. And actually, it not only went right through, but it came in with a negative score — policyd-weight gave a “bonus” to people with good configurations, like if a DNS lookup for their HELO string actually matched the connecting IP. And mail from GMail and Apple had the -100 points from being in dnswl.org’s list, too.

My results surely aren’t typical of real-world settings. I’d expect to eventually have some spam slip through the cracks, and I’m a little uneasy about all the checks for HELO matches, etc., that are performed, if only because I haven’t taken the time to fully understand them. But based on a week’s worth of spam to my low-traffic mailserver, this configuration is batting a thousand. I’d planned to set up postfix-policyd to do greylists / spamtraps / blacklists / HELO checks, and to set up and tune SpamAssassin for mail that was ultimately accepted. I still will someday, but right now it’d be pointless. (It’s also worth mentioning that development of policyd-weight stopped two years ago.) But if you’re getting a lot of spam, give policyd-weight a look. It’s worked better than I imagined was possible.

Polarizers

I’ve had a 55mm* polarizing filter for a while now. I bought it for my old camera, but it conveniently fit my 55-200mm lens when I went to an SLR system. The 55-200mm is my least-used lens, though; my 50mm f/1.4 prime and the 18-50mm wide-angle both see more use. Both of them need 58mm filters.

  * In discussing filters, 55mm refers to the diameter of the mount on the front of the lens, not the focal length of the lens.

I was at the mall yesterday, so I impulse-bought a 58mm circular polarizer. It just so happens that yesterday and today were bright, clear days, so I got to put it to use.

Magnolia Tree

The rich blue sky closely matches what I saw, but it wouldn’t have been possible to capture without the aid of the polarizer. Similarly, I’d have expected the details on the tree to have been lost somewhat without it.

Here are a couple less-awesome photos that illustrate the benefits. First, here’s a not-terribly-inspiring shot of a puddle Kyle and I passed while going for a walk today:

[Photo: IMG_1179 by n1zyy, on Flickr]

Here’s the same shot with the polarizer adjusted to block the glare, allowing you to see into the puddle:

[Photo: IMG_1180 by n1zyy, on Flickr]

In this particular case I think the reflected trees in the first shot actually help the photo, but this shot of a puddle isn’t going to end up on anyone’s wall either way.

Here’s a more subtle effect of polarizers:

[Photo: IMG_1126 by n1zyy, on Flickr]

This is a shot of some power/phone lines being overgrown with vines. (Yikes!) Here’s a very similar shot, in which I correctly adjusted the filter:

[Photo: IMG_1125 by n1zyy, on Flickr]

The sky looks nominally better, but the real gain is subtle: it’s in the details on the vines and the power lines. Look at the bottom line, for example: with the polarizer, you’re able to see the silver (silver-colored, at least) wire that wraps its way around, whereas it’s mostly lost in the first photo. Again, this shot won’t win any awards, and using a polarizer to capture details in power lines isn’t exactly exciting. But these sorts of small gains can be far-reaching and are worth it, I think.

I don’t have a before-and-after on this one, but this shows another great benefit of polarizers:

[Photo: Trees by n1zyy, on Flickr]

The foliage is vibrant and saturated, whereas it would be much more washed out without the polarizer.

Polarizers do have some downsides. One (which can also be a benefit in some cases!) is that you lose some light. Any sort of filter always loses some, but the whole point of polarizers is to reject certain types of light. Don’t use a polarizer if you’re shooting in low light. Also, they can introduce some distortion; this isn’t just polarizers, but anything you stick in front of your lens. ($400 lens, $20 polarizer… D’oh!) Another pet peeve I’ve noted with filters is that in just the wrong situation, they can pick up glare themselves, as the filter itself receives a ton of sunlight.

Given that you can pick up a mediocre-enough polarizer for $20 or so, you probably owe it to yourself to pick one up if you do much outdoor photography.

Spamhaus adds a RHSBL

I’ve been tinkering in my spare time with setting up a killer mailserver; something that will hopefully fend off spam with great accuracy, yet never at the expense of rejecting legitimate mail. One great weapon in the fight has always been Spamhaus, which runs a series of extremely high-quality blacklists.

It turns out that they’ve recently launched their DBL, or Domain Block List, a form of RHSBL, allowing you to query domains for spamminess instead of looking up IPs. There’ve been a bunch of lists offering this, but DBL should bring the usual Spamhaus professionalism to the table. Yay!
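For comparison with IP-based lists, a domain lookup doesn’t involve reversing anything: you just put the domain in front of the list’s zone. A minimal sketch, where the function name is mine and only the query name is built (how to interpret the answers is left to Spamhaus’s own documentation):

```python
def rhsbl_query_name(domain: str, zone: str = "dbl.spamhaus.org") -> str:
    """Build the DNS name to look up for a domain-based (right-hand-side) list."""
    return domain.strip(".").lower() + "." + zone


print(rhsbl_query_name("example.com"))
# example.com.dbl.spamhaus.org
```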

Programming Tip

This is probably so ludicrously obvious that it goes without saying but, in my experience, it really needs to be said:

When building your architecture, you should design it so that things you do every day are easy.

For example, if you have an often-used object, don’t split it into two halves and require that you join on them. There are lots of arguments for not having a 50-column table, and there was a clear distinction between the public side of a user and the private side of a user, but splitting one simple concept in half has caused lots and lots of pain. If it’s a core “thing” on your site, make it easy to manipulate. For example, the question, “How many users signed up today?” should not require joining two tables, nor should finding out whether a user on the site is active or not.
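To make the pain concrete, here’s a rough sketch using Python’s built-in sqlite3. The schema, column names, and rows are all hypothetical (not our real tables); the point is only that the same everyday question needs a join in the split design and doesn’t in the single-table one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Split design: one concept spread over two tables (hypothetical schema).
    CREATE TABLE user_public  (user_id INTEGER PRIMARY KEY, name TEXT, active INTEGER);
    CREATE TABLE user_private (user_id INTEGER PRIMARY KEY, email TEXT, signed_up_on TEXT);
    -- Single-table design holding the same data.
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, active INTEGER,
                        email TEXT, signed_up_on TEXT);
""")

people = [
    (1, "ann", 1, "ann@example.com", "2010-03-01"),
    (2, "bob", 0, "bob@example.com", "2010-03-01"),
    (3, "cat", 1, "cat@example.com", "2010-02-27"),
]
for uid, name, active, email, day in people:
    conn.execute("INSERT INTO user_public VALUES (?, ?, ?)", (uid, name, active))
    conn.execute("INSERT INTO user_private VALUES (?, ?, ?)", (uid, email, day))
    conn.execute("INSERT INTO users VALUES (?, ?, ?, ?, ?)", (uid, name, active, email, day))

TODAY = "2010-03-01"  # fixed date so the example is repeatable

# "How many active users signed up today?" with the split tables: a join.
split_count = conn.execute(
    "SELECT COUNT(*) FROM user_public p "
    "JOIN user_private s ON s.user_id = p.user_id "
    "WHERE p.active = 1 AND s.signed_up_on = ?", (TODAY,)).fetchone()[0]

# Same question against one table: no join needed.
single_count = conn.execute(
    "SELECT COUNT(*) FROM users WHERE active = 1 AND signed_up_on = ?",
    (TODAY,)).fetchone()[0]

print(split_count, single_count)
```

Both queries return the same answer; one of them is just more work every single day.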

When complexity doesn’t come under the guise of optimization, it sometimes creeps in as permitting new capabilities. On another product we moved the “email” column out and created an Email object, allowing users to have multiple emails. In more than a year’s time, nothing has ever been done to actually assign additional emails. All it means is that instead of User.find_by_email('abc@example.com'), I need to do Email.find_by_email('abc@example.com').user.

And here’s what got me annoyed enough to start this post: you shouldn’t need lots of conditions to show your base case of things. Right now I’m working on a bit of code to pull back all the posts on the blogs. It seems I need to check a whole bunch of columns: WHERE post_status = 'publish' AND (post_password IS NULL OR post_password = '') AND post_date < NOW()

Sorry, but that’s dumb. Sure, you can bundle it away and give a nice simple method that does all that stuff for you, but hiding insane designs behind simple methods doesn’t actually fix them. It all comes down to the premise I began with:

When building your architecture, you should design it so that things you do every day are easy.

Displaying all the posts on a blog is a thing you do every day. So is getting the number of members who signed up today. So is looking up a user by email. So don’t make them hard.
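A side note on the post query from earlier, because it’s one more way the pile of conditions bites you: in SQL, AND binds more tightly than OR, so the password check has to be wrapped in parentheses or the filter quietly changes meaning. Here’s a rough sketch using Python’s built-in sqlite3. The table is a cut-down, hypothetical version with made-up rows, and a fixed timestamp stands in for NOW(), which SQLite doesn’t have.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY, post_status TEXT, "
             "post_password TEXT, post_date TEXT)")
conn.executemany("INSERT INTO posts VALUES (?, ?, ?, ?)", [
    (1, "publish", None,     "2010-01-01 00:00:00"),  # should be shown
    (2, "publish", "",       "2010-01-02 00:00:00"),  # should be shown
    (3, "draft",   "",       "2010-01-03 00:00:00"),  # draft: should be hidden
    (4, "publish", None,     "2999-01-01 00:00:00"),  # future-dated: should be hidden
    (5, "publish", "secret", "2010-01-04 00:00:00"),  # password-protected: hidden
])

NOW = "2010-06-01 00:00:00"  # stand-in for NOW()

# Without parentheses this parses as (status AND pw IS NULL) OR (pw = '' AND date).
UNGROUPED = ("SELECT id FROM posts WHERE post_status = 'publish' "
             "AND post_password IS NULL OR post_password = '' "
             "AND post_date < ? ORDER BY id")

# With parentheses all three conditions apply to every row.
GROUPED = ("SELECT id FROM posts WHERE post_status = 'publish' "
           "AND (post_password IS NULL OR post_password = '') "
           "AND post_date < ? ORDER BY id")


def visible_ids(sql):
    """Run one of the queries above and return the matching post ids."""
    return [row[0] for row in conn.execute(sql, (NOW,))]


print(visible_ids(UNGROUPED))  # [1, 2, 3, 4]: the draft and the future post leak through
print(visible_ids(GROUPED))    # [1, 2]
```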

Things Every Geek Needs

I tend to like comparing working with computers to working with cars. I’m not sure why. I think it probably has to do with the fact that everyone has a vague idea of what mechanics do, but computers are often seen as a black magic.

So here’s a list of things that I’ve found handy to have in my “garage,” because you never know when you’re going to need them:

  • Extra cables. USB, power, and Ethernet, at least.
  • Extra USB peripherals, especially a keyboard.
  • A USB CD drive.
  • A USB-to-ATA and USB-to-SATA adapter. You use it once and it’s instantly worth it. I have a bunch of old hard drives, and I can just throw this little adapter into the back of the bare drive, use the included power adapter, and I’ve got a “USB” hard drive made out of an internal drive. Don’t consider buying one without SATA support or it’ll be obsolete.
  • A copy of the Ultimate Boot CD. It’s ancient (mostly DOS-based), so it sometimes has a hard time seeing a 2TB SATA disk connected off of USB or going through an SAS controller, but kind of like the USB-to-(S)ATA adapter, if you use it once you’ll sing its praises forever. It’s bailed me out repeatedly, and does everything from testing drives to checking RAM to doing CPU load-testing… Oh, and I recovered (!) a totally-destroyed boot sector after a botched OS upgrade once. I was flipping out trying to figure out if I’d managed to screw things up for good, and I ran one little tool that just fixed everything. I believe it has some nifty utilities for things like resetting Windows passwords, too, though it’s been ages since I used them and I’d be surprised if they worked on modern systems.
  • A Linux live CD. I like Ubuntu, just because it’s easy and works on most everything. (Knoppix is an old favorite too.) It’s not for installing over things (although that’s cool too…); it’s for rescuing data. An Ubuntu Live CD will speak many more file formats than Windows. Boot a messed-up machine from this, and use your USB-to-(S)ATA adapter to copy files over to an external disk… And since it boots to a full OS and not just a rescue shell, you can do things like get it to use your wireless NIC so you can use Firefox to look up information while you’re working. (And an added bonus: use it to verify whether your NIC is bad or it’s just your OS install that won’t see it… Unless, of course, Ubuntu’s Live CD doesn’t support it, but it’s 3 for 3 right now.)
  • A set of screwdrivers. Big and small. Mostly small.
  • Some Torx screwdrivers. I held out for a long time, and eventually bought a cheap set at Radio Shack. I wish I’d done it much sooner. It turns out a lot of things use Torx screws.
  • A whole, unformatted hard drive with huge capacity. Back everything up if things get scary, whether it’s because a drive is clicking or because you’re doing a major OS upgrade. It’s really worth the money to keep a 1TB+ drive that you never use. (And with the USB-to-(S)ATA adapter, you can get a cheaper internal drive, even.)

I used to keep a thumb drive with handy Windows utilities, too, but I haven’t done much with Windows lately. It had things like a bunch of SysInternals tools, CCleaner, Defraggler, and Recuva… Portable Firefox, and trial installers for anti-virus software. Revo Uninstaller. Back in the day I had Ad-Aware, too; not sure if it’s useful these days or not. Ninite is cool but not really meant for a thumb-drive. Actually, pretty much everything in Lifehacker’s How to Fix Your Relatives’ Terrible Computer is really good. Photorec is super-obscure and not easy to use for non-geeks, but it does its job amazingly well.