Spammers vs Free Speech: Spamassassin and Amavis

Monday, March 19, 2007

Spamassassin and Amavis

After resisting for years, I've taken the second step down the slippery slope of content filtering. My first lines of spam defense will continue to be source blocking and SMTP mistake-catching. But that only gets you so far. The criminals who break into legitimate web hosts get through. The only way to catch them is to analyze the messages.

The first step was Postfix's header_checks and body_checks. They stop some of the most obvious stuff. But Postfix warns you not to get carried away, and each pattern is tested against one line at a time, so you can't combine different checks. "If it says it's from PayPal but it wasn't sent from their IP space" is too complex.
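To show what I mean by "too complex", here's a rough Python sketch of that two-condition test. It's an illustration, not anything Postfix can run: the function names are mine, and the address range standing in for PayPal's IP space is a documentation-only block, not their real one.

```python
import ipaddress
from email.utils import parseaddr

# ASSUMPTION: placeholder range (a documentation-only block), NOT PayPal's
# real IP space. A real table would have to be maintained by hand.
CLAIMED_SENDER_NETWORKS = {
    "paypal.com": [ipaddress.ip_network("192.0.2.0/24")],
}


def from_domain(from_header):
    """Return the lowercased domain of a From: header value, or '' if none."""
    _name, address = parseaddr(from_header)
    if "@" not in address:
        return ""
    return address.rsplit("@", 1)[1].lower()


def looks_forged(from_header, client_ip):
    """True if the From: domain is one we track and the client IP is outside it.

    This needs two facts at once (a header and the connecting IP), which a
    single line-by-line pattern match cannot see together.
    """
    networks = CLAIMED_SENDER_NETWORKS.get(from_domain(from_header))
    if networks is None:
        return False  # not a domain we have an opinion about
    ip = ipaddress.ip_address(client_ip)
    return not any(ip in network for network in networks)
```

The point is only that the decision joins information from two different places, which is exactly what a pattern table can't do.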

The second step is a big one. We set up a special local server, amavisd-new, that Postfix can consult as it decides whether to accept a message. This evaluation has to happen fast, while the sender (client) is waiting for the receiver's (server's) decision. Once you accept the message into your delivery queue, it's too late to refuse it. You can't return it, because once it's yours you don't really know where it came from. The client is long gone, and the "From:" address in spam is always a lie.

amavisd-new's biggest module is Spamassassin, a collection of thousands of little "tests" that can be intricately selected, combined, and scored. amavisd-new considers Spamassassin's opinion of the message and advises Postfix to refuse the spammiest ones. It leaves marks on the messages it accepts, so that the final recipients can sort them as they're delivered. A very cool contraption. Each part is carefully and independently maintained.
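The scoring idea is easier to see in miniature. This Python sketch is not how Spamassassin is implemented, and the four test names, their scores, and both thresholds are invented for illustration. It only shows the shape: every test that hits adds its score, and the total decides whether the message passes, gets marked, or is refused.

```python
import re

RAW_IP_URL = re.compile(r"https?://\d{1,3}(\.\d{1,3}){3}")


def mentions_pills(msg):
    return "cheap pills" in msg.lower()


def shouting_subject(msg):
    # True if the first Subject: line is written entirely in capitals.
    for line in msg.splitlines():
        if line.startswith("Subject:"):
            return line[len("Subject:"):].strip().isupper()
    return False


def raw_ip_link(msg):
    return RAW_IP_URL.search(msg) is not None


def has_message_id(msg):
    return "message-id:" in msg.lower()


# ASSUMPTION: hypothetical tests and scores, not real Spamassassin rules.
# A negative score counts in the message's favor.
TESTS = [
    ("MENTIONS_PILLS", mentions_pills, 3.0),
    ("SHOUTING_SUBJECT", shouting_subject, 1.5),
    ("RAW_IP_LINK", raw_ip_link, 2.5),
    ("HAS_MESSAGE_ID", has_message_id, -0.5),
]

TAG_LEVEL = 2.0     # assumed: at or above this, accept but mark the message
REJECT_LEVEL = 6.0  # assumed: at or above this, advise the MTA to refuse


def evaluate(msg):
    """Return (total score, verdict, names of the tests that hit)."""
    hits = [(name, score) for name, test, score in TESTS if test(msg)]
    total = sum(score for _name, score in hits)
    if total >= REJECT_LEVEL:
        verdict = "reject"
    elif total >= TAG_LEVEL:
        verdict = "tag"
    else:
        verdict = "pass"
    return total, verdict, [name for name, _score in hits]
```

No single test decides anything. A shouted subject alone stays under the marking threshold, but stacked with the other two it crosses the refusal line.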

It's software for professionals; the "documentation" is great reference material but scant on tutorial. There are lots of ways to put the pieces together, and the maintainers of each piece have rather little to say about them. They're responsible for their respective pieces, but you're responsible for your contraption.

I have the O'Reilly Spamassassin book and the No Starch Postfix book (they're both pretty good) and I still had to ask for help. Someone on the Debian-ISPs list sent me exactly the clue I needed, immediately: amavisd will only mark up messages destined for "local" recipients. Somewhere in amavisd-new's documentation they tell you that. It's what the @local_domains_maps variable is about, and it's in the sample config file:

@local_domains_maps list of lookup tables are used in deciding whether a
recipient is local or not, or in other words, if the message is outgoing
or not. This affects inserting spam-related headers for local recipients,
limiting recipient virus notifications (if enabled) to local recipients,
in deciding if address extension may be appended, and in SQL lookups
for non-fqdn addresses. Set it up correctly if you need features
that rely on this setting (or just leave empty otherwise).

Well, that clears everything up.

Spamassassin itself is distributed through the amazing Comprehensive Perl Archive Network. Perl just gets it for you. Perl does complicated things nearly as efficiently as code you'd write yourself in a lower-level compiled language, but there are a lot of tests, so Spamassassin is big and slow. You can run Postfix on any old PC, but you need a modern CPU and lots of RAM to run this contraption. I'm going through and disabling the tests that duplicate checks I already do in Postfix.
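Back to @local_domains_maps for a moment, since it's the part that tripped me up. Here's a small Python sketch of the decision as I understand it from the sample config comment above. The function names and the example domain are hypothetical, and the matching rule (a leading dot covers the domain and its subdomains) is a simplification I'm assuming for the sketch, not a copy of amavisd's lookup code.

```python
def is_local(recipient, local_domains):
    """Decide whether a recipient address belongs to one of our domains.

    ASSUMPTION: an entry with a leading dot (".example.com") matches the
    domain itself and any subdomain; any other entry must match exactly.
    """
    domain = recipient.rsplit("@", 1)[-1].lower()
    for entry in local_domains:
        entry = entry.lower()
        if entry.startswith("."):
            if domain == entry[1:] or domain.endswith(entry):
                return True
        elif domain == entry:
            return True
    return False


def mark_up(headers, recipient, score, local_domains):
    """Return a copy of headers, with a spam header added for local recipients only."""
    marked = list(headers)
    if is_local(recipient, local_domains):
        # Header name chosen for illustration.
        marked.append(f"X-Spam-Score: {score:.1f}")
    return marked
```

With an empty list, nobody counts as local in this sketch and no message gets marked, which matches the symptom that sent me to the mailing list.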

# posted by 0.3E9m/s @ 1:46 PM
