In Part 1 I set up spam filtering for the Postfix mail server with SpamAssassin, following the default Debian SpamAssassin instructions.
I don't want to waste any more resources than absolutely necessary on spam. "Resources" includes processing time and disk space, but also human attention, like the time it takes to check through your spam folder if you suspect something may have ended up there. Therefore, I'd like to keep as much spam as possible off the system entirely.
If I receive a spam email, I have three options for not keeping it.
The simplest, but least polite, option is to delete any email that gets a high enough spam score. However, email is meant to be a reliable delivery service, as described in the email standards document RFC 5321 §6:
When the receiver-SMTP accepts a piece of mail (by sending a "250 OK" message in response to DATA), it is accepting responsibility for delivering or relaying the message. It must take this responsibility seriously. It MUST NOT lose the message for frivolous reasons, such as because the host later crashes or because of a predictable resource shortage.
[...]
As discussed in Section 7.8 and Section 7.9 below, dropping mail without notification of the sender is permitted in practice. However, it is extremely dangerous and violates a long tradition and community expectations that mail is either delivered or returned. If silent message-dropping is misused, it could easily undermine confidence in the reliability of the Internet's mail systems.
So, don't do that.
The second option is to bounce the email: accept it, then send a non-delivery notification back to the sender. Again from RFC 5321 §6:
Utility and predictability of the Internet mail system requires that messages that can be delivered should be delivered, regardless of any syntax or other faults associated with those messages and regardless of their content. If they cannot be delivered, and cannot be rejected by the SMTP server during the SMTP transaction, they should be "bounced" (returned with non-delivery notification messages) as described above.
If you've ever received an email titled "Delivery failure" or similar, this is a bounce message.
The trouble with bounces is that spammers often use hijacked computers to send their emails, and lie about the return address. So if you've ever received an email titled "Delivery failure" for an email you never sent, this is because a spammer tried to send an email to someone, and put your email address as the sender. When the recipient's system bounced the email, the bounce notification got delivered to you.
As the Debian SpamAssassin Notes point out:
The problem is, spammers (and viruses) routinely forge the from address on the envelope. This means that if there is a bounce generated, it will go to this address, which can be randomly generated, or worse, an innocent third party.
Therefore, it is very important that your system doesn't generate a bounce.
Similarly, from the Postfix content filter instructions:
NOTE: in this time of mail worms and spam, it is a BAD IDEA to send known viruses or spam back to the sender, because that address is likely to be forged.
So, don't do that either.
The third option is to reject the email without accepting it in the first place.
The way email works is that the sender's system connects to the recipient's system, and passes all the email data to it. As described above, at the end of this process the recipient's system says "OK", and at that point it takes responsibility for the email. However, it also has the option of saying "No", for example if the email is too large, or if the recipient's mailbox has reached its quota limit. (Remember mailbox quotas? They still exist in some places.)
Then it's the sending system's responsibility to handle the problem of the email not being delivered - and it knows where the email really came from.
This is the best way to handle emails you want to keep off your system.
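To make that accept-or-reject decision concrete, here is a minimal Python sketch of the reply a receiving server might give during the SMTP transaction. It is only an illustration: the function name, the size limit and the score threshold are hypothetical values I made up, not Postfix settings, and a real server makes these checks at several different stages. The reply codes themselves are standard SMTP ones.

```python
# Hypothetical limits, chosen purely for illustration.
MAX_MESSAGE_BYTES = 10 * 1024 * 1024
REJECT_SCORE = 15.0


def smtp_reply(message_bytes, spam_score, mailbox_full=False):
    """Return the SMTP reply a receiver might send for one message.

    A 2xx reply means the receiver has accepted responsibility for the
    message. A 5xx reply means it never accepted it, so the sending
    system is still responsible for telling the real sender.
    """
    if message_bytes > MAX_MESSAGE_BYTES:
        return "552 5.3.4 Message too big"
    if mailbox_full:
        return "552 5.2.2 Mailbox full"
    if spam_score >= REJECT_SCORE:
        return "550 5.7.1 Message rejected as spam"
    return "250 2.0.0 OK"


if __name__ == "__main__":
    print(smtp_reply(2048, spam_score=1.2))
    print(smtp_reply(2048, spam_score=22.0))
```

The point is that the spam check has to happen before the "250" is sent. Once that reply has gone out, the only remaining choices are the two bad ones above: drop the email or bounce it.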
The problem is that, as I was trying to figure out how to make the content filter tell Postfix to reject emails with a high enough spam score, I learned that Postfix's content filters are applied after the email has already been accepted by the mail server. At that point, it's too late to reject it. Therefore, I need to find a different way to configure SpamAssassin to work with Postfix.
A milter (mail filter) is a type of external program that some mail servers can use to implement custom filtering mechanisms. Postfix has support for milters, and can use them early enough in the process pipeline that they can be used to reject emails.
Further, there is a milter wrapper for SpamAssassin, spamass-milter, that allows SpamAssassin to be used as one. Great!
...except for the problem of too many levels of indirection.
One thing about how milters work is that they cannot be run on-demand, but need to run as a daemon that the mail server connects to.
You might remember from Part 1 that SpamAssassin is also best run as a daemon (spamd) so that its slow startup can be done only once, ahead of time.
Given this, it would be reasonable to expect that spamass-milter would integrate SpamAssassin into itself to absorb this cost.
Sadly, reasonable expectations often appear to be something of an unattainable luxury in this kind of endeavour.
Therefore, to use spamass-milter, you need to have both the spamass-milter and spamd daemons running all the time.
Worse, spamass-milter can't even connect directly to spamd itself, but needs to run the external spamc program to do so on its behalf.
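So, rather than just have the mail server talk to SpamAssassin directly, it instead connects to the spamass-milter daemon, which runs spamc, which in turn connects to the spamd daemon.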
I realise that part of this is just a consequence of the way milters work, but still, it seems like a lot more moving parts than should be necessary.
Aesthetic considerations about the elegance of the solution aside, once the milter was set up as described in the Debian SpamAssassin Postfix Milter instructions, the system was classifying spam as before, but rejecting any sufficiently spammy emails before they were accepted. Hurrah!
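As a rough illustration of the last hop in that chain, and of the reject decision, here is a Python sketch that pipes a message through spamc and compares the score with a threshold. This is not how spamass-milter is implemented internally. The function names and the threshold of 15 are my own hypothetical choices. The sketch assumes spamc is installed, that spamd is running, and that spamc's -c (check only) option prints a "score/threshold" line.

```python
import subprocess

REJECT_SCORE = 15.0  # hypothetical reject threshold, for illustration only


def parse_spamc_check(output):
    """Parse a 'score/threshold' line, the format assumed for `spamc -c`."""
    score, threshold = output.strip().split("/", 1)
    return float(score), float(threshold)


def check_with_spamc(raw_message):
    """Pipe a raw message (bytes) through spamc, which passes it to spamd."""
    # check=False: spamc -c uses a non-zero exit status to signal "spam",
    # so a non-zero status is not treated as a failure here.
    result = subprocess.run(
        ["spamc", "-c"],
        input=raw_message,
        stdout=subprocess.PIPE,
        check=False,
    )
    return parse_spamc_check(result.stdout.decode("ascii"))


def should_reject(score, reject_score=REJECT_SCORE):
    """Reject only mail scoring at or above the (hypothetical) threshold."""
    return score >= reject_score
```

Note that there are two thresholds in play here: the one SpamAssassin uses to classify an email as spam, and a separate, higher one used to decide whether to reject it outright. Emails that score between the two are still accepted and classified as before.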
After the work it took to get here, that's good enough for now.
But... we're not done yet. Stick around for Part 3! Well, don't stick around - given how long it took me to get this post out you might be in for a bit of a wait. Go do something else, and maybe check back in a week or so. There might even be a Part 2b first.
posted at: 09:57