Thursday, June 30, 2005

Grey List Experiment: Spam Grand Slam

For the last day and a half I've been experimenting with Jef Poskanzer's Graymilter mail filter. This filter, using a technique called "greylisting" originally described by Evan Harris in 2003, exploits the fact that most spam-sending robots do not fully implement SMTP--in particular, they fail to handle a transient failure status (450) and re-send the mail later as required by the protocol.

Grey listing takes advantage of this noncompliance by answering the first attempt by any IP address to send mail with a 450 failure (unless the address has been explicitly named in a white list). When a rejection is sent, the IP address is placed on a waiting list; some time later (25 minutes by default), it is moved to a temporary white list and permitted to send mail. Any legitimate mail client will therefore, after the initial rejection, eventually deliver the mail. Once an address is on the provisional white list, connections from it will be accepted for two days, so mail from regular correspondents will not be delayed.
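The decision logic just described can be sketched in a few lines of Python. This is only an illustration of the idea, not Graymilter's actual code: the class name `Greylist`, its fields, and the `check` method are all hypothetical, the reply code 250 simply stands for "accept", and the sketch promotes an address to the provisional white list when it retries after the delay rather than on a timer. It also assumes the two-day window is counted from the moment of promotion.

```python
from dataclasses import dataclass, field

RETRY_DELAY = 25 * 60          # seconds an address must wait (25 minutes, the default cited above)
WHITELIST_TTL = 2 * 24 * 3600  # seconds a provisional entry stays valid (two days)


@dataclass
class Greylist:
    """Hypothetical greylisting state; not Graymilter's real data structures."""
    permanent: set = field(default_factory=set)      # explicit white list
    first_seen: dict = field(default_factory=dict)   # ip -> time of first rejected attempt
    provisional: dict = field(default_factory=dict)  # ip -> time the entry expires

    def check(self, ip: str, now: float) -> int:
        """Return 250 to accept the connection or 450 to ask the sender to retry later."""
        if ip in self.permanent:
            return 250

        expiry = self.provisional.get(ip)
        if expiry is not None:
            if now < expiry:
                return 250            # still on the provisional white list
            del self.provisional[ip]  # entry expired; start over below

        seen = self.first_seen.get(ip)
        if seen is None:
            self.first_seen[ip] = now  # first contact: remember it and reject
            return 450
        if now - seen < RETRY_DELAY:
            return 450                 # retried too soon: keep rejecting

        # Retried after the delay: promote to the provisional white list.
        # Assumption: the two-day window starts at promotion and is not refreshed.
        del self.first_seen[ip]
        self.provisional[ip] = now + WHITELIST_TTL
        return 250
```

With this sketch, a sender that never retries is never accepted, while a compliant mail server is rejected once, accepted on any retry made at least 25 minutes later, and then accepted without delay for the following two days.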

I installed the grey list filter about 36 hours ago, and its impact has been dramatic. In the first full 24 hours it has been in production, the total number of spam messages received and discarded by Annoyance Filter has fallen from a mean of about 170 per day to fewer than 30. (Note that this total is the cumulative effect of both grey listing and the greeting and hello delays I implemented previously. A naïve calculation suggests that each accounts for about half of the reduction in spam, but to be sure, the filters should be tested in isolation from one another.)
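For a sense of scale, the combined reduction can be worked out from the two figures quoted above. This is only back-of-the-envelope arithmetic on the approximate daily counts; it says nothing about how the reduction splits between grey listing and the greeting and hello delays, for which no separate numbers are given here.

```python
before = 170  # approximate mean spam messages per day before the filters (from the text)
after = 30    # upper bound on messages per day afterwards (from the text)

# Since "after" is an upper bound, this is a lower bound on the reduction.
reduction = (before - after) / before
print(f"combined reduction: at least {reduction:.0%}")
```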

Reducing the raw volume of incoming spam, before filters, to about 30 messages a day represents a roll-back of the current state of the Internet slum to the situation about five years ago: here's a situation where regress is progress! The only downside of grey listing is the delay in receiving mail from new correspondents while their IP addresses percolate onto the dynamic white list pending the next re-delivery attempt. I approach this philosophically: "It's only E-mail; who cares?" Looked at through these glasses ("If it mattered, they'd have sent a FAX"), filtering E-mail becomes a more tractable undertaking.

Posted at June 30, 2005 00:08