The Purpose of the EBL

The Email Blocklist (EBL) is intended to filter spam that is sent from IP addresses and domains that cannot be blocked without causing significant numbers of false positives. The EBL was originally designed to list contact email addresses (or drop boxes).

The initial target of the EBL was "Nigerian" 419 advance fee fraud spam, most of which relies exclusively on drop boxes to provide recipients with a means to respond to the spammer's offer. Other types of spam that also make heavy use of drop boxes were added later, as research indicated that the email addresses in these spams were active and could be listed without impacting innocent users. These types include spam from manufacturers of inexpensive high tech and light industrial products in China, who use mostly free email addresses and rarely provide website URLs; providers of SEO/web development services; providers of direct spam services; list sellers; and a number of botnet spam operations selling a variety of mostly illegal or fraudulent goods and services.

Pattern matching filters have been the best defense against these types of spam, but any such filter that catches spam also catches some legitimate, non-spam email. These filters also require significant amounts of computing resources, slowing delivery.

When a pattern matching filter is used to observe email to spamtrap email addresses, however, the risks posed by false positives are minimal. The filter analyzes incoming spam, identifying and reporting drop boxes to the EBL listing engine. The listing engine vets incoming reports against a whitelist, and checks existing reports for the same hash. When it has seen a sufficient number of spam reports against a drop box, it lists that drop box.
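The flow described above (whitelist check, lookup of earlier reports for the same hash, listing once enough reports have accumulated) can be sketched roughly as follows. This is only an illustration, not the actual EBL listing engine: the class name `ListingEngine`, the `threshold` value, the in-memory counters, and the lowercase normalization of addresses are all assumptions made for the example.

```python
import hashlib
from collections import Counter


class ListingEngine:
    """Hypothetical sketch of a report-counting listing engine."""

    def __init__(self, whitelist, threshold=3):
        # Addresses that must never be listed (assumed exact-match whitelist).
        self.whitelist = {addr.strip().lower() for addr in whitelist}
        # Number of spam reports required before listing (assumed value).
        self.threshold = threshold
        self.report_counts = Counter()  # hash -> number of reports seen
        self.listed = set()             # hashes currently listed

    def report(self, address):
        """Record one spam report; return True if the drop box is listed."""
        normalized = address.strip().lower()  # normalization is an assumption
        if normalized in self.whitelist:
            return False  # vetted against the whitelist, never listed
        digest = hashlib.sha1(normalized.encode("utf-8")).hexdigest()
        # Check existing reports for the same hash and add this one.
        self.report_counts[digest] += 1
        if self.report_counts[digest] >= self.threshold:
            self.listed.add(digest)
        return digest in self.listed


engine = ListingEngine(whitelist=["postmaster@example.org"], threshold=2)
print(engine.report("dropbox@example.net"))     # False: only one report so far
print(engine.report("DropBox@example.net"))     # True: threshold reached
print(engine.report("postmaster@example.org"))  # False: whitelisted
```

A production engine would presumably persist its counters and apply further sanity checks; the sketch only shows the order of the checks.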

Creating scripts to generate automated listings for the EBL is possible, feasible, and relatively easy because most spam drop boxes meet certain criteria:

The EBL also contains a small number of manual listings that meet the criteria but are not easily detected by automated methods. Most of these email addresses are used by specific spammers or spam operations that use URLs in their spam, but change their IP addresses and domains more frequently than they do their contact email addresses. These spammers can be detected by those email addresses more reliably than by IPs or domains.

Finally, the EBL contains a test entry, the SHA1 hash of noemail@example.com.
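As a rough illustration, the hash for the test entry could be computed like this. The helper name `ebl_hash` is hypothetical, and trimming whitespace and lowercasing the address before hashing is an assumption of this sketch; the exact normalization the EBL expects should be confirmed against the implementation notes.

```python
import hashlib


def ebl_hash(address: str) -> str:
    """Return the hex SHA1 digest of an email address.

    Whitespace trimming and lowercasing are assumptions of this sketch.
    """
    normalized = address.strip().lower()
    return hashlib.sha1(normalized.encode("utf-8")).hexdigest()


# Hash of the test entry; the result is a 40-character hex string.
print(ebl_hash("noemail@example.com"))
```

Comparing the printed value against a lookup of the test entry is a simple way to check that a local hashing routine agrees with the list.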

The relatively straightforward identification of drop box email addresses, combined with many layers of error checking, has resulted in an observed false positive rate of near zero for the past several months. The few false positives that have been identified prompted improvements to the trap server listing algorithms and the EBL listing engine, particularly the sanity checks and whitelist. Those improvements have driven the error rate even closer to zero.

The EBL team understands that beta testers will be cautious. However, most of us have long since switched from filter-and-tag to outright blocking on EBL hits. We are confident that other users will do the same after observing the EBL in action.