Spamalytics: An Empirical Analysis of Spam Marketing Conversion
DOI: 10.1145/1562164.1562190
By Chris Kanich, Christian Kreibich, Kirill Levchenko, Brandon Enright, Geoffrey M. Voelker,
Vern Paxson, and Stefan Savage
Abstract
The “conversion rate” of spam—the probability that an
unsolicited email will ultimately elicit a “sale”—underlies
the entire spam value proposition. However, our understanding of this critical behavior is quite limited, and the
literature lacks any quantitative study concerning its true
value. In this paper we present a methodology for measuring
the conversion rate of spam. Using a parasitic infiltration of
an existing botnet’s infrastructure, we analyze two spam
campaigns: one designed to propagate a malware Trojan,
the other marketing online pharmaceuticals. For nearly a
half billion spam emails we identify the number that are
successfully delivered, the number that pass through popular antispam filters, the number that elicit user visits to the
advertised sites, and the number of “sales” and “infections”
produced.
1. Introduction
Spam-based marketing is a curious beast. We all receive
the advertisements—“Excellent hardness is easy!”—but
few of us have encountered a person who admits to following through on this offer and making a purchase. And yet,
the relentlessness with which such spam continually clogs
Internet inboxes, despite years of energetic deployment of
antispam technology, provides undeniable testament that
spammers find their campaigns profitable. Someone is
clearly buying. But how many, how often, and how much?
Unraveling such questions is essential for understanding
the economic support for spam and hence where any structural weaknesses may lie. Unfortunately, spammers do not
file quarterly financial reports, and the underground nature
of their activities makes third-party data gathering a challenge at best. Absent an empirical foundation, defenders are
often left to speculate as to how successful spam campaigns
are and to what degree they are profitable. For example,
IBM’s Joshua Corman was widely quoted as claiming that
spam sent by the Storm worm alone was generating
“millions and millions of dollars every day.” 1 While this claim
could in fact be true, we are unaware of any public data or
methodology capable of confirming or refuting it.
The key problem is our limited visibility into the three
basic parameters of the spam value proposition: the cost to
send spam, offset by the “conversion rate” (probability that
an email sent will ultimately yield a “sale”), and the marginal
profit per sale. The first and last of these are self-contained
and can at least be estimated based on the costs charged by
third-party spam senders and through the pricing and gross
margins offered by various Internet marketing “affiliate
programs.”a However, the conversion rate depends fundamentally on group actions—on what hundreds of millions
of Internet users do when confronted with a new piece of
spam—and is much harder to obtain. While a range of anecdotal numbers exist, we are unaware of any well-documented
measurement of the spam conversion rate.b
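To make the relationship among these three parameters concrete, the following is a minimal sketch of the arithmetic, not a model taken from the study itself. It assumes a campaign is profitable when expected sales times marginal profit per sale exceeds the sending cost. The function names are illustrative, and the $100 sale price is a purely hypothetical figure; the $80-per-million delivery cost and the 45% commission are taken from the rough estimates in footnote a.

```python
def break_even_conversion_rate(cost_per_million_usd, profit_per_sale_usd):
    """Smallest conversion rate at which sales revenue covers the sending cost."""
    if profit_per_sale_usd <= 0:
        raise ValueError("profit per sale must be positive")
    cost_per_email = cost_per_million_usd / 1_000_000
    return cost_per_email / profit_per_sale_usd


def campaign_profit(emails_sent, conversion_rate,
                    cost_per_million_usd, profit_per_sale_usd):
    """Expected profit: (emails * conversion rate * profit per sale) - sending cost."""
    expected_sales = emails_sent * conversion_rate
    sending_cost = emails_sent * cost_per_million_usd / 1_000_000
    return expected_sales * profit_per_sale_usd - sending_cost


# Assumed inputs: $80 per million emails and a 45% commission (footnote a),
# applied to a HYPOTHETICAL $100 sale price, giving $45 profit per sale.
COST_PER_MILLION = 80.0
PROFIT_PER_SALE = 0.45 * 100.0

rate = break_even_conversion_rate(COST_PER_MILLION, PROFIT_PER_SALE)
print(f"break-even conversion rate: {rate:.2e}")
```

Under these assumed inputs the break-even point is on the order of a couple of sales per million emails sent. This illustrates why the conversion rate is the decisive unknown: the other two parameters can be estimated from published prices, but whether a campaign clears this threshold cannot.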
In part, this problem is methodological. There are no
apparent methods for indirectly measuring spam conversion. Thus, the only obvious way to extract this data is to
build an e-commerce site, market it via spam, and then
record the number of sales. Moreover, to capture the spammer’s experience with full fidelity, such a study must also
mimic their use of illicit botnets for distributing email and
proxying user responses. In effect, the best way to measure
spam is to be a spammer.
In this paper, we have effectively conducted this study,
though sidestepping the obvious legal and ethical problems
associated with sending spam.c Critically, our study makes
use of an existing spamming botnet. By infiltrating the botnet parasitically, we convinced it to modify a subset of the
spam it already sends, thereby directing any interested
recipients to Web sites under our control, rather than those
belonging to the spammer. In turn, our Web sites presented
“defanged” versions of the spammer’s own sites, with functionality removed that would compromise the victim’s system or receive sensitive personal information such as name,
address or credit card information.
Using this methodology, we have documented three
spam campaigns comprising over 469 million emails. We
identified how much of this spam is successfully delivered,
a Our cursory investigations suggest that commissions on pharmaceutical
affiliate programs tend to hover around 40%–50%, while the retail cost for
spam delivery has been estimated at under $80 per million. 14
b The best known among these anecdotal figures comes from the Wall Street
Journal’s 2003 investigation of Howard Carmack (a.k.a. the “Buffalo Spammer”), revealing that he obtained a 0.00036 conversion rate on 10 million
messages marketing an herbal stimulant. 3
c We conducted our study under the ethical criteria of ensuring neutral
actions so that users should never be worse off due to our activities, while
strictly reducing harm for those situations in which user property was at risk.
A previous version of this paper appeared in Proceedings
of the 15th ACM Conference on Computer and Communications Security, Oct. 2008.