that I expect it will play a much more prominent role in 2016.”
In Wolfers’ opinion, a year in advance of the election, the fundamentals approach works well while polls do not, because people have not started thinking about the election yet. Polls do a good job three months before the election, he says, but prediction markets do the best job regardless of when they are employed.
None of these methods involve what is commonly known as “big data,” says Patrick Hummel, currently a research scientist at Google, who developed a model for forecasting elections with David Rothschild, an economist at Microsoft Research and a Fellow of the Applied Statistics Center at Columbia University, during their time at Yahoo! Research.
Hummel describes the way they utilized data in his 2012 presidential prediction as simple linear regression: first gathering historical data from earlier elections, such as economic indicators, presidential approval ratings, which party was in power and for how long, and biographical information about the candidates. Then, he and Rothschild compared how various pieces of data that were available nine months before the 2012 election correlated with the results of those earlier elections.
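To make that concrete, the following is a minimal sketch in Python of a fundamentals-based regression of the kind Hummel describes. The features and figures are hypothetical placeholders, not Hummel and Rothschild's actual model or data; only the technique, ordinary least squares over historical indicators, follows the description above.

# A minimal sketch of fundamentals-based forecasting via ordinary least
# squares. Features and figures below are hypothetical placeholders.
import numpy as np

# Each row is one past election: [GDP growth %, presidential approval %,
# consecutive terms the incumbent party has held office].
X = np.array([
    [2.1, 49.0, 1],
    [3.5, 55.0, 2],
    [0.8, 41.0, 2],
    [2.9, 52.0, 1],
])
# Incumbent party's share of the two-party vote in each of those elections.
y = np.array([51.2, 53.8, 46.5, 52.0])

# Fit coefficients (with an intercept) by least squares.
A = np.column_stack([np.ones(len(X)), X])
coef, *_ = np.linalg.lstsq(A, y, rcond=None)

# Predict the vote share implied by this year's fundamentals.
this_year = np.array([1.0, 2.4, 48.0, 1])  # leading 1.0 is the intercept
print(f"predicted incumbent vote share: {this_year @ coef:.1f}%")

In practice such a model would be fit on many more elections and validated out of sample; this sketch only illustrates the mechanics.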
In February 2012, they predicted President Obama would win the Electoral College with 303 electoral votes to Romney’s 235, forecasting every state correctly except for Florida, where they predicted Obama would lose (in fact, Obama won Florida with
just 50.01% of the vote).
Hummel and Rothschild also accurately predicted the vote shares that President Obama would receive in all 50 states and, after the election, determined their median error in that prediction was 1.6 points.
“We are aware of 23 different polling organizations that made predictions of statewide vote shares in various states the day before the election,” Hummel says, “and of those 23, there was only one that ended up with an average error that was less than 1.6 points.”
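The accuracy metric quoted here is simple to illustrate: the median error is the median of the absolute differences between predicted and actual statewide vote shares. The vote-share numbers in the short Python sketch below are invented for illustration only.

# Toy illustration of the median-error metric; the figures are invented.
import numpy as np

predicted = np.array([51.0, 47.5, 53.2, 49.8, 55.0])
actual    = np.array([52.1, 46.0, 53.9, 51.5, 54.2])

median_error = np.median(np.abs(predicted - actual))
print(f"median absolute error: {median_error:.2f} points")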
Hummel and Rothschild’s dataset included hundreds of historical elections—the outcomes in 50 states for every year for the last several decades—that totaled approximately 100,000 unique pieces of data.
“I wouldn’t classify that as big data,” Hummel says, “which can involve as many as tens of billions of data points in one analysis. Our particular analysis, which could be done with pencil and paper, doesn’t come anywhere close to that.”
While “big data” might not have been appropriate for predicting the presidency in 2012, it was what was needed for making complex, combination predictions, says David Pennock, principal researcher and assistant managing director at Microsoft Research New York City.
At Press Time
Authors Accept ACM’s OA Options
ACM officially rolled out its anticipated publication policy changes aimed at expanding access to its magazines, journals, and conference proceedings on April 2. Reaction to these changes was evident within days of their debut—at press time, the first authors of recent manuscript submissions had chosen one of the new options to manage the publication rights of their work.
Authors hold exclusive ownership of their patents and trademarks, as well as reuse rights for any portion of their own work without fee or permission from ACM. Major revisions created by an author continue to be owned by the author, and each author holds self-archiving rights for accepted versions of his/her own work in personal bibliographies and institutional repositories.
For more information, see
http://authors.acm.org as well
as the back cover of this issue.