Figure 1: The fluctuations in volume of Twitter messages sent during U.S. President Barack Obama’s inauguration show moments in time when users paid more
attention to the ceremony (drop in volume) than to their computer use.
topic boundaries with about 92 percent accuracy [2]. All this is done without looking at any text.
The question then became, “Can
this method of ‘naïve tweet counting’
scale?”
Twitter’s Explosion
Twitter has grown at an incredibly fast
rate. In January of 2009, CNN Breaking
News had around 86,000 followers. Nine
months later, its follower count had exceeded 2.7 million. Twitter began offering data-mining push feeds that hand
back 600 tweets a minute. Track feeds
push all the tweets related to a search,
and Twitter’s “Firehose” can deliver everything if you can get your hands on it.
I previously sampled 3,000 tweets from
a 90-minute debate. I saw 52,000 tweets
from a 90-minute sample of President
Obama’s inauguration speech.
Addressing scale, I revisited the human-centered observation I originally
questioned at Dolores Park. If we consider people tweeting while watching
a live event, either in person or on television, conversation is represented
as talking to someone else: in effect, directing a tweet with the @ symbol
(using “@” before a username like @barackobama pushes a highlighted tweet
into that person’s feed). The highlighted tweet becomes more visible; it calls
attention to the user, like tapping your friend on the shoulder in a crowd.
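One way to make this idea concrete is to measure, minute by minute, what share of tweets are directed at someone with an @ mention. The sketch below is only an illustration under stated assumptions: the function name `mention_share_per_minute`, the `(seconds_from_start, text)` input format, and the mention pattern are hypothetical choices made here, not the procedure used in the study.

```python
import re
from collections import defaultdict

# Assumed pattern: "@" followed by word characters, not preceded by a word
# character (so e-mail addresses such as me@example.com are not counted).
MENTION = re.compile(r"(?<!\w)@\w+")


def mention_share_per_minute(tweets):
    """Return {minute: fraction of tweets in that minute containing an @ mention}.

    `tweets` is an iterable of (seconds_from_start, text) pairs; this input
    format is an assumption made for the example.
    """
    totals = defaultdict(int)
    directed = defaultdict(int)
    for seconds, text in tweets:
        minute = int(seconds // 60)
        totals[minute] += 1
        if MENTION.search(text):
            directed[minute] += 1
    return {minute: directed[minute] / totals[minute] for minute in sorted(totals)}


if __name__ == "__main__":
    # Made-up tweets, for illustration only.
    sample = [
        (5, "@barackobama congrats"),
        (30, "watching the speech"),
        (90, "hey @friend look at this"),
        (100, "so cold out here"),
    ]
    print(mention_share_per_minute(sample))
```

A rising share of directed tweets would suggest people are turning to talk to each other, while a falling share would suggest they are addressing the crowd at large or simply watching.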
utes into 5-minute chunks, then find a
term that is highly salient in one slice but not in the other chunks. Looking back
at the inauguration, this produces topic segments like: booing, Aretha, Yo-Yo
Ma, anthem, and so on. Couple these
words with our importance proxy, and
we find the illuminating moments
of Aretha Franklin’s performance,
Obama’s speech, and the national anthem to have greater importance than
the other moments.
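The chunk-and-score step can be sketched in a few lines. This is a minimal illustration, assuming tweets arrive as `(seconds_from_start, text)` pairs and using a plain term-frequency times inverse-chunk-frequency score; the function names, the tokenizer, and the sample tweets are hypothetical, and the original analysis (done in R and MATLAB) may have weighted terms differently.

```python
import math
import re
from collections import Counter, defaultdict


def chunk_terms(tweets, chunk_seconds=300):
    """Group (seconds_from_start, text) pairs into per-chunk term counts."""
    chunks = defaultdict(Counter)
    for seconds, text in tweets:
        index = int(seconds // chunk_seconds)
        # Crude tokenizer (an assumption): lowercase words of two or more characters.
        chunks[index].update(re.findall(r"[a-z][a-z'-]+", text.lower()))
    return chunks


def salient_terms(tweets, chunk_seconds=300, top_n=1):
    """Return {chunk_index: [top terms]}, treating each chunk as one document."""
    chunks = chunk_terms(tweets, chunk_seconds)
    n_chunks = len(chunks)

    # In how many chunks does each term appear at least once?
    chunk_freq = Counter()
    for counts in chunks.values():
        chunk_freq.update(counts.keys())

    result = {}
    for index in sorted(chunks):
        counts = chunks[index]
        total = sum(counts.values())
        scores = {
            term: (count / total) * math.log(n_chunks / chunk_freq[term])
            for term, count in counts.items()
        }
        # Highest score first; ties are broken alphabetically for determinism.
        ranked = sorted(scores, key=lambda term: (-scores[term], term))
        result[index] = ranked[:top_n]
    return result


if __name__ == "__main__":
    # Made-up tweets, for illustration only.
    sample = [
        (10, "Aretha sings at the inauguration"),
        (40, "Aretha wow"),
        (200, "love Aretha at the inauguration"),
        (310, "anthem time at the inauguration"),
        (350, "the anthem"),
        (400, "singing the anthem"),
    ]
    print(salient_terms(sample))  # {0: ['aretha'], 1: ['anthem']}
```

Terms that appear in every chunk, such as "the" or "inauguration" in the sample, score zero because their inverse chunk frequency is log(1), which is what lets the slice-specific words surface.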
The signal of human activity here is
clean and simple. Our analysis was done
on a MacBook Pro using R and MATLAB,
without the need for large-scale assistance from a Hadoop cloud or Mechanical Turk.
Tweets can be used in social multimedia research, when facilitated
through human-centered research, to
identify the shape of the related event.
More than just text alone, the form of
the conversational shadow has to account for the structure of the tweets themselves. Hashtags, mentions, and other communication mechanisms, both
structured and ad hoc, follow but likely
do not mirror the visual content of the
event itself. This link between two disparate data streams (online conversations and live events) provides rich opportunities for further investigation.
As we explore new approaches to
better navigate, communicate, visualize, and consume events, our tools and
research should be firmly based in the
conversation structure, be it on Twitter,
Facebook, or Flickr, and be motivated
by our interactions in everyday life, lest
we miss the game in the park.
Biography
David Ayman Shamma is a research scientist in the Internet
Experiences group at Yahoo! Research. He researches
synchronous environments and connected experiences
both online and in the world. He designs and prototypes
systems for multimedia-mediated communication, and
develops targeted methods and metrics for understanding
how people communicate online in small environments and
at web scale. Ayman is the creator and lead investigator on
the Yahoo! Zync project.
A Table of Contents
Finally, when we turn to examining the
text itself, a table of contents emerges.
For this pick your favorite off-the-shelf
information retrieval tool; mine is TF/
IDF. What we want is a very specific table of contents, so we split the 90-min-
References
1. Cesar, P., Geerts, D., and Chorianopoulos, K. Social
Interactive Television: Immersive Shared Experiences
and Perspectives. Information Science Reference, 2009.
2. Shamma, D. A., Kennedy, L., and Churchill, E. F. Tweet
the debates: Understanding community annotation of
uncollected sources. In WSM ’09: Proceedings of the
international workshop on Social Media
(Beijing, China, 2009), ACM.