Following this approach, we could
find the new HITs being posted over
time, the completion rate of each HIT,
and the time that they disappear from
the market because they have either
been completed or expired, or because
a requester canceled and removed
the remaining HITs from the market.
(Identifying expired HITs is easy, as
we know the expiration time of a HIT.
Identifying canceled HITs is a little
trickier. We need to monitor the usual
completion rate of a HIT over time and
check whether, at the time of disappearance, the remaining HITs could plausibly
have been completed within the time
since the last crawl.)
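The heuristic in the parenthetical above can be sketched in code. This is only an illustrative sketch and not the crawler's actual logic: the `Snapshot` record, its field names, and the `slack` tolerance are all assumptions introduced here. It estimates the usual completion rate from past crawls and asks whether the HITs still listed at the last crawl could plausibly have been finished before the group disappeared.

```python
from dataclasses import dataclass


@dataclass
class Snapshot:
    """One crawl observation of a HIT group (hypothetical record layout)."""
    time_hours: float     # crawl time, in hours since the first crawl
    hits_available: int   # HITs still listed at that crawl


def completion_rate(history):
    """Average number of HITs completed per hour over the observed history."""
    first, last = history[0], history[-1]
    elapsed = last.time_hours - first.time_hours
    if elapsed <= 0:
        return 0.0
    return max(first.hits_available - last.hits_available, 0) / elapsed


def classify_disappearance(history, gone_time_hours, expiration_hours, slack=2.0):
    """Guess why a HIT group vanished after its last observed crawl."""
    last = history[-1]
    # Expiration is the easy case: the expiration time is known.
    if gone_time_hours >= expiration_hours:
        return "expired"
    # HITs that workers could plausibly have finished since the last crawl,
    # allowing `slack` times the usual rate (an assumed tolerance).
    gap = gone_time_hours - last.time_hours
    plausible = completion_rate(history) * gap * slack
    if last.hits_available <= plausible:
        return "completed"
    return "canceled"
```

For example, a group that dropped from 100 to 50 HITs over ten hours (about five per hour) and then vanished an hour later with 50 HITs still listed would be labeled as canceled under these assumptions, since roughly ten completions is the most the usual rate would explain.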
A shortcoming of this approach is that
it cannot measure the redundancy of the
posted HITs. So, if a single HIT needs to
be completed by multiple workers, we
can only observe it as a single HIT.
The data are also publicly available through the website
http://www.mturk-tracker.com [1].
From January 2009 through April
2010, we collected 165,368 HIT groups,
with 6,701,406 HITs total, from 9,436
requesters. The total value of the posted HITs was $529,259. These numbers,
of course, do not account for the redundancy of the posted HITs, or for HITs
that were posted and disappeared between our crawls. Nevertheless, they
should be good approximations (within
an order of magnitude) of the activity of the marketplace.
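A few per-unit averages follow directly from the totals quoted above. The short calculation below is only a back-of-the-envelope sketch: the variable names are ours, and the averages inherit the caveats already stated (redundancy is not observed, and HITs that appeared and disappeared between crawls are missed).

```python
# Totals reported for January 2009 through April 2010.
hit_groups = 165_368
hits = 6_701_406
requesters = 9_436
total_value_usd = 529_259

# Derived averages; rough indications only, given the caveats above.
avg_reward_per_hit = total_value_usd / hits
avg_hits_per_group = hits / hit_groups
avg_groups_per_requester = hit_groups / requesters

print(f"average reward per HIT:       ${avg_reward_per_hit:.3f}")
print(f"average HITs per group:       {avg_hits_per_group:.1f}")
print(f"average groups per requester: {avg_groups_per_requester:.1f}")
```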
Table 1: Top Requesters based on the total posted rewards available to a single
worker (January 2009–April 2010).
TOP REQUESTERS AND
FREQUENTLY POSTED TASKS
One way to understand what types of
tasks are being completed in the marketplace is to find the “top” requesters
and analyze the HITs that they post.
Table 1 shows the top requesters, based
on the total rewards of the HITs posted,
filtering out requesters that were active
only for a short period of time.
We can see that very few active
requesters post a significant number
of tasks in the marketplace, and that
they account for a large fraction of the
posted rewards. According to our measurements, the top requesters listed in
Table 1 (0.1 percent of the total requesters in our dataset) account
for more than 30 percent of the overall
activity of the market.
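A concentration figure of this kind can be computed from per-requester reward totals. The sketch below is illustrative only: the function name and the toy data are hypothetical and are not drawn from the dataset.

```python
def top_share(rewards_by_requester, top_fraction):
    """Share of total posted rewards that comes from the top requesters.

    rewards_by_requester: mapping of requester id -> total posted rewards.
    top_fraction: fraction of requesters to treat as "top" (e.g. 0.001).
    """
    values = sorted(rewards_by_requester.values(), reverse=True)
    # Always include at least one requester in the "top" set.
    k = max(1, round(len(values) * top_fraction))
    total = sum(values)
    return sum(values[:k]) / total if total else 0.0


# Toy example with made-up numbers: one requester dominates.
toy = {"req_a": 70, "req_b": 10, "req_c": 10, "req_d": 10}
share = top_share(toy, 0.25)
```

With the toy data, the single top requester (25 percent of four requesters) accounts for a 0.7 share of the rewards.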
Given the high concentration of the
market, the types of tasks posted by
these requesters indicate the types of tasks
that are being completed in the marketplace.
Castingwords is the major
requester, posting transcription tasks
frequently. Two other
semi-anonymous requesters also post
transcription tasks.
participants that has significantly
lower activity than the top contributors. Figure 1 shows how this activity
is distributed, according to the value of
the HITs posted by each requester. The
x-axis shows the log2 of the value of
the posted HITs and the y-axis shows
what percentage of requesters has this
level of activity. As we can see, the dis-
Figure 1: Number of requesters vs. total