contributions (such as reviews) or does so only in a loose fashion (such as ratings), undoing is relatively easy. If the system combines contributions tightly but keeps them localized, then we can still undo with relatively simple logging. For example, user edits in Wikipedia can be combined extensively within a single page, but kept localized to that page (not propagated to other pages). Consequently, we can undo with page-level logging, as Wikipedia does. However, if the contributions are pushed deep into the system, then undoing can be very difficult. For example, suppose an inference rule R is contributed to a KB on Day 1. We then use R to infer many facts, apply other rules to these facts and other facts in the KB to infer more facts, let users edit the facts extensively, and so on. Then on Day 3, should R be found incorrect, it would be very difficult to remove R without reverting the KB to its state on Day 1, thereby losing all good contributions made between Day 1 and Day 3.
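One way to avoid the Day-1 revert is to track provenance: record, for each inferred fact, which rule and premises produced it, so that retracting a rule removes only the facts that depend on it. The article does not prescribe an implementation; the following is a minimal illustrative sketch of the idea, with all names hypothetical.

```python
# Illustrative sketch (not from the article): a toy KB that records, for
# each fact, the derivations that support it. Retracting rule R then
# removes only facts whose every derivation depends on R, transitively,
# preserving independent contributions made after Day 1.

class KB:
    def __init__(self):
        # fact -> set of derivations; a derivation is (rule, premises),
        # with ("asserted", ()) marking a directly contributed fact.
        self.facts = {}

    def assert_fact(self, fact):
        self.facts.setdefault(fact, set()).add(("asserted", ()))

    def infer(self, rule, premises, conclusion):
        # Record that `conclusion` was derived by `rule` from `premises`.
        self.facts.setdefault(conclusion, set()).add((rule, tuple(premises)))

    def retract_rule(self, rule):
        # Repeatedly drop derivations that use `rule` or a now-removed
        # premise; a fact survives if an independent derivation remains.
        changed = True
        while changed:
            changed = False
            for fact in list(self.facts):
                keep = {
                    (r, prem) for (r, prem) in self.facts[fact]
                    if r != rule and all(p in self.facts for p in prem)
                }
                if keep != self.facts[fact]:
                    self.facts[fact] = keep
                    changed = True
                if not self.facts[fact]:
                    del self.facts[fact]
                    changed = True

kb = KB()
kb.assert_fact("a")              # Day-1 contribution
kb.infer("R", ["a"], "b")        # b derived only via rule R
kb.infer("S", ["b"], "c")        # c depends transitively on R
kb.infer("S", ["a"], "d")        # d is independent of R
kb.retract_rule("R")             # Day 3: R found incorrect
print(sorted(kb.facts))          # ['a', 'd']
```

Real KB systems must also handle user edits layered on derived facts, which is precisely why deep, unlogged propagation makes undo so hard in practice.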
At the other end of the user spectrum, many CS systems also identify and leverage influential users, using both manual and automatic techniques. For example, productive users in Wikipedia can be recommended by other users, promoted, and given more responsibilities. As another example, certain users of social networks highly influence the buy/sell decisions of other users. Consequently, some work has examined how to automatically identify these users and leverage them in viral marketing within a user community.
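The article does not fix a particular identification method; one common automatic approach is to score users by a PageRank-style random walk over the community's "who influences whom" graph. The sketch below is illustrative only, with a made-up toy network.

```python
# Illustrative sketch (assumption, not the article's method): rank users
# by a PageRank-style walk where an edge (u, v) means "u is influenced
# by v", so influence flows from followers to the users they follow.

def influence_scores(edges, damping=0.85, iters=50):
    nodes = sorted({n for e in edges for n in e})
    outlinks = {n: [] for n in nodes}
    for u, v in edges:
        outlinks[u].append(v)
    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iters):
        new = {n: (1 - damping) / len(nodes) for n in nodes}
        for u in nodes:
            if outlinks[u]:
                share = damping * rank[u] / len(outlinks[u])
                for v in outlinks[u]:
                    new[v] += share
            else:
                # Dangling user: spread their rank uniformly.
                for v in nodes:
                    new[v] += damping * rank[u] / len(nodes)
        rank = new
    return rank

# Toy network: alice, bob, and dave all follow carol's recommendations.
edges = [("alice", "carol"), ("bob", "carol"), ("dave", "carol"),
         ("carol", "alice")]
scores = influence_scores(edges)
print(max(scores, key=scores.get))   # carol
```

A viral-marketing campaign would then seed promotions with the top-scoring users, on the premise that their adoption cascades through the community.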
We have discussed CS systems on the World-Wide Web. Our discussion shows that crowdsourcing can be applied to a wide variety of problems, and that it raises numerous interesting technical and social challenges. Given the success of current CS systems, we expect that this emerging field will grow rapidly. In the near future, we foresee three major directions: more generic platforms, more applications and structure, and more users and complex contributions.
First, the various systems built in the past decade have clearly demonstrated the value of crowdsourcing. The race is now on to move beyond building individual systems, toward building general CS platforms that can be used to develop such systems quickly.
AnHai Doan (firstname.lastname@example.org) is an associate professor of computer science at the University of Wisconsin-Madison and Chief Scientist at Kosmix Corp.

Raghu Ramakrishnan (email@example.com) is Chief Scientist for Search & Cloud Computing and a Fellow at Yahoo! Research, Silicon Valley, CA, where he heads the Community Systems group.

Alon Y. Halevy (firstname.lastname@example.org) heads the Structured Data Group at Google Research, Mountain View, CA.

© 2011 ACM 0001-0782/11/04 $10.00