the growth and behaviors of the future
Web, as well as to engineer systems
with desired properties in a way that is
significantly less hit or miss.
From Power Laws to People
Mathematically based analysis of the
Web involves another potential failing.
Whereas the structure and use of various Web sites (taken mathematically)
may have interesting properties, these
properties may not be very useful in explaining the behavior of the sites over
time. Consider the following example:
Wikipedia (www.wikipedia.org), the online wiki-based encyclopedia,
includes more than two million articles in English and more than six
million in all languages combined. They are hyperlinked, and it is
logical to ask whether the hyperlinks have structure similar to those
on the Web in general or whether, since this is a managed corpus, they
have yet other properties. This question can be answered in a number of
ways; Figure 3 shows the result of one of them. In this case, DBpedia
(dbpedia.org), which is a dump of the link structure of Wikipedia using
the labeled links of the Resource Description Framework, or RDF, has
been analyzed with respect to the use of the link labels; that is, we
are looking at the structure of Wikipedia as opposed to the linguistic
content of its pages. The figure shows the same kind of Zipf-like
distribution found in the original Web graph analyses. There is also
some evidence [16] and a lot of speculation [29] that similar effects
can be seen in the use of tags in Web-based tagging systems. Current
research is also exploring whether these results depart from such
models as preferential attachment [3] used to explain the scale-free
features of Web graphs.
Figure 3: Results of an analysis of the link structure of Wikipedia with
respect to the use of link labels, not the linguistic content of pages.
(Log-log plot titled "Predicate occurrence distribution": P(k) on the
vertical axis, from 1e-05 to 1, against k on the horizontal axis, from
1 to 1e+07.)
Unfortunately, whatever explains these effects, another aspect of
Wikipedia’s use is not explained by these
models and does not necessarily follow
from these properties. Wikipedia is
built on top of the MediaWiki software
package (www.mediawiki.org/wiki/MediaWiki), which is freely available and
used in many other Web applications
besides Wikipedia. While some of
them have also been successful, many
have failed to generate significant use.
A purely “technological” explanation
cannot account for this; rather, something about the organizational structures of Wikipedia and the needs of its
users accounts for its success over other
systems built from the same code base.
The model by which articles are created, edited, and tracked is provided by
the underlying technology. The social
model enabled by humans interacting
in ways allowed by that technology is
more difficult to explain. The dynamics of any “social machine” are highly
complex, and dozens of academic papers, from multiple disciplines, have
been written about Wikipedia;
en.wikipedia.org/wiki/Wikipedia:Wikipedia_in_academic_studies uses Wikipedia itself to
maintain an up-to-date reference list.
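As an aside on the Figure 3 analysis discussed earlier, the check for a Zipf-like distribution of link-label use can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the method used to produce the figure: the function names, the synthetic labels, and the least-squares fit on log-log axes are all choices made here. The synthetic counts are constructed so that the number of labels used k times falls off as k to the power -2, so the fitted slope should come out close to -2.

```python
import math
from collections import Counter


def occurrence_distribution(labels):
    """Map k -> P(k), the fraction of distinct labels used exactly k times."""
    uses_per_label = Counter(labels)                 # label -> number of uses
    labels_per_k = Counter(uses_per_label.values())  # k -> labels used k times
    total = len(uses_per_label)
    return {k: n / total for k, n in labels_per_k.items()}


def loglog_slope(dist):
    """Least-squares slope of log10 P(k) against log10 k.

    Needs at least two distinct k values; this is a rough estimate only.
    """
    xs = [math.log10(k) for k in dist]
    ys = [math.log10(p) for p in dist.values()]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def synthetic_labels():
    """Hypothetical data: 64 labels used once, 16 twice, 4 four times, 1 eight times."""
    labels = []
    for k, how_many in [(1, 64), (2, 16), (4, 4), (8, 1)]:
        for i in range(how_many):
            labels.extend([f"label_{k}_{i}"] * k)
    return labels


if __name__ == "__main__":
    dist = occurrence_distribution(synthetic_labels())
    print(round(loglog_slope(dist), 3))
```

With real data, `labels` would be the predicate of every labeled link in the dump, one entry per link. A straight-line fit on log-log axes is a quick visual check rather than a rigorous test; maximum-likelihood fitting is generally preferred when the exponent itself matters.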
The idea of a social machine was
introduced in Weaving the Web [8], which
hypothesized that the architectural
design of the Web would allow developers, and thus end users, to use computer technology to help provide the
management function for social systems as they were realized online. The
social machine includes the underlying
technology (MediaWiki in the case of
Wikipedia) but also the rules, policies,
and organizational structures used
to manage the technology. Examples
abound on the Web today. Consider
the coupling of the application design
of blogging-support systems (such as
LiveJournal and WordPress) with the
social mechanisms provided by blogrolls, permalinks, and trackbacks that
have led to the so-called blogosphere.
Similarly, the protocols used by social
networking sites like MySpace and Facebook have much in common, but the
success or failure of the sites hinges
on the rules, policies, and user communities they support. Given that the
success or failure of Web technologies
often seems to rely on these social features, the ability to engineer successful
applications requires a better understanding of the features and functions
of the social aspects of the systems. [b]
Today’s interactive applications are
very early social machines, limited by
the fact that they are largely isolated
from one another. We hypothesize that
(i) there are forms of social machine
that will someday be significantly more
effective than those we have today; (ii)
different social processes interlink
in society and therefore must be interlinked on the Web; and (iii) they
are unlikely to be developed through a
single deliberate effort in a single proj-
b. When we say “success” or “failure,” we are referring not to the business factors that determine whether, for example, Facebook or MySpace will attract more users but to the success
or failure of the sites to provide the particular
types of social interaction for which they are
designed.