Here, we describe two key developments that characterize the opportunities and challenges we face.
Datafication refers to the development whereby our everyday activities are
traced digitally at unprecedented scale
and accuracy for commercialization
and exploitation in a data economy.
Datafication raises questions about
how this situation can or should be
managed and what might result from
its pervasiveness. The processes of
datafication, their consequences and
how we live with these are both social
and technical. From the beginning,
the question of what data is created
depends both on human activities and
technical devices.
How this data is used depends on
configurations of ownership, markets,
state authority, and citizens’ rights
as well as the technical affordances
for circulation through technical infrastructures and the computational
possibilities for analysis. To even describe the processes of datafication
demands expertise of the highest level
from computer science, law, political
science, sociology, and more. Considering if and how society might respond
to this new landscape demands the same. What
are the opportunities to flip data ownership from the big tech companies to
the individuals whose data fuels the
data economy? Engineering solutions,
as developed in the SoLiD project,e may
be part of the response, but how can we
be sure that people even want these solutions, let alone
will have the capacity to use them? What new challenges might
these solutions pose? How would this
affect the underlying business
model for the Web?
The digital divide. Web access continues to rise rapidly, but over three billion
people worldwide have no access, and 1 in 8 of the European population
does not use the Web regularly.f We should avoid normative claims that the
Web is ‘good’ for everyone; we now know that this is not the case. Yet at the
same time, this should be a matter of choice, not constraint. Further, beyond
the question of access alone, we see an increasing divide between highly
skilled users, who are able to derive the greatest benefit, and less skilled
users, who are less knowledgeable about privacy risks, less able to protect
their security, and may derive less economic benefit from the opportunities
available online.17 So long as people are unaware of the technical mechanisms
and social uses of datafication, or of its potential effects on their lives
and life chances, they will not be able to make effective choices about how
to use the Web or join the public debate about the future of the Web. Web
Science calls for new approaches to digital literacy, beyond the use of Web
tools and beyond the extension of coding skills to schools (important as both
of these are), to build understanding of the Web as a sociotechnical system and
drive toward greater empowerment of Web citizens. It engages, for example,
through the Web We Want campaign, #fortheweb, and educational interventions.11
e https://solid.mit.edu/
f https://www.statista.com/topics/3853/internet-usage-in-europe/ref.
Both these examples are linked to
wider practical, political, and philosophical questions. What are the
checks and balances with regard to
openness and privacy? What forms of
transparency and accountability are
appropriate and achievable, to balance
individual privacy, fairness across social groups and a viable business model for the future of the Web? How do we
engage the public in meaningful dialogue and decision making about the
future of the Web?
Next, we investigate in more detail another
prominent sociotechnical challenge,
one that today is most often
characterized as a technical challenge
alone, whereas it is deeply entrenched
in the way that we, as individuals or
as a society, interact with each other and
with the artifacts we create.
Web and Artificial Intelligence
The Web and its infrastructure have become interwoven not only with
documents, but also with data, services, things, and artificial intelligences.
Initially, the Web was a field of application for artificial intelligence.
Knowledge-based systems and machine learning were used to provide
intelligent access to information on the Web, to enhance search, to facilitate
browsing, or to negotiate in electronic markets. In hindsight, this may be
considered to have been a very useful,
Yet from the end of its first decade,
there was a vision to build a Web that
was intelligent in itself and that included
agents that would assist its users.6 As
this objective was beyond reach then,
the Semantic Web community increasingly
refocused on what became a proverb: that data
with a little semantics goes a long way. When researchers
started to properly understand and use
the social motivation of Web developers
and Web content managers, some
European researchers developed what
have now become the two most popular
Semantic Web applications, Wikidata27
and Schema.org.g At the same time,
Web Science was coined as a field that
would address the systematic understanding
of these sociotechnical interactions
between the Web and humans.7
At the end of the second decade of
the Web, artificial intelligence took
several major turns. Big data, which
frequently came from the Web directly
or from crowdsourcing on the Web, became the foundation for human-like
performance on some tasks, such as
image annotation.19 At the same time,
chatbots and virtual assistants have
been developed and are now widely
found on our PCs and smartphones and
in our homes. The latest developments
let these virtual assistants acquire
their knowledge from the Web, from
archived dialogues,12 or from live interaction.
Microsoft researchers pushed the edge and put their AI chatbot
“Tay” online to interact with and learn
from human encounters. Humans
quickly taught it to go «from “humans
are super cool” to full nazi in <24hrs».h
While there was wide discussion that
the technology was inadequate, there
seems to have been little understanding that it was the social context and
the social processes that determined
the fate of Tay. While in the initial Semantic Web the lack of such understanding led to simple, but not very
problematic, non-adoption, in the case
of Tay, an active agent, the lack of
g Schema.org was an agreement of several search companies modeled after the preceding Yahoo! SearchMonkey system.21
h https://www.theverge.com/2016/3/24/11297050/tay-microsoft-chatbot-racist