Bias in User Interaction
One significant source of bias, on the Web and elsewhere, is user interaction, which stems from two notable sources: the user interface and the user’s own self-selected, biased interaction. The first is “presentation bias”: everything the user sees can get clicks, while everything else gets no clicks.
This is particularly relevant in recommendation systems. Consider a video-streaming service in which users have
hundreds of recommendations they
can browse, though the number is
abysmally small compared to the millions that could potentially be offered.
This bias directly affects new items or
items that have never been seen by users, as there is no usage data for them.
The most common solution is called “explore and exploit,” as in Agarwal et al.,2 who studied a classical example applied to the Web. The technique exposes part of the user traffic to new items randomly intermingled with the top recommendations (explore) and, when those items are chosen, uses the resulting click data to reveal their true relative value (exploit). The paradox of such a solution is that exploration carries an opportunity cost: while exploring, the system forgoes some of the benefit of exploiting what it already knows. In some cases there is even a loss of revenue (such as from digital ads). However, exploration is the only way to learn about and discover good (new) items.
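The explore-and-exploit idea described above can be sketched as a simple epsilon-greedy policy. The items and click-through rates below are hypothetical, and real systems use more sophisticated bandit algorithms; this is only a minimal illustration of the trade-off:

```python
import random

def choose_item(click_rates, new_items, epsilon=0.1):
    """Epsilon-greedy: mostly exploit the best-known item,
    but occasionally explore an item with no usage data yet."""
    if new_items and random.random() < epsilon:
        return random.choice(new_items)           # explore
    return max(click_rates, key=click_rates.get)  # exploit

# Hypothetical usage data: observed click-through rates per video.
click_rates = {"video_a": 0.12, "video_b": 0.07}
new_items = ["video_c", "video_d"]  # no clicks observed yet

random.seed(0)  # for reproducibility of this sketch
picks = [choose_item(click_rates, new_items) for _ in range(1000)]
# About 90% of requests exploit video_a; the rest explore the new items.
```

Each exploration request sacrifices an expected click, which is exactly the opportunity cost noted above.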
“Position bias” is the second bias. Consider that in Western cultures we read from top to bottom and left to right, so users tend to look first toward the top-left corner of the screen, and that region attracts more eyes and clicks. “Ranking bias” is an important instance of such bias. Consider a Web search engine where results are listed in order of relevance from top to bottom. The top-ranked result thus attracts more clicks than the others because it is both the most relevant and ranked in the first position.
To avoid ranking bias, Web developers need to de-bias the click distribution before using click data to improve and evaluate ranking algorithms.11,12 Otherwise, popular pages simply become even more popular.
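One common way to de-bias the click distribution is inverse-propensity weighting, where each click is weighted by the inverse of the estimated probability that its rank position was examined. The propensity values and click log below are invented for illustration; in practice, propensities are typically estimated from randomized interleaving experiments:

```python
# Assumed examination propensities per rank position (invented values;
# real systems estimate these, e.g., from randomized swap experiments).
propensities = {1: 1.0, 2: 0.6, 3: 0.4, 4: 0.3, 5: 0.2}

# Click log: (query, clicked document, rank at which it was shown).
click_log = [
    ("q1", "doc_a", 1),
    ("q1", "doc_b", 3),
    ("q2", "doc_b", 5),
]

def debiased_relevance(log):
    """Weight each click by 1 / P(position examined), so clicks earned
    at low ranks count for more than clicks handed out by position."""
    scores = {}
    for _query, doc, rank in log:
        scores[doc] = scores.get(doc, 0.0) + 1.0 / propensities[rank]
    return scores

scores = debiased_relevance(click_log)
# doc_b's two low-rank clicks now outweigh doc_a's single top-rank click.
```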
Other biases in user interaction include those related to user-interaction design; for example, on any webpage where a user needs to scroll to see additional content, the content below the fold attracts fewer eyes and clicks, reflecting a bias much like position bias.
Bias also enters through the data such systems learn from. Research has found that approximately 70% of influential journalists in the U.S. were men, even though at U.S. journalism schools the gender proportions are reversed. Algorithms learning from news articles are thus learning from texts with demonstrable and systemic gender bias. Yet other research has identified the presence of further cultural and cognitive biases.10,22
On the other hand, some Web developers have been able to limit bias. The gender-bias issue, for example, can be addressed by automatically identifying the gender subspace of the learned representation and factoring it out.9
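The subspace idea can be illustrated with a toy version of this de-biasing step: remove from each word vector its component along an identified gender direction. The vectors and the direction below are made-up values, not a real embedding:

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def remove_subspace(vec, direction):
    """Project out a unit-length bias direction from a word vector."""
    proj = dot(vec, direction)
    return [a - proj * d for a, d in zip(vec, direction)]

# Toy values: a 4-dimensional "embedding" and an assumed unit gender
# direction (in real systems derived from pairs such as he - she).
gender_direction = [1.0, 0.0, 0.0, 0.0]
nurse = [0.4, 0.2, 0.9, 0.1]

debiased = remove_subspace(nurse, gender_direction)
# The component of the vector along the gender direction is now zero;
# the remaining (semantic) components are untouched.
```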
Regarding geographical bias in news recommendations, large cities and centers of political power naturally generate more news. If standard recommendation algorithms are used, most people end up reading news about a capital city rather than about the place where they live. By considering diversity and user location, Web designers can create websites that give a less centralized view and also surface local news.15
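One simple way to realize such a less centralized view is to reserve a few recommendation slots for stories local to the user before filling the rest by global score. The stories, cities, and scores below are hypothetical:

```python
def rerank_with_local(stories, user_city, local_slots=2, total=5):
    """Rank by global score, but reserve the first few slots for
    stories local to the user's city."""
    ranked = sorted(stories, key=lambda s: s["score"], reverse=True)
    local = [s for s in ranked if s["city"] == user_city][:local_slots]
    rest = [s for s in ranked if s not in local]
    return (local + rest)[:total]

# Hypothetical stories: capital-city news dominates the global scores.
stories = [
    {"id": "n1", "city": "Capital", "score": 0.95},
    {"id": "n2", "city": "Capital", "score": 0.90},
    {"id": "n3", "city": "Capital", "score": 0.85},
    {"id": "n6", "city": "Capital", "score": 0.80},
    {"id": "n4", "city": "Smalltown", "score": 0.40},
    {"id": "n5", "city": "Smalltown", "score": 0.30},
]

page = rerank_with_local(stories, user_city="Smalltown")
# Local stories n4 and n5 now appear despite their lower global scores.
```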
“Tag recommendations,” or recommending labels or tags for items, is an
extreme example of algorithmic bias.
Imagine a user interface where a user
uploads a photo and adds various tags,
and a tag recommendation algorithm
then suggests tags that people have
used in other photos based on collaborative filtering. The user chooses the
ones that seem correct, enlarging the
set of tags. This sounds simple, but a photo-hosting website should not include such functionality. The reason is that the algorithm needs data from people to improve, but as people use the recommended tags, they add fewer tags of their own, picking from among known tags while not contributing new ones. In essence, the algorithm is committing slow hara-kiri on itself. If we want a “folksonomy,” that is, tags that come only from people, websites should not themselves recommend tags. On the other hand, many websites accept this trade-off, using related tags to let users search for similar images.
Another critical class of algorithmic
bias in recommender systems is related to what items the system chooses to
show or not show on a particular webpage. Such bias affects user interaction, as explored next. There is ample
research literature on all sorts of algorithmic bias; see the online appendix
for more.