therefore grow in importance. One
secret of ML is that human design
decisions affect learning outcomes
throughout the entire pipeline—from
early conceptions of what use or
role the ML component should play
in a larger context to deciding what
the system should predict. Because
what the system should predict is in
turn strongly connected to the choice
of which training data to gather and
how to curate and label it, the UX
practitioner can bring these issues
to the fore at an early stage through
careful attention to the design of
well-suited teaching environments.
To emphasize the role of UX
practice in designing systems for users
to generate training data, we will
highlight some key ideas illustrated by
examples drawn from our work within
the domain of medical imaging. We
will describe four interactive systems
that have been created and used
within digital pathology—diagnosing
and reviewing gigapixel-size digital
microscopy images of tissue
samples such as biopsies and surgical
specimens.
In the design of these tools, we have
paid special attention to ensuring that
manual, unassisted workflows are
preserved and are compatible with
the assisting tools. These examples
together describe a typical two-step
process we have used when designing
new ML-based systems. First, when
no a priori data exists, we need to
bootstrap a large enough dataset
so that the algorithm used in the
first version of the system performs
sufficiently well. Second, we need to
ensure that the system can collect
additional training data once it is
deployed, by receiving user corrections.
This makes the system self-sufficient
with respect to training data, enabling
incremental improvement of the AI's
performance.
Put differently, we can view this as
partitioning the design space according
to AI performance and ensuring that
the teaching interaction is suited
to the machine’s current level of
“intelligence” (Figure 1). Our four
annotated examples of this process
are based on our own experience as
UX designers active in the medical-
imaging field. Two of the examples
are prototypes, and two are finished
products that we have either designed
ourselves or followed closely.
EFFICIENT
BOOTSTRAP TEACHING
An early step in the creation of an
ML-based system, when no prior
training data exists, is to create an
initial dataset. For pathology images,
this typically consists of drawing
outlines over tissue regions and
classifying them. Because it is a highly
specialized domain, this usually
means engaging pathologists. Since it
is important to make efficient use of
these individuals and their knowledge,
it seems sensible to align the design of
the teaching environment with their
experience.
Rapid interactive refinement. A
well-known semi-automatic approach
to assigning categories to visual regions
is an interactive segmentation tool.
The user of such tools typically uses a
paintbrush-style interaction to mark
areas (called seeds) and assign them to
given categories; once this is done,
areas similar to the marked ones are
automatically assigned to the
same category [1].
When we applied a human-centered design perspective to the
construction of such a tool, we gained
valuable insights. For our initial
prototype (Figure 2), the interaction
was experienced as a trading of
control between human and machine,
in which the human waits for the
machine response after drawing
an area. After a noticeable delay,
sometimes a few seconds, the results
are received and the human can make
a correction, wait again, and repeat
the process. Typically, the user would
be both intrigued and annoyed by
the automatic assignment of areas
that were not specifically drawn over,
sometimes resulting in long back-and-forth correction cycles without
noticeable progress.
In a revised version, we aimed for
rapid fine-grained interaction, in which
spreading would be constructed as
an incremental, collaborative effort
between user and system, rather than
being computed slowly but accurately
in every coarse-grained step (Figure
3). The tool was changed so that the
similarity threshold required for
spreading increased with the distance
from the original area. Additionally,
we added precomputations so that
the results of user input typically arrive
in less than 40 ms. Combined,
these changes let the user work faster
and more accurately, albeit with
more mouse strokes. The
more fine-grained interaction lets
the user gradually develop a feel
for the underlying algorithm and
its limitations by observing many
predictions over time.
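A minimal sketch of this distance-dependent spreading is a seeded region growing whose similarity tolerance tightens the farther a pixel lies from the user's stroke. The grid representation, the linear falloff model, and names such as spread_from_seed are illustrative assumptions on our part, not the tool's actual algorithm:

```python
from collections import deque


def spread_from_seed(image, seed, base_tol=0.2, falloff=0.02):
    """Assign pixels to the seed's category by region growing.

    The allowed intensity difference shrinks with distance from the
    seed, so the spread stays conservative far from the user's stroke.
    `image` is a 2D list of floats in [0, 1]; `seed` is (row, col).
    """
    rows, cols = len(image), len(image[0])
    sr, sc = seed
    seed_val = image[sr][sc]
    labeled = {seed}
    queue = deque([(sr, sc, 0)])  # (row, col, distance from seed)
    while queue:
        r, c, d = queue.popleft()
        # Tolerance tightens linearly with distance from the stroke.
        tol = max(base_tol - falloff * d, 0.0)
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if 0 <= nr < rows and 0 <= nc < cols and (nr, nc) not in labeled:
                if abs(image[nr][nc] - seed_val) <= tol:
                    labeled.add((nr, nc))
                    queue.append((nr, nc, d + 1))
    return labeled
```

Because each stroke spreads only a short, predictable distance, the user stays in control and can issue many quick strokes instead of waiting on one large, slow computation.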
Creating intrinsic incentive
for teaching. Another approach to
bootstrapping the initial training
dataset is to design a useful manual
tool that generates training data as a
side effect. This approach is somewhat
similar to ESP [2], a two-player
guessing game that creates labeled
images as a side effect of play.
[Figure 1 image: an AI-performance axis running from "No AI" through "70–90% accuracy for specific problem" to "100% accurate," spanning a Bootstrap Design Space (manual annotation, useful tool, rapid refinement), a Collaborative Design space (correctable useful tool), and Perfect AI (full automation).]
Figure 1. Overview of how our four designs (blue boxes) fit into a space defined by, on the one
hand, how intelligent the AI component of the system already is (AI performance) and, on the
other, the domain specificity of the interaction design effort required.