scale I sought to create. For example,
in this initial step I defined a wearable
as a computer or electronic device
that is worn on the body (on skin or
clothing). My definition excluded fabric
or devices inside or under clothing
because the scale addresses wearables
that are visible. I further defined
wearables as personal and personally
owned, as opposed to provided by
an employer as a work tool. While a
general definition of wearables is much
broader than this, such focus was
necessary for the social-acceptability
measurement to make sense.
Step 2 was to take the findings
from Step 1 and write possible scale
items. An initial item pool should
contain about four or five times as
many items as the anticipated final
scale, because the subsequent steps
in scale development are essentially a
process of elimination. This process
thus resulted in 97 potential items. An
example of this scale-writing process
was to take the interview finding that a
socially acceptable device is accessible,
affordable, and not in limited release,
which became the item: “This device
seems to be accessible, that is, affordable
and not in limited release.”
Step 3 was to determine the scale
format. Based on the items I had
generated, a Likert scale in which
respondents would rate their level
of agreement or disagreement with
each item made the most sense. The
number of response choices should be
sufficient to allow for variation, but not
so numerous that differences between
response choices become meaningless.
Six or seven choices are most common
for these reasons. I chose six, to avoid
a middle “neutral” choice and to force
respondents to at least lean toward
agreement or disagreement.
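The effect of choosing an even number of response choices can be illustrated with a minimal sketch. The response labels and the 1-to-6 coding below are assumptions made for illustration; the article does not give the exact wording of the six choices.

```python
# Hypothetical labels: the exact wording of the six choices is an assumption.
LABELS = [
    "strongly disagree",
    "disagree",
    "somewhat disagree",
    "somewhat agree",
    "agree",
    "strongly agree",
]


def score(label: str) -> int:
    """Map a response label to an integer from 1 to len(LABELS)."""
    return LABELS.index(label) + 1


def midpoint(n_choices: int) -> float:
    """Midpoint of a scale coded 1..n_choices."""
    return (1 + n_choices) / 2


def has_neutral_choice(n_choices: int) -> bool:
    """True when an integer response sits exactly on the midpoint."""
    return midpoint(n_choices).is_integer()
```

With six choices the midpoint falls between the third and fourth options, so every response leans one way or the other; with seven choices the fourth option would sit exactly on the midpoint.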
Step 4 was to have experts review
and provide feedback on the initial
item pool. I recruited three experts:
Two held a Ph.D. (one was an academic
and one worked in industry) and the
other was a CEO of a small fashion-forward wearable company. They rated
the relevancy of each item to what I was
attempting to measure (wearable social
acceptability) and also gave feedback
on clarity, conciseness, and anything
I might have missed. This process
resulted in some edits, as well as a
whittling down to 50 items.
Step 5 was to choose related items
or scales for the purpose of testing
the construct validity of what would
become the final WEAR Scale. Based
on existing research, I hypothesized
that a valid scale would be positively
correlated with the Affinity for
Technology Scale, self-reported
optimism, and likeableness rating of a
person wearing a device, but negatively
correlated with age.
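The hypothesized directions can be checked with a simple correlation, sketched below under stated assumptions. The function names are hypothetical, Pearson's r is used as one common choice of correlation coefficient, and the article does not say which statistic was actually computed.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(ss_x * ss_y)


def signs_match(scale_scores, related, expected_signs):
    """For each related measure, report whether r has the hypothesized sign.

    related: dict of measure name -> list of values (one per respondent)
    expected_signs: dict of measure name -> +1 or -1
    """
    return {
        name: pearson_r(scale_scores, values) * expected_signs[name] > 0
        for name, values in related.items()
    }
```

A check of the sign alone is only a first look; a full validity analysis would also consider the size of each correlation and its statistical significance.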
Step 6 was to have a sample of people
respond to the 50 items, as well as to
the related items mentioned in Step
5, so I could conduct exploratory
factor analysis and validity testing.
Of course, a wearable was needed so
participants could respond to the items
about a particular device. To gather the
data I needed, I did one study using a
Bluetooth headset as the stimulus, and
another study using an Apple Watch
and Google Glass. This allowed me to
look for commonalities among three
quite different wearables in forming
the final scale. I chose these three
devices for their diversity of functions
and body placement, and anticipated
variation in how they would rate in
acceptability.
Step 7 was to evaluate the items
using exploratory factor analysis,
adjust the scale as needed, and test its
validity and reliability. The common
solution shared by all three datasets
(the headset, Watch, and Glass) is
shown in Figure 1. This solution
showed good validity and reliability [3]
and became the final scale.
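As an illustration of what a reliability test can look like, the sketch below computes Cronbach's alpha, a widely used internal-consistency statistic. This is an assumption for illustration only: the article does not state which reliability statistic was used, and the function name is hypothetical.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondent rows of item scores.

    alpha = k / (k - 1) * (1 - sum(item variances) / variance(total scores))
    where k is the number of items.
    """
    if len(responses) < 2:
        raise ValueError("need at least two respondents")
    k = len(responses[0])
    if k < 2:
        raise ValueError("need at least two items")
    if any(len(row) != k for row in responses):
        raise ValueError("every respondent must answer every item")
    # Transpose rows (respondents) into columns (items).
    item_variances = [variance(column) for column in zip(*responses)]
    total_variance = variance([sum(row) for row in responses])
    if total_variance == 0:
        raise ValueError("alpha is undefined when all total scores are equal")
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)
```

Values closer to 1 indicate that the items vary together across respondents, which is what one would expect from items measuring the same underlying construct.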
DEVICES AND DATA
For the three wearables I tested,
Google Glass landed at 3.12 on the
6-point scale, below the median of
3.50. The Bluetooth headset at 3.57
was a bit above the median, and the
highest score was for Apple Watch at
4.06 [3]. Interestingly, both Glass and
Watch had the same items with the
most extreme scores. The scale item
that was the lowest scoring was “This
device would be distracting when driving.”
Respondents felt that the potential
harm that either Glass or Watch could
Figure 2. Two key factors contribute to the WEAR score.