expressions that were found in at least
0.05 percent of the notes, as shown in
Figure 4.
DISCUSSION
This article presents a novel method
of determining smoking status
using clinical narrative notes. TN
substantially relies on the fact
that physicians tend to use similar
expressions to describe medical
conditions and, further, tend to
use these expressions consistently.
Converting all expressions and notes
to alphabetical-only representations
eliminates the heterogeneity in
how medical descriptors are
written and allows a perfect
[Figure 5: bar chart. Y-axis: percentage of notes, 0.00% to 0.25%. X-axis expressions: poor sleep, increased sleep, sleep disruption, decreased sleep, sleeplessness, excessive sleep, has trouble sleep, sleeps poorly, reduced sleep, fragmented sleep, sleep poorly, impaired sleep, has a sleep disorder, has difficulty with sleep, has sleep problem, has sleeping problem, terrible sleep, has difficulties with sleep, has difficulty going to sleep, has sleep difficult, has sleep issue, has trouble going to sleep, has sleep disorder, has difficulty to sleep, has a sleep issue, has a sleep problem, has a sleeping problem.]
Figure 5. Most prevalent sleep disorder expressions. The y-axis represents the percentage of
notes that contain the expression within a sample of 1 million randomly selected notes.
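The prevalence measure in the caption can be illustrated with a minimal Python sketch. It assumes that "alphabetical-only" means lowercasing the text and dropping every character that is not a letter, and that an expression counts as present when its normalized form is a substring of the normalized note; the article does not spell out either detail. The function names `to_alpha_only` and `expression_prevalence` and the sample notes are hypothetical.

```python
import re


def to_alpha_only(text: str) -> str:
    """Lowercase the text and drop every non-letter character.

    Assumption: this is what an "alphabetical-only representation" means.
    """
    return re.sub(r"[^a-z]", "", text.lower())


def expression_prevalence(notes, expressions):
    """Return {expression: percentage of notes that contain it}.

    Matching is plain substring search on the normalized strings
    (an assumed matching rule, not confirmed by the article).
    """
    normalized_notes = [to_alpha_only(note) for note in notes]
    total = len(normalized_notes)
    result = {}
    for expr in expressions:
        key = to_alpha_only(expr)
        hits = sum(1 for note in normalized_notes if key in note)
        result[expr] = 100.0 * hits / total if total else 0.0
    return result


# Hypothetical sample notes, invented for illustration only.
notes = [
    "Pt reports poor sleep; has trouble going to sleep.",
    "Has a sleep-disorder, otherwise well.",
    "No complaints.",
    "POOR  SLEEP noted again.",
]
prevalence = expression_prevalence(notes, ["poor sleep", "has a sleep disorder"])
```

Because spacing, punctuation, and case are removed before matching, "POOR  SLEEP" and "sleep-disorder" match the same expressions as their plainly written forms. One trade-off of this assumed rule is that a match can span a word boundary, so a real system would likely need additional checks.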
[Figure 6: bar chart. Y-axis: percentage of notes, 0.00% to 0.05%. X-axis expressions: has a history of alcohol, problems alcohol abuse, used alcohol yes, history of depression and alcohol abuse, alcohol use status abuse, hx of depression and alcohol abuse, excessive alcohol intake, he has a history of alcohol abuse, past alcohol abuse, alcohol abuse recovering, he also has a history of alcohol abuse, alcohol abuse in recovery, he also has a history of chronic alcohol abuse, is a known alcoholic, is known alcoholic, has a remote history of alcohol abuse, with history of alcohol abuse, has history of alcohol, alcohol abuse, heavy drinking, has a ho alcohol abuse, admits to prior history of alcohol abuse, he has a history of chronic alcohol abuse, history is notable for heavy alcohol abuse.]
Figure 6. Most prevalent alcohol-use expressions. The y-axis represents the percentage of
notes that contain the expression within a sample of 10,000 randomly selected notes.