A significant portion of my time as a research fellow at Massachusetts General Hospital (MGH) was dedicated to the exploration of a cohort of 314,292 patients at increased risk for metabolic syndrome [1]. Patients in this cohort had at least one type 2 diabetes mellitus (T2DM) diagnosis code, a T2DM medication, an HGB A1C level ≥ 6.5 percent, or plasma glucose ≥ 200 mg/dl. Of these patients, 65,099 were diagnosed with T2DM at a specificity of 97 percent and positive predictive value of 96 percent [2]. During my training years (2013–2016), my colleagues at MGH and Harvard and I implemented a variety of predictive-modeling methods and incorporated natural language processing techniques to better understand diseases and their complications. We focused on cardiovascular disease, liver disease, and physician-documented insomnia. This cohort contained the complete clinical details and demographics of patients who received care at MGH or Brigham and Women’s Hospital between 1992 and 2010. The cohort was large, considering all clinical narrative notes (e.g., office, medication management, and operative notes) that accompanied the traditional electronic health record (EHR) elements (e.g., billing codes and medication prescriptions).
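The cohort inclusion rule described above is a simple disjunction of four criteria, which can be sketched in a few lines. This is a minimal illustration only, not the code used in the study: the `PatientRecord` type, its field names, and the `meets_cohort_criteria` function are hypothetical, and only the two thresholds (6.5 percent and 200 mg/dl) come from the text.

```python
from dataclasses import dataclass
from typing import Optional

# Thresholds quoted in the article text.
A1C_THRESHOLD_PERCENT = 6.5
GLUCOSE_THRESHOLD_MG_DL = 200.0


@dataclass
class PatientRecord:
    """Hypothetical per-patient summary; field names are illustrative assumptions."""
    has_t2dm_diagnosis_code: bool = False
    has_t2dm_medication: bool = False
    max_a1c_percent: Optional[float] = None           # None if never measured
    max_plasma_glucose_mg_dl: Optional[float] = None  # None if never measured


def meets_cohort_criteria(p: PatientRecord) -> bool:
    """Return True if at least one of the four inclusion criteria holds."""
    a1c_high = (
        p.max_a1c_percent is not None
        and p.max_a1c_percent >= A1C_THRESHOLD_PERCENT
    )
    glucose_high = (
        p.max_plasma_glucose_mg_dl is not None
        and p.max_plasma_glucose_mg_dl >= GLUCOSE_THRESHOLD_MG_DL
    )
    return (
        p.has_t2dm_diagnosis_code
        or p.has_t2dm_medication
        or a1c_high
        or glucose_high
    )
```

Because any single criterion is enough, a rule of this kind is deliberately broad: it selects patients at increased risk, and a stricter classification step is still needed to identify the subset who actually have T2DM.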
Insights
→ The available machine-learning text-classification methods show only fair levels of accuracy in extracting patients’ medical conditions and behavioral descriptors.
→ An easily adaptable human-in-the-loop big-data method with an interactive front end may improve classification accuracy of widely used text-classification techniques.
Text Nailing: An Efficient Human-in-the-Loop Text-Processing Method
Uri Kartoun, IBM Research
INTERACTIONS.ACM.ORG NOVEMBER–DECEMBER 2017 INTERACTIONS 45
IMAGE BY ALICIA KUBISTA / ANDRIJ BORYS ASSOCIATES