Tapping into the “folk knowledge” needed to
advance machine learning applications.
By Pedro Domingos
a few useful
Machine learning systems automatically learn
programs from data. This is often a very attractive
alternative to manually constructing them, and in the
last decade the use of machine learning has spread
rapidly throughout computer science and beyond.
Machine learning is used in Web search, spam filters,
recommender systems, ad placement, credit scoring,
fraud detection, stock trading, drug design, and many
other applications. A recent report from the McKinsey
Global Institute asserts that machine learning (a.k.a.
data mining or predictive analytics) will be the driver
of the next big wave of innovation [15]. Several fine
textbooks are available to interested practitioners and
researchers (for example, Mitchell [16] and Witten et
al. [24]). However, much of the “folk knowledge” that
is needed to successfully develop
machine learning applications is not
readily available in them. As a result,
many machine learning projects take
much longer than necessary or wind
up producing less-than-ideal results.
Yet much of this folk knowledge is
fairly easy to communicate. Doing so is
the purpose of this article.

Machine learning algorithms can figure
out how to perform important tasks
by generalizing from examples. This is
often feasible and cost-effective where
manual programming is not. As more
data becomes available, more ambitious
problems can be tackled.

Machine learning is widely used in
computer science and other fields.
However, developing successful
machine learning applications requires a
substantial amount of “black art” that is
difficult to find in textbooks.

This article summarizes 12 key lessons
that machine learning researchers and
practitioners have learned. These include
pitfalls to avoid, important issues to focus
on, and answers to common questions.