practice
If Google were created from scratch today, much of it
would be learned, not coded.
—Jeff Dean, Google Senior Fellow,
Systems and Infrastructure Group
MACHINE LEARNING, OR ML, is all the rage today,
and there are good reasons for that. Models created
by machine-learning algorithms for problems such
as spam filtering, speech and image recognition,
language translation, and text understanding have
many advantages over code written by human
developers. Machine learning, however, is not as
magical as it sounds at first. In fact, it is rather
analogous to how human developers create code using
test-driven development.⁴ Given a training set of input-output pairs
{(a, b) | a ∈ A, b ∈ B}, guess a function f ∈ A → B that passes all the
given tests but also generalizes to unspecified input values.
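To make the analogy concrete, here is a minimal Python sketch. It is not taken from the article: the toy training set, the names `learn` and `passes_all_tests`, and the 1-nearest-neighbor rule are all assumptions chosen only for illustration. The training pairs play the role of unit tests, and the learner guesses a function that satisfies them.

```python
from typing import Callable, Sequence, Tuple

# Hypothetical training set of input-output pairs (a, b); these act as the "tests".
TRAINING_SET: Sequence[Tuple[float, str]] = [
    (1.0, "small"),
    (2.0, "small"),
    (9.0, "large"),
    (10.0, "large"),
]


def learn(pairs: Sequence[Tuple[float, str]]) -> Callable[[float], str]:
    """Guess a function f in A -> B from examples.

    Uses a 1-nearest-neighbor rule, an assumption made for this sketch only.
    """
    examples = list(pairs)

    def f(a: float) -> str:
        # Answer with the output recorded for the closest known input.
        _, nearest_output = min(examples, key=lambda pair: abs(pair[0] - a))
        return nearest_output

    return f


def passes_all_tests(f: Callable[[float], str],
                     pairs: Sequence[Tuple[float, str]]) -> bool:
    """Check that f reproduces every given input-output pair."""
    return all(f(a) == b for a, b in pairs)


f = learn(TRAINING_SET)
```

Here `passes_all_tests(f, TRAINING_SET)` holds because every training input is its own nearest neighbor, while calls such as `f(1.5)` or `f(8.0)` show the guessed function answering for inputs that no test specified.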
A big difference between human-written code
and learned models is that the latter are usually not
represented by text and hence are not understandable
by human developers or manipulable by existing tools.
The consequence is that none of the traditional software engineering
techniques for conventional programs, such as code reviews, source
control, and debugging, are applicable anymore. Since
incomprehensibility is not unique to learned code, these aspects are
not of concern here.
A more interesting divergence between machines and humans is that
machines are less arrogant than humans, and they acknowledge
uncertainty in their code by returning a probability distribution or
confidence interval of possible answers, f ∈ A → ℙ(B),
instead of claiming to know the precise
result for every input. For example, a
learned image-recognition function
DOI: 10.1145/3052935
Article development led by
queue.acm.org
Modern applications are increasingly using
probabilistic machine-learned models.
BY ERIK MEIJER
Making Money Using Math