Using the query-style DSL for composing and conditioning probabilistic functions is great, but it falls short of being a real programming language with arbitrary control flow: loops, try/catch blocks, recursion, and so on. Since distributions are a variant of the continuation monad, it is possible to integrate probabilistic computations into a regular imperative language, much like the async/await syntax now available in many programming languages. An example of an imperative probabilistic programming language is WebPPL (http://webppl.org), which embeds the distribution monad into regular JavaScript. In WebPPL, the running example looks as follows:
var cdc = function() {
  return Categorical({ ps: [0.4, 0.6],
                       vs: ["obese", "skinny"] })
}
var doctor = function(weight) {
  if ("obese" == weight)
    return Categorical({ ps: [0.9, 0.1],
                         vs: ["burger", "celery"] })
  if ("skinny" == weight)
    return Categorical({ ps: [0.3, 0.7],
                         vs: ["burger", "celery"] })
}
var predict = function(food) {
  var weight = sample(cdc())
  var food_ = sample(doctor(weight))
  condition(food == food_)
  return weight
}
The assignment + sample statement

var a = sample(prior)
... rest of the program ...

is exactly like the query fragment

from a in prior
... rest of the query ...

and randomly picks a value a ∈ A from a distribution prior ∈ ℙ(A). The condition(p) statement corresponds to a where clause in a query.
To “run” this program, we pass the predict function into the WebPPL inference engine as follows:

Infer({method: 'enumerate', samples: 10000},
      function() { return predict("burger") })

This samples from the distribution described by the program using the Infer function with the specified inference method ('enumerate', 'rejection', and 'MCMC', among others) and reifies the resulting distribution into a Bayesian representation.
Applications of Probabilistic Programming
Suppose ordinary developers had access to a probabilistic programming language. What scenarios would this open up?
If we take a step back and look at a typical Web or mobile application, it implements the standard reinforcement learning design pattern shown in Figure 5. We have to predict an action to send to the user, based on the user's state and the dollar value extracted from the user, such that the sum of the rewards over time is maximized.
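The loop in Figure 5 can be sketched as ordinary code. The following is a hypothetical epsilon-greedy agent (the names makeAgent, act, and learn are illustrative, not from any library): the agent picks an action, the user returns a reward ($), and the agent updates its estimates to maximize cumulative reward.

```javascript
// Sketch of the agent/user feedback loop from Figure 5.
function makeAgent(actions, epsilon) {
  var value = {}, count = {};                 // per-action reward estimates
  actions.forEach(function (a) { value[a] = 0; count[a] = 0; });
  return {
    act: function () {                        // epsilon-greedy action choice
      if (Math.random() < epsilon)
        return actions[Math.floor(Math.random() * actions.length)];
      return actions.reduce(function (best, a) {
        return value[a] > value[best] ? a : best;
      });
    },
    learn: function (action, reward) {        // incremental mean update
      count[action] += 1;
      value[action] += (reward - value[action]) / count[action];
    }
  };
}

// One round of the loop: agent -> [action] -> user -> [state, $] -> agent.
var agent = makeAgent(["showAd", "showSearch"], 0.1);
var action = agent.act();
var reward = action === "showAd" ? 1 : 0;     // stand-in for real user feedback
agent.learn(action, reward);
```

A real application would replace the invented reward rule with actual user feedback, and the tabular estimates with a learned model, but the control flow is the same.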
For games such as AlphaGo,10 the agent code is often a neural network, but if we abstract the pattern to apply to applications as a whole, it is likely a combination of ML-learned models and regular imperative code. This hybrid situation is true even today, where things such as ad placement and search-result ranking are probabilistic but opaquely embedded into imperative code. Probabilistic programming and machine learning will allow developers to create applications that are highly specialized for each user.
One of the attractions of IDEs (integrated development environments) is autocomplete, where the IDE predicts what a user is going to type, based on what has been typed thus far. In most IDEs, autocomplete is driven by static type information. For example, if the user types ppl, the JetBrains Rider IDE shows all the properties of the string type as potential completions, as shown in Figure 6.
Note that the completion list is shown in deterministic alphabetical order, rather than being probabilistically ranked using some learned model of which methods on string are the most likely in the given context. Hence, the IDE should implement autocomplete using a probabilistic function autoComplete ∈ ℙ([Completion]|Context) that returns a distribution of possible completions based on the current user context.7 Another recent application of ML and probabilistic programming in the compiler space is to infer pretty-print rules by learning from patterns in a large corpus of code, prettyPrint ∈ ℙ(String|AST).5
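The signature autoComplete ∈ ℙ([Completion]|Context) can be sketched as an ordinary function from a context to a ranked distribution over completions. In this hypothetical sketch, the usage counts and member names are invented for illustration; a real implementation would learn them from a corpus.

```javascript
// Sketch of autoComplete ∈ P([Completion] | Context): instead of a fixed
// alphabetical list, rank completions by how often each member was chosen
// in similar contexts (the counts below are made up).
var usageCounts = {
  "string": { Length: 120, Substring: 80, Trim: 40, Clone: 2 }
};

function autoComplete(context) {
  var counts = usageCounts[context.receiverType] || {};
  var total = Object.keys(counts).reduce(function (s, k) {
    return s + counts[k];
  }, 0);
  return Object.keys(counts)
    .map(function (k) { return { completion: k, p: counts[k] / total }; })
    .sort(function (a, b) { return b.p - a.p; });   // most probable first
}

console.log(autoComplete({ receiverType: "string" })[0].completion); // "Length"
```

The returned list is a reified distribution: each candidate carries its probability, so the IDE can rank by it, cut off the long tail, or resample as more characters are typed.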
For an example application of exposing the representation of distributions, let's revisit the feedback loop between user and application.
Figure 5. Standard reinforcement learning design pattern. (Diagram: the agent sends an [action] to the user; the user returns [state, $] to the agent.)
Figure 6. Autocomplete example.