wi. The feature functions fi : Y × X → R are used to score assignments of program properties. This representation of the score function is well-suited for learning, as the weights w can be learned from data.
3.4. Features as constraints
Feature functions are key to controlling the predicted likelihoods. For instance, a feature function can be defined to prohibit or lower the score of an undesirable prediction: if fi(yB, x) = −H, where H is a very large positive number, then fi (with weight wi > 0) essentially disables the assignment yB, as Pr(yB | x) will approach 0.
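To make this concrete, consider the following Python sketch (the labels, weights, and feature values are toy assumptions for illustration, not from our implementation). It computes Pr(y | x) ∝ exp(score(y, x)) for three candidate assignments and shows that a feature value of −H reduces one assignment's probability to practically 0:

import math

# Toy setup: one feature with weight w = 1.0 and three candidate assignments.
# score(y, x) = sum_i wi * fi(y, x); here a single feature per assignment.
H = 1e6  # very large positive constant used to forbid an assignment
w = 1.0
feature = {"y_good": 2.0, "y_ok": 1.0, "y_bad": -H}
scores = {y: w * f for y, f in feature.items()}

# Softmax: Pr(y | x) = exp(score(y, x)) / Z(x)
m = max(scores.values())  # subtract the max for numerical stability
Z = sum(math.exp(s - m) for s in scores.values())
probs = {y: math.exp(s - m) / Z for y, s in scores.items()}
print(probs)  # y_bad receives probability ~0: the assignment is disabled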
3.5. General prediction approach
Let us first define the kind of relations between program
elements we use when making predictions. Let the set of
possible relations be Rels. An example relation from our running example is L+=R: it relates the variables i and t in Figure 4(c) and arises from the expression i+=t in the code. Examples of other relations are given in Section 4.
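As an illustration of how such a relation can be derived from source code, the following Python sketch extracts an L+=R triple from the analogous Python expression using the standard ast module (JSNice itself derives relations from the JavaScript AST; this is only a simplified stand-in):

import ast

# Parse "i += t" and emit the relation triple it gives rise to.
stmt = ast.parse("i += t").body[0]
assert isinstance(stmt, ast.AugAssign) and isinstance(stmt.op, ast.Add)
relation = (stmt.target.id, stmt.value.id, "L+=R")
print(relation)  # ('i', 't', 'L+=R')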
3.6. Pairwise indicator feature functions
Let ψ1, …, ψk be a set of pairwise feature functions where each ψi: Labels × Labels × Rels → R scores a pair of properties when related with a relation from Rels. In our case, each ψi is an indicator function for a particular triple ⟨l1, l2, r⟩ seen in the training data. For example:

ψ⟨l1, l2, r⟩(ya, yb, rel) = 1 if ya = l1, yb = l2 and rel = r, and 0 otherwise.
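A minimal Python sketch of such indicator features (the concrete triples are hypothetical, not taken from our training data):

# A pairwise indicator feature is identified by a triple (l1, l2, r):
# it returns 1 exactly when both properties and the relation match.
def make_indicator(l1, l2, r):
    def psi(ya, yb, rel):
        return 1.0 if (ya, yb, rel) == (l1, l2, r) else 0.0
    return psi

psi1 = make_indicator("i", "step", "L+=R")  # a triple seen in training
print(psi1("i", "step", "L+=R"))  # 1.0: the feature fires
print(psi1("i", "t", "L+=R"))     # 0.0: different second property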
In general, any feature function can be used, but our work
shows that these pairwise functions are sufficient for making
high-quality predictions of names and types. Next, we go over
the steps of the prediction procedure more formally.
3.7. Step 1: Build dependency network
We begin by building the network Gx = 〈Vx, Ex〉 for a program x, capturing dependencies between predictions.
Here, Vx = VxU ∪ VxK consists of elements (e.g., variables) for which we would like to predict properties (VxU) and elements whose properties we already know (VxK). The set of edges Ex ⊆ Vx × Vx × Rels denotes the fact that there is a relationship between two program elements and describes what that relationship is. This definition of network is also called a multi-graph, because there is no restriction on having only a single edge between a pair of nodes – our definition permits multiple dependencies with different relations from Rels.
We define the feature functions f(y, x) over the graph Gx as follows. Let (y, zx) be a concatenation of the unknown properties y and the known properties zx in x, and let (y, zx)a denote the property assigned to an element a ∈ Vx. Then, fi is defined as the sum of the applications of its corresponding ψi over the set of all network edges in Gx:

fi(y, x) = Σ(a, b, rel) ∈ Ex ψi((y, zx)a, (y, zx)b, rel)
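The following Python sketch (element names, the edge list, and the indicator feature are illustrative assumptions) builds a small dependency multi-graph and evaluates fi as a sum over its edges:

# Edges of Gx: (a, b, rel) triples; a multi-graph may contain several
# edges between the same pair of nodes, with different relations.
edges = [("v1", "v2", "L+=R"), ("v1", "v3", "L+=R")]

known = {"v3": "length"}             # zx: properties of known elements
unknown = {"v1": "i", "v2": "step"}  # y: a candidate assignment

def prop(a, y, zx):
    """(y, zx)a: look up the property of element a in the concatenation."""
    return y.get(a, zx.get(a))

def psi(ya, yb, rel):  # indicator for the triple ("i", "step", "L+=R")
    return 1.0 if (ya, yb, rel) == ("i", "step", "L+=R") else 0.0

def f(psi_i, y, zx):
    """fi(y, x): sum of psi_i over all edges of the network Gx."""
    return sum(psi_i(prop(a, y, zx), prop(b, y, zx), rel)
               for a, b, rel in edges)

print(f(psi, unknown, known))  # 1.0: exactly one edge matches the triple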
3.8. Step 2: MAP inference
Recall that the key query we perform is MAP inference. That is, given a program x, find a prediction y such that:

y = arg max_{y′} Pr(y′ | x)

As arg max is independent of Z(x), we obtain an equivalent query over the (unnormalized) score:

y = arg max_{y′} score(y′, x)
In theory, one can use any of the available inference algorithms to solve the above query (exact inference is in general an NP-hard MaxSAT problem). In this work, we designed a fast, approximate greedy MAP inference algorithm tailored to our setting of programs: pairwise feature functions, the unrestricted nature of Gx, and a large set of possible assignments. Our algorithm changes the labels of nodes one-by-one or in pairs until the assignment score cannot be improved further.
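A condensed Python sketch of this style of greedy inference (self-contained toy data; unlike the actual algorithm, it relabels only single nodes and scans all candidate labels):

# Toy network, known properties, and weighted indicator triples.
edges = [("v1", "v2", "L+=R"), ("v2", "v3", "L+=R")]
known = {"v3": "length"}
weights = {("i", "step", "L+=R"): 0.5, ("step", "length", "L+=R"): 0.3}

def score(y):
    """score(y, x): weighted sum of matching triples over all edges."""
    full = {**y, **known}
    return sum(weights.get((full[a], full[b], rel), 0.0)
               for a, b, rel in edges)

def greedy_map(y, candidates):
    """Relabel one node at a time while the total score improves."""
    improved = True
    while improved:
        improved = False
        for node in y:
            for label in candidates[node]:
                old = y[node]
                y[node] = label
                if score(y) > score({**y, node: old}):
                    improved = True   # keep the improving label
                else:
                    y[node] = old     # revert the trial relabeling
    return y

y0 = {"v1": "t", "v2": "t"}
cands = {"v1": ["i", "t"], "v2": ["step", "t", "length"]}
print(greedy_map(y0, cands))  # {'v1': 'i', 'v2': 'step'}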
3.9. Learning
The goal of the training phase (lower part of Figure 1) is to
learn the weights w used in the score function from a large
training set of programs. These weights cannot be obtained by means of counting in the training data [14, Section 20.3.1].
Instead, we use a learning technique from online support vector machines: given a training dataset D = {⟨x(j), y(j)⟩}j=1..t of t samples, the goal is to find w such that the given assignments y(j) are the highest-scoring assignments in as many training samples as possible, subject to additional margin learning constraints. The learning procedure is described in Ratliff et al.
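A schematic Python sketch of such a training loop (a simplification in the spirit of the structured-SVM subgradient method; feature_vector and map_inference stand for the feature computation and inference described above, and the margin constraints of the full method of Ratliff et al. are omitted here):

def train(samples, feature_vector, map_inference, k, epochs=10, lr=0.1):
    """Learn weights w so the given assignments score highest.

    samples: pairs (x, y_true); feature_vector(y, x) returns (f1, ..., fk);
    map_inference(x, w) returns the (approximate) highest-scoring y.
    """
    w = [0.0] * k
    for _ in range(epochs):
        for x, y_true in samples:
            y_hat = map_inference(x, w)        # current best prediction
            f_true = feature_vector(y_true, x)
            f_hat = feature_vector(y_hat, x)
            # Shift w toward the features of the correct assignment and
            # away from those of the wrongly preferred one.
            w = [wi + lr * (ft - fh)
                 for wi, ft, fh in zip(w, f_true, f_hat)]
    return w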
4. PREDICTING NAMES AND TYPES
In this section we present the JSNice system for predicting: (i) names of local variables and (ii) type annotations of function arguments. We investigate the above challenges in the context of JavaScript, a language for which the above two questions are of significant importance. We do note, however, that much of the machinery discussed in this section applies as-is to other programming languages.
The goal of our name prediction task is to predict the (most likely) names of local variables in a given program x. We proceed as follows. First, we define VxK to range over all constants, object properties, methods, and global variables of x. Each element in VxK can be assigned values from the set LabelsK = JSConsts ∪ JSNames, where JSNames is the set of all valid identifier names and JSConsts is a set of possible constants. We note that object property names and Application Programming Interface (API) names are modeled as constants, as the dot (.) operator takes an object on its left-hand side and a string constant on its right-hand side. We define the set VxU to contain all local variables of x. Here, a variable name belonging to two different scopes leads to two distinct program elements in VxU. Finally, LabelsU ranges over JSNames.
To ensure the newly predicted names preserve program
semantics, we ensure the following additional constraints
hold: (i) all references to a renamed local variable must be
renamed to the same name. This is enforced by how we
define VxU (each element corresponds to a local variable as