Our algorithm ensures this property by modifying the gradient grad such that xi+1 = xi + s · grad still satisfies the constraints (s is the step size in the gradient ascent).
For discrete features, we round the gradient to an integer. For DNNs handling visual input (e.g., images), we add
different spatial restrictions such that only part of the input
images is modified. A detailed description of the domain-specific constraints that we implemented can be found in
Section 5.2.
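As an illustration, the following Python sketch shows how one such constrained update could look; the function name, the mode switch, and the rectangle mask are hypothetical, not DeepXplore's actual interface.

import numpy as np

def constrained_step(x_i, grad, s, mode="image_patch", rect=None):
    """One gradient-ascent step that keeps x_{i+1} within the constraints.
    `mode` and `rect` are hypothetical knobs for this sketch."""
    if mode == "discrete":
        # Discrete features: round the gradient to an integer so the
        # update stays on the feature grid.
        grad = np.round(grad)
    elif mode == "image_patch" and rect is not None:
        # Visual input: zero the gradient outside one region so only
        # part of the input image is modified.
        r0, r1, c0, c1 = rect
        mask = np.zeros_like(grad)
        mask[..., r0:r1, c0:c1, :] = 1.0
        grad = grad * mask
    return x_i + s * grad  # x_{i+1} = x_i + s * grad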
Hyperparameters. To summarize, four major hyperparameters control different aspects of DeepXplore, as described below. (1) λ1 balances the objectives between minimizing one DNN's prediction for a certain label and maximizing the other DNNs' predictions for the same label. A larger λ1 puts higher priority on lowering the prediction value/confidence of a particular DNN, whereas a smaller λ1 puts more weight on maintaining the other DNNs' predictions. (2) λ2 balances finding differential behaviors against maximizing neuron coverage. A larger λ2 focuses more on covering different neurons, whereas a smaller λ2 generates more difference-inducing test inputs. (3) s controls the step size used during iterative gradient ascent. A larger s may lead to oscillation around the local optimum, whereas a smaller s may need more iterations to reach the objective. (4) t is the threshold for determining whether an individual neuron is activated. Finding inputs that activate a neuron becomes increasingly harder as t increases.
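For concreteness, these four knobs could be grouped into a small configuration object, as in the sketch below; the default values are placeholders, not the settings used in the paper's experiments.

from dataclasses import dataclass

@dataclass
class DeepXploreConfig:
    # Placeholder defaults for illustration only.
    lambda1: float = 1.0  # lowering F_j's prediction vs. keeping the other DNNs'
    lambda2: float = 0.1  # differential behavior vs. neuron coverage
    s: float = 0.01       # gradient-ascent step size
    t: float = 0.25       # neuron activation threshold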
5. EXPERIMENTAL SETUP
5.1. Test datasets and DNNs
We adopt five popular public datasets with different types of data—MNIST, ImageNet, Driving, Contagio/VirusTotal, and Drebin—and then evaluate DeepXplore on three DNNs for each dataset (i.e., a total of 15 DNNs). We provide a summary of the five datasets and the corresponding DNNs in Table 1.
The detailed description can be found in the full paper. All
the evaluated DNNs are either pretrained (i.e., we use public
weights reported by previous researchers) or trained by us
using public real-world architectures to achieve comparable
performance to that of the state-of-the-art models for the
corresponding dataset. For each dataset, we used DeepXplore
to test three DNNs with different architectures.
5.2. Domain-specific constraints
As discussed earlier, to be useful in practice, we need to
ensure that the generated tests are valid and realistic by
applying domain-specific constraints. For example, gener-
ated images should be physically producible by a camera.
Similarly, generated PDFs need to follow the PDF speci-
fication to ensure that a PDF viewer can open the test file.
Below we describe two major types of domain-specific constraints (i.e., image constraints and file constraints) that we use in this paper.
Image constraints (MNIST, ImageNet, and Driving). DeepXplore uses three different types of constraints for simulating different environmental conditions of images: (1) lighting effects for simulating different intensities of light, (2) occlusion by a single small rectangle for simulating an attacker potentially blocking some parts of a camera, and (3) occlusion by multiple tiny black rectangles for simulating the effects of dirt on the camera lens.
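The following sketch shows how each constraint could be realized as a transformation of the raw gradient before the update step; the exact masks and patch logic here are illustrative assumptions rather than the authors' implementation.

import numpy as np

def light_constraint(grad):
    # (1) Lighting: move every pixel by the same amount, in the direction
    # of the mean gradient, simulating a uniformly brighter/darker image.
    return np.sign(np.mean(grad)) * np.ones_like(grad)

def single_rect_constraint(grad, rect):
    # (2) Single-rectangle occlusion: keep the gradient only inside one
    # small rectangle, as if an object blocked part of the camera.
    r0, r1, c0, c1 = rect  # hypothetical (top, bottom, left, right) bounds
    mask = np.zeros_like(grad)
    mask[..., r0:r1, c0:c1, :] = 1.0
    return grad * mask

def black_rects_constraint(grad, rects):
    # (3) Multiple tiny black rectangles: darken small patches where the
    # mean gradient is negative, i.e., where moving pixels toward black
    # increases the objective, like dirt on a camera lens.
    out = np.zeros_like(grad)
    for r0, r1, c0, c1 in rects:
        if np.mean(grad[..., r0:r1, c0:c1, :]) < 0:
            out[..., r0:r1, c0:c1, :] = -1.0  # push these pixels toward black
    return out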
Maximizing differential behaviors. Given an arbitrary seed x that gets classified to the same class by all DNNs, our goal is to modify x such that the modified input x′ will be classified differently by at least one of the n DNNs.
Let Fk(x)[c] be the class probability that Fk predicts x to be c. We randomly select one neural network Fj and maximize the following objective function:

obj1(x) = Σk≠j Fk(x)[c] − λ1 · Fj(x)[c]    (2)

where λ1 is a parameter that balances the objective terms between the DNNs Fk (k ≠ j), which should maintain the same class outputs as before, and the DNN Fj, which should produce a different class output. As all of the Fk (k ∈ 1..n) are differentiable, Equation 2 can be maximized using gradient ascent by iteratively changing x based on the computed gradient ∂obj1(x)/∂x.
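A minimal sketch of this computation using TensorFlow's automatic differentiation, assuming models is a list of Keras-style classifiers returning softmax probabilities (this is our illustration, not the authors' code):

import tensorflow as tf

def obj1_gradient(models, j, x, c, lambda1):
    """Compute obj1(x) = sum_{k != j} F_k(x)[c] - lambda1 * F_j(x)[c]
    and its gradient with respect to the input x."""
    x = tf.convert_to_tensor(x)
    with tf.GradientTape() as tape:
        tape.watch(x)
        obj1 = -lambda1 * models[j](x)[0, c]
        for k, model in enumerate(models):
            if k != j:
                obj1 += model(x)[0, c]
    return obj1, tape.gradient(obj1, x)  # gradient = d obj1 / dx

Iterating x ← x + s · grad on the returned gradient then performs the ascent of Equation 2.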
Maximizing neuron coverage. The second objective is
to generate inputs that maximize neuron coverage. We
achieve this goal by iteratively picking inactivated neurons
and modifying the input such that the output of that neuron
goes above the neuron activation threshold. Assume that we want to maximize the output of a neuron n; that is, we want to maximize obj2(x) = fn(x) such that fn(x) > t, where t is the neuron activation threshold and fn(x) is the function modeled by neuron n, which takes x (the original input to the DNN) as input and produces the output of neuron n (as defined in Equation 1). We can again leverage the gradient ascent mechanism, as fn(x) is a differentiable function whose gradient is ∂fn(x)/∂x.
Note that we can also potentially jointly maximize multiple neurons simultaneously, but we choose to activate one
neuron at a time in this algorithm for clarity.
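A sketch of this computation for a Keras-style model, using a sub-model to expose the chosen neuron's output; layer_name and neuron_idx are hypothetical identifiers for the picked inactivated neuron:

import tensorflow as tf

def neuron_gradient(model, layer_name, neuron_idx, x):
    """Gradient of f_n(x), the output of one internal neuron,
    with respect to the DNN input x (illustrative sketch)."""
    # Sub-model that exposes the chosen layer's output.
    sub = tf.keras.Model(inputs=model.input,
                         outputs=model.get_layer(layer_name).output)
    x = tf.convert_to_tensor(x)
    with tf.GradientTape() as tape:
        tape.watch(x)
        # For convolutional layers, average the neuron's activation map
        # over spatial positions; for dense layers this is the scalar itself.
        fn = tf.reduce_mean(sub(x)[..., neuron_idx])
    return fn, tape.gradient(fn, x)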
Joint optimization. We jointly maximize obj1 and fn described above by maximizing the following function:

objjoint(x) = (Σk≠j Fk(x)[c] − λ1 · Fj(x)[c]) + λ2 · fn(x)    (3)

where λ2 is a parameter that balances the two objectives and n is the inactivated neuron that we randomly pick at each iteration. As all terms of objjoint are differentiable, we jointly maximize them using gradient ascent by modifying x.
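Combining the two sketches above, one iteration of the joint ascent might look as follows; attributing the picked neuron to Fj's network is our simplification, since coverage can be tracked across any of the tested DNNs:

def joint_step(models, j, x, c, neuron, lambda1, lambda2, s):
    # obj_joint = obj1(x) + lambda2 * f_n(x); ascend along its gradient.
    layer_name, neuron_idx = neuron   # the randomly picked inactivated neuron
    _, g1 = obj1_gradient(models, j, x, c, lambda1)
    _, g2 = neuron_gradient(models[j], layer_name, neuron_idx, x)
    grad = g1 + lambda2 * g2
    return x + s * grad.numpy()       # one gradient-ascent update of x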
Domain-specific constraints. One important aspect of
the optimization process is that the generated test inputs
need to satisfy several domain-specific constraints to be
physically realistic. In particular, we want to ensure that the
changes applied to xi during the ith iteration of the gradient ascent process satisfy all the domain-specific constraints for
all i. For example, for a generated test image x, the pixel values must be within a certain range (e.g., 0–255).
Although some such constraints can be efficiently
embedded into the joint optimization process using Lagrange multipliers, similar to those used in support vector machines, we found that the majority of them cannot be
easily handled by the optimization algorithm. Therefore, we
designed a simple rule-based method to ensure that the generated tests satisfy the custom domain-specific constraints.
As the seed input xseed = x0 always satisfies the constraints by
definition, our technique must ensure that after the ith (i > 0)
iteration of gradient ascent, xi still satisfies the constraints.
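As a sketch, this invariant can be expressed as a loop that re-projects the input after every step; grad_fn and project are hypothetical callbacks standing in for the gradient computation and the rule-based repair:

import numpy as np

def ascent_with_constraints(x_seed, grad_fn, project, s, max_iters=100):
    """Iterative gradient ascent that re-applies the domain-specific
    constraints after every step. Because x_seed satisfies them by
    definition, every intermediate x_i remains valid."""
    x = x_seed
    for _ in range(max_iters):
        x = x + s * grad_fn(x)   # unconstrained gradient-ascent update
        x = project(x)           # rule-based repair of the constraints
    return x

For images with 8-bit pixels, project could simply be lambda v: np.clip(v, 0, 255), clamping pixel values back into the valid 0–255 range.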