automatically from data. The developers of DNN components can indirectly influence the rules learned by a DNN by
modifying the training data, features, and the model’s architectural details (e.g., number of layers).
2.2. DNN architecture
DNNs are inspired by the human brain, with its billions of interconnected neurons. They are known for their remarkable ability to automatically identify and extract relevant high-level features from raw inputs without any human guidance beyond labeled training data. In recent years, DNNs have surpassed human performance in many application domains due to the increasing availability of large datasets, specialized hardware, and efficient training algorithms.
A DNN consists of multiple layers, each containing multiple neurons as shown in Figure 3. A neuron is an individual
computing unit inside a DNN that applies an activation function to its inputs and passes the result to other connected neurons (see Figure 3). Common activation functions include the sigmoid, hyperbolic tangent, and ReLU (Rectified Linear Unit). A DNN usually has at least three (often more)
layers: one input, one output, and one or more hidden layers. Each neuron in one layer has directed connections
to the neurons in the next layer. The numbers of neurons
in each layer and the connections between them vary significantly across DNNs. Overall, a DNN can be defined
mathematically as a multi-input, multi-output parametric
function F composed of many parametric subfunctions representing different neurons.
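To make this composition concrete, the following minimal NumPy sketch (an illustrative example of ours, not code from any particular system; the names relu, sigmoid, and forward are hypothetical) implements a two-layer DNN as a function f(x) = sigmoid(W2 · relu(W1 · x)).

import numpy as np

def relu(z):
    # Rectified Linear Unit: elementwise max(0, z)
    return np.maximum(0.0, z)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(x, W1, W2):
    # A two-layer DNN as a composition of parametric subfunctions:
    # each layer multiplies its input by a weight matrix and applies
    # an activation function.
    hidden = relu(W1 @ x)         # hidden-layer activations
    return sigmoid(W2 @ hidden)   # output-layer confidence values

# Example: 3 inputs, 4 hidden neurons, 2 output classes.
rng = np.random.default_rng(0)
W1 = rng.normal(size=(4, 3))
W2 = rng.normal(size=(2, 4))
print(forward(np.array([0.5, -1.2, 3.0]), W1, W2))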
Each connection between neurons in a DNN is associated with a weight parameter characterizing the strength of that connection. For supervised learning,
the weights of the connections are learned during training
by minimizing a cost function over the training data via gra-
dient descent.
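As a rough sketch of this training process, the toy example below (our illustration; it uses a single sigmoid neuron with a cross-entropy cost rather than a full DNN with backpropagation) repeatedly updates the connection weights by stepping against the gradient of the cost over a small labeled dataset.

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Toy labeled training data: four examples with two features each,
# labeled according to a simple AND-like rule.
X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
y = np.array([0., 0., 0., 1.])

w = np.zeros(2)   # connection weights
b = 0.0           # bias
lr = 0.5          # learning rate

for epoch in range(2000):
    pred = sigmoid(X @ w + b)              # forward pass
    grad_w = X.T @ (pred - y) / len(y)     # gradient of the cross-entropy cost w.r.t. w
    grad_b = np.mean(pred - y)             # gradient w.r.t. b
    w -= lr * grad_w                       # gradient descent update
    b -= lr * grad_b

print(np.round(sigmoid(X @ w + b), 2))     # predictions move toward the labels [0, 0, 0, 1]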
Each layer of the network transforms the information
contained in its input to a higher-level representation of
the data. For example, consider a pretrained network as
shown in Figure 4b for classifying images into two categories: human faces and cars. The first few hidden layers transform the raw pixel values into low-level texture features such
as edges or colors and feed them to the deeper layers.18 The last few layers, in turn, extract and assemble the meaningful high-level abstractions such as noses, eyes, wheels, and headlights to make the classification decision.
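One way to observe this layer-by-layer abstraction in practice is to read out the intermediate activations a trained network produces for a given input. The sketch below assumes a pretrained, functional tf.keras classifier saved as face_vs_car_classifier.h5 (a hypothetical file name) and builds a helper model that exposes every layer's output.

import numpy as np
import tensorflow as tf

# Assumes a pretrained tf.keras classifier; the file name is hypothetical.
model = tf.keras.models.load_model("face_vs_car_classifier.h5")

# Helper model that returns every layer's activations instead of only the prediction.
activation_model = tf.keras.Model(
    inputs=model.input,
    outputs=[layer.output for layer in model.layers])

image = np.random.rand(1, 224, 224, 3).astype("float32")  # placeholder input
activations = activation_model.predict(image)

# Early layers respond to low-level textures (edges, colors); later layers
# respond to higher-level parts (wheels, noses) that drive the final decision.
for layer, act in zip(model.layers, activations):
    print(layer.name, act.shape)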
2.3. Limitations of existing DNN testing
Expensive labeling effort. Existing DNN testing techniques
require prohibitively expensive human effort to provide correct labels/actions for a target task (e.g., self-driving a car,
image classification, and malware detection). For complex
and high-dimensional real-world inputs, human beings, even domain experts, often find it difficult to perform a task correctly and efficiently for a large dataset. For example, consider a DNN designed to identify potentially malicious executable files. Even a security professional will have trouble determining whether an executable is malicious or benign without executing it. However, executing and monitoring malware inside a sandbox incurs significant performance overhead, making manual labeling significantly harder to scale to a large number of inputs.
Low test coverage. None of the existing DNN testing schemes even attempt to cover the different rules learned by the DNN.
Therefore, the test inputs often fail to uncover different erroneous behaviors of a DNN. For example, DNNs are often
tested by simply dividing a whole dataset into two random
parts—one for training and the other for testing. The testing set in such cases may only exercise a small subset of
all rules learned by a DNN. Recent results involving adversarial evasion attacks against DNNs have demonstrated the
existence of some corner cases where DNN-based image
Figure 2. Comparison between traditional and ML system development processes. Developers specify the logic of a traditional system explicitly, whereas a DNN learns its logic from training data.
Figure 3. A simple DNN and the computations performed by each neuron. Each neuron in layer k computes Out = σ(Σ_{i=1}^{n} W_i^(k)·I_i), applying its activation function σ to the weighted sum of its inputs; the overall function modeled by the two-layer DNN is f(x) = σ(W^(2) · σ(W^(1) · x)).
Figure 4. Comparison between the program flows of a traditional program and a neural network: (a) a program with a rare branch and (b) a DNN for detecting cars and faces. The nodes in gray denote the corresponding basic blocks or neurons that participated while processing an input.