of equal criticality. In this case, the tightness probability
is close to 0.5 and the linear approximation of the max
operator becomes inaccurate, as illustrated below. As an
example, in a perfectly balanced tree with 128 paths we
have a node-based algorithm predicting a timing yield of
90% but the actual yield might be only 72% at some circuit
We propose a more reliable statistical estimation strategy
using timing analysis that looks at a complete or large set
of paths through the timing graph. The algorithm derives
bounds on the circuit delay distribution using the distributions of path delays and their interdependencies.
path tracing imposes an additional cost, it also makes it
easier to handle more complex gate delay models, for
example, to accurately model the delay dependency on the
slope of the driving signal. Delay correlations introduced by
path-sharing and joint impact of inter-chip and intra-chip
variations are naturally taken into account, improving the
The major difference between this algorithm and node-based methods is the focus on bounding rather than
approximating the circuit distribution; this avoids the
multiple approximations involved in node-based traversal.
Experiments indicate that produced bounds are quite tight.
We are specifically interested in the lower bound on the cdf
of circuit delay since it provides a conservative value of the
circuit delay at any confidence level. For example, across the
benchmarks the error between the exact distribution and
the lower bound is 1. 1–3.3% (95th percentile).
The cost of this greater accuracy is longer runtimes. In
the worst case, the number of paths in the circuit increases
exponentially with the number of nodes. This can occur in
arithmetic circuits, such as multipliers, but the number of
paths for most practical circuits is actually quite manageable. A study of a class of large industrial circuits found
the number of paths ≈ 0.12 × gates1.42.14 The availability of
parallel multi-core systems makes storing and manipulating millions of paths practical. Recent industrial offerings
of transistor-level static timing analysis, where the accuracy requirement is higher than for the gate-level analysis,
have used path-based analysis and demonstrated fast high-capacity multithreaded implementations.
Our algorithm separates the path extraction step from
the statistical analysis of path delay distributions. It first
extracts a set of the top N longest paths using deterministic STA. This is done using edge delays set to their mean values to extract paths with delays within a certain pre-defined
range from the maximum delay. Path extraction can be done
with time complexity of O(N), which means only paths larger
than a given threshold need to be traced.
The result of the path extraction step is a set of N paths
with their mean delay values (mDi), variances (s
2 Di), and the path
correlation matrix (S). The algorithm aims to find the distribution of the maximum delay of this set of paths. The complete description of a set of path delays is given by the cdf of
D: , where Di is the delay of the ith path
in the circuit and F(t) is the cumulative distribution function
defined over the circuit delay probability space. The overall
worst-case complexity of the algorithm is set by the path delay
98 communIcATIons of The Acm | auGust 2009 | Vol. 52 | no. 8
correlation matrix generation and is O(N2m2), where N is the
number of paths extracted and m is the maximum number of
gates on the extracted paths.
The main contribution of our work is an algorithm that
computes the upper and lower bounds on the probability
distribution of circuit delays for a timing graph with correlated random edge delays. The algorithm uses a linear
model of edge delays as a function of process parameters. We
assume that process parameters are Gaussian, and under
the above linear model, the edge delays are also Gaussian.
The algorithm is based on a set of transformations that
bound the circuit delay distribution by probabilities in the
form of equicoordinate vectors with well-structured correlation matrices; these probabilities can be numerically
precharacterized in an efficient manner.
4. 1. Bounding the delay distribution
First, we express F(t) in terms of the distribution of a standard multivariate normal vector:
1. For any normal random vector with a correlation matrix given by S
where ti′ = (t − mDi ) / sDi and Zi ∼ N(0, 1) are the standard normal random variables.
The transformation introduces the vector t′ that determines the set over which the probability content is being
evaluated. Note that the components of the vector are not
equal. Also note that the correlation matrix S that characterizes the path delay vector is populated arbitrarily and
has no special structure. Both of these factors make the
immediate numerical evaluation of the above probability
Second, we express the cumulative probability of Theorem
1 by a probability computed for a vector with a well-structured
correlation matrix. We utilize a property unique to multivariate Gaussian distributions: their probabilities are monotonic
with respect to the correlation matrix. Slepian has shown
that by increasing the correlation between the members of
the Gaussian vector, their probability volume over certain
Theorem 2. Let X be distributed as N(0, S), where S is a correlation matrix. Let R = (rij) and T = (tij) be two positive semidefinite correlation matrices. If rij ≥ tij holds for all i, j, then
for all a = (a1, …, aN) T, we have
It is easy to see that, as a special case, the above relation can
be used to bound the sought probability with probabilities
whose correlation matrices have identical off-diagonal components. Therefore, as a special case, the cumulative probability can be bounded from below and above by