is a set of edges from the source node to the sink node in
G. The delay of the path Di is the sum of the weights d for all
edges on the path. The goal of SSTA is to find the distribution of maximum (D1, …, DN) among all N paths in G.
Using a statistical treatment for STA requires an efficient
method for processing probabilistic timing graphs with an
arbitrary correlation of edge delays to find the distribution
of maximum propagation time through the graph. This challenge has received significant attention by the electronic
design automation (EDA) community.
1, 3, 8, 9, 17, 18
Edge delay distribution depends on the distribution of
the underlying semiconductor process parameters, which
is typically assumed to be Gaussian. A linear dependence
of edge delays on process parameters is also commonly
accepted. Under this model, edge delays also have a Gaussian
distribution. They are correlated because of the simultaneous effect of intra-chip and inter-chip variability patterns.
While intra-chip variability impacts individual transistors
and wires in an uncorrelated manner, inter-chip variability
causes all edges of the timing graph to vary in a statistically
correlated manner. Probabilistic timing graphs have been
studied by the operations research community in the context of PER T (program evaluation and review technique) networks.
15 Earlier work, however, was largely concerned with
the analysis of graphs with independent edge delay distributions. Even in this case, the exact distribution of graph propagation times cannot be generally obtained because arrival
times become correlated due to shared propagation history.
The problem addressed by the EDA community is different in several respects: edge delays are correlated, estimating manufacturing yield requires capturing the entire
cumulative distribution function (cdf ) of duration time and
not just the moments, and the network size is much larger
(often with millions of nodes). Recent work in EDA has fallen
into three categories: Monte-Carlo simulation,
6, 9
numerical integration methods in process parameters space,
8 and
analytical techniques operating in edge delays space.
3, 17, 18
Monte-Carlo approaches estimate the distribution by repeatedly evaluating the deterministic STA algorithm, with delay
values sampled from their distributions. These algorithms
tend to be computationally expensive since thousands of
runs may be needed to achieve reasonable accuracy. While
being accurate, numerical integration methods have prohibitively high runtimes which limits them to only a very
small number of independent process parameters.
It is not feasible to analytically compute the exact circuit
delay distribution under correlated edge delays even under
simple correlation structures.
12 This has led to solutions
that either approximate3, 17 or bound1, 18 the distribution.
Approximations are typically used in the so-called node-based methods that involve approximation of the node
arrival time distributions. This allows for statistical timing
evaluation in a way similar to the traditional topological
longest path algorithm, in which each node and edge of a
timing graph is explored in a topological order and is visited
once in a single traversal of the graph.
3, 17 It thus preserves
the linear runtime complexity of deterministic STA.
The computation of the node arrival time requires adding
edge delays and then taking the maximum of edge delays. For
two Gaussian variables, the sum is also Gaussian but the maximum Z = max(A, B) is not. Because node-based algorithms
propagate arrival times through the timing graph, the edge
delays and arrival times must be represented in an invariant
manner when applying multiple add and max operators. The
solution proposed in Jacobs et al.
7 is to approximate the result
of the max computation at each node by a new Gaussian random variable. This is based on matching the first two moments
of the exact distribution of Z with a new normal random variable. The method is efficient because the mean and variance
of the maximum of two normal random variables can be computed in closed form.
4 In a timing graph with correlated edge
delays, the new random variable must also reflect the degree
of correlation it has with any other edge. For the known correlation between delay of edges A, B, and D, the correlation
between max(A, B) and D can be computed in closed form.
4
Instead of storing the full correlation matrix directly, a linear
model of delay as an explicit function of process parameters
can be used to capture the correlation.
3
If each edge in the graph adds some independent random
variation to the total delay, then capturing arrival time correlation due to common path history requires storing information on every traversed edge. The can become overwhelming,
however, because the number of edges may be quite large.
A solution proposed in Chang and Sapatnekar3 uses a dimensionality reduction technique of principal component analysis (PCA). If random components of delay contributed by
timing edges belonging to the same region of the chip behave
in a spatially correlated manner, then PCA is effective in
reducing the number of terms to be propagated. An alternative strategy proposed in Visweswariah et al.
17 is to lump the
contributions of all random variation into a single delay term
and represent the node arrival times and edge delays as
ao + ∑
i =
1
n
ai D Pi + an +
1 D Ra,
where Pi is the ith inter-chip variation component,
DPi = Pi − Po is its deviation from the mean value, ai is the sensitivity of the gate or wire delay to the parameter Pi, DRa is a
standard normal variable, and the an + 1 coefficient represents
the magnitude of the lumped random intra-chip component
of delay variation. Representing random delay variation in
this manner clearly cannot capture the correlation of arrival
times due to signal paths sharing common ancestor edges.
Yet keeping the path history makes the node-based algorithms less efficient, and known industrial applications17
tend to ignore the impact of path reconvergence by relying
on the simple delay model shown above.
For Gaussian process parameter variations, the linearized
delay models described above permit efficient add and max
operations and propagate the result in the same functional
form. For the max operation, the result is mapped onto the
canonical model using a linear approximation of the form
MAX(A, B) ≈ C = TAA + ( 1 − TA)B, where TA = P(A > B) is known as
the tightness probability.
4. s TATIs TIcAL s TA usInG BounDs
When computing the distribution of circuit delays, the
quality of approximation degrades when applied to paths