SimPL: An Algorithm for
Placing VLSI Circuits
VLSI placement optimizes locations of circuit components
so as to reduce interconnect. Formulated in terms of (hyper)
graphs, it is NP-hard, and yet must be solved for challenging
million-node instances within several hours. We propose
an algorithm for large-scale placement that outperforms
prior art both in runtime and solution quality on standard
benchmarks. The algorithm is more straightforward than
existing placers and easier to integrate into timing-closure
flows. Our C++ implementation is compact, self-contained
and exploits instruction-level and thread-level parallelism.
Due to its simplicity and superior performance, the algorithm has been adopted in the industry and was extended
by several university groups to multi-objective optimization.
1. in TRoDuCTion
The first algorithms for circuit placement have been
developed at Bell Labs and IBM Research in the 1960s and
followed the divide-and-conquer paradigm. They motivated
high-performance heuristics for balanced graph-partitioning
by Kernighan and Lin and, later, by Fiduccia and Mattheyses,
that minimize edge cut. In the mid-1980s, circuit placement was a key application of the newly invented Simulated
Annealing methods. Fifteen years later, the number of components in leading chips grew to the point where annealing was much too slow. The divide-and-conquer framework
temporarily regained leadership when it was combined with
bottom-up clustering and multi-level partitioning. However,
in the 2000s, increasing transistor density again demanded
faster algorithms with better performance. Linear programming and network flows were tried with limited success.
Placement optimization gradually became more significant in chip design over the years because the amount
of interconnect grows faster than the number of components (except for grid-like circuits such as memory blocks).
On-chip interconnect now occupies greater volume than
transistors and consumes much power. Additionally, transistor delays improve faster than interconnect delay, which
today limits the speed of many chips. This is why circuit
placement has recently been integrated with more comprehensive optimizations that can reduce interconnect
by restructuring the circuit. 1 But such optimizations need
initial component locations that minimize edge lengths.
This puts an easy-to-formulate graph problem at the core of
sophisticated industrial optimizations. For details the readers are referred to Chapters 4 and 8 of Kahng et al. 12
Modern techniques for VLSI placement approximate
interconnect length by differentiable functions and draw on
efficient numerical optimizations. Such global placement
tolerates various geometric misalignments and small overlaps
between rectangular components (represented by graph
nodes), which are subsequently repaired by combinatorial
algorithms for legalization and detailed placement. Despite
impressive improvements reported by researchers15 and
industry software in the last decade, global-placement algo-
rithms suffer several key shortcomings: (i ) speed, (ii ) solution
quality, (iii ) simplicity and integration with other optimiza-
tions, and (iv) support for multi-threaded execution.