Figure 1. IFRA-based post-silicon bug localization flow. Design phase: insert recorders inside the chip design. Post-silicon validation: run tests and record footprints in the recorders (non-intrusive) until a failure is detected; then scan out the recorder contents (no failure reproduction; a single test run is sufficient) and post-analyze them offline (no system simulation; self-consistency against the test program binary). Output: the localized bug (location, stimulus).
IFRA, an acronym for Instruction Footprint Recording
and Analysis, targets bug localization in processors. Figure
1 shows the IFRA-based post-silicon bug localization flow.
During chip design, a processor is augmented with low-cost hardware recorders (Section 2) for recording
instruction footprints, which are compact pieces of information
describing the flows of instructions (i.e., where each instruction was at various points of time), and what the instructions did as they passed through various design blocks of
the processor. During post-silicon bug detection, instruction footprints are recorded in each recorder, concurrently
with system operation, in a circular fashion to capture the
last few thousand cycles of history before a failure.
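The circular recording described above can be illustrated with a minimal software sketch. It is not the IFRA hardware design: the `Footprint` fields, the `CircularRecorder` class, and the tiny capacity are hypothetical, chosen only to show that a fixed-size circular buffer always retains the most recent history before a failure.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Footprint:
    """Hypothetical footprint layout, for illustration only."""
    cycle: int      # cycle at which the instruction was observed
    instr_id: int   # hypothetical tag identifying the instruction
    info: str       # stage-specific payload (assumed free-form here)


class CircularRecorder:
    """Fixed-capacity recorder: the oldest entries are overwritten first."""

    def __init__(self, capacity: int) -> None:
        # A deque with maxlen silently drops the oldest entry when full.
        self._buf = deque(maxlen=capacity)

    def record(self, fp: Footprint) -> None:
        self._buf.append(fp)

    def scan_out(self) -> list:
        # Return the surviving history, oldest entry first.
        return list(self._buf)


rec = CircularRecorder(capacity=4)
for cycle in range(10):  # pretend a failure is detected after cycle 9
    rec.record(Footprint(cycle=cycle, instr_id=cycle, info="decode"))

history = rec.scan_out()
print([fp.cycle for fp in history])
```

Only the last four recorded cycles survive in this toy run, which mirrors how a recorder sized for a few thousand cycles keeps just the window leading up to the failure.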
Upon detection of a system failure, the recorded footprints are scanned out through a Boundary-scan interface,
which is a standard interface present in most chips for testing purposes. Since a single run up to a failure is sufficient
for IFRA to capture the necessary information (details in
Section 2), failure reproduction is not required for localization purposes.
The scanned-out footprints, together with the test-program binary executed during post-silicon bug detection,
are post-processed off-line using special analysis techniques
(Section 3) to identify the microarchitectural block with
the bug, and the instruction sequence that exposes the bug
(i.e., the bug-exposing stimulus). Microarchitectural block
boundaries are defined specifically for IFRA. Examples
include instruction queue control, scheduler, forwarding
path, and decoders. IFRA post-analysis techniques do
not require any system-level simulation because they rely
on checking for self-consistencies in the footprints with
respect to the test-program binary.
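As a rough illustration of what checking footprints for self-consistency against the test-program binary could look like, the toy sketch below compares recorded values with what the binary says should have been seen. The actual analysis techniques are those of Section 3; the `pc` and `opcode` fields, the dictionary-based binary, and the `first_inconsistency` helper are all assumptions made for this example.

```python
def first_inconsistency(binary, footprints):
    """Return the first footprint that disagrees with the binary, else None.

    binary:     hypothetical map from program counter to expected opcode
    footprints: hypothetical records, each with 'pc' and 'opcode' keys
    """
    for fp in footprints:
        expected = binary.get(fp["pc"])
        if expected != fp["opcode"]:
            return fp
    return None


# Hypothetical three-instruction test program.
binary = {0x100: "add", 0x104: "load", 0x108: "store"}

# Hypothetical scanned-out footprints; the last one is corrupted.
footprints = [
    {"pc": 0x100, "opcode": "add"},
    {"pc": 0x104, "opcode": "load"},
    {"pc": 0x108, "opcode": "add"},
]

bad = first_inconsistency(binary, footprints)
print(bad)
```

No simulation of the processor is needed in this sketch: the binary alone serves as the reference, and the mismatching footprint points to both a candidate location (the recorder that produced it) and the instruction involved.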
Once a bug is localized using IFRA, existing circuit-level
debug techniques [4, 9] can then quickly identify the root cause
of bugs, resulting in significant gains in productivity, cost,
and time-to-market.
In this paper, we demonstrate the effectiveness of IFRA
for a DEC Alpha 21264-like superscalar processor model [6]
because its architectural simulator [2] and RTL model [24]
are publicly available. Such superscalar processors contain
aggressive performance-enhancement features (e.g.,
execution of multiple instructions per cycle, execution of
instructions out of program order, and prediction of branch
targets and outcomes) that are present in many commercial
high-performance processors [22]. Such features significantly
complicate post-silicon validation. For simpler in-order
processors (e.g., ARMv6, Intel Atom, Sun Niagara cores),
IFRA can be significantly simplified.
1. For 75% of injected electrical bugs, IFRA pinpointed
their exact location (1 out of 200 microarchitectural
blocks) and the time they were injected (1 out of over
1,000 cycles), together referred to as a location–time pair. For
21% of injected bugs, IFRA correctly identified their
location–time pairs together with 5 other candidates
(out of over 200,000 possible pairs) on average. IFRA
completely missed the correct location–time pairs for
only 4% of injected bugs.
2. The aforementioned results were obtained without relying on system-level simulation or failure reproduction.
3. IFRA hardware introduces a very small area impact of
1% (dominated by on-chip memory for storing 60 KB
of instruction footprints). If on-chip trace buffers [1]
already exist for validation purposes, they can be
reused to reduce the area impact. Alternatively, a part
of data cache may also be used to reduce the area
impact of IFRA.
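The size of the search space quoted in result 1 follows from simple arithmetic, sketched below. The variable names are illustrative; because the text says "over 1,000 cycles", the product is a lower bound on the number of location–time pairs, and the shortlist fraction is derived here rather than reported in the text.

```python
blocks = 200      # microarchitectural blocks (from result 1)
cycles = 1_000    # lower bound: the text says "over 1,000 cycles"

# Each candidate is one (block, cycle) pair.
pairs = blocks * cycles

# Outcome fractions reported in result 1.
outcomes = {"exact": 0.75, "shortlisted": 0.21, "missed": 0.04}

# In the shortlisted case the correct pair comes with 5 other candidates on average.
shortlist_size = 1 + 5

# Fraction of the search space left to inspect in the shortlisted case
# (derived here for illustration only).
shortlist_fraction = shortlist_size / pairs

print(pairs, shortlist_size, shortlist_fraction)
```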
Related work on post-silicon validation can be
broadly classified as formal methods [5], on-chip
trace buffers for hardware debugging [1], off-chip
program and data tracing [13], clock manipulation [9],
scan-aided techniques [4], check-pointing with deterministic replay [21], and online assertion checking [1, 3].
Table 1 presents a qualitative comparison of IFRA vs. existing
post-silicon bug localization techniques. In Table 1, a
technique is categorized as intrusive if it can alter
the functional or electrical behavior of the system, which
may prevent electrical bugs from being exposed.
Section 2 describes hardware support for IFRA. Section 3
describes off-line analysis techniques performed on the
scanned-out instruction footprints. Section 4 presents simulation results, followed by conclusions in Section 5.
2. IFRA HARDWARE SUPPORT
The three hardware components of IFRA’s recording infrastructure, for a superscalar processor, are indicated as
shaded parts in Figure 2.
1. A set of distributed recorders, denoted by ‘R’ in
Figure 2, with dedicated circular buffers. As an instruction passes through a pipeline stage, the recorder
associated with that stage records information specific to that stage (Table 2). When no instruction
passes through a pipeline stage for many cycles, consecutive idle cycles are compacted into a single entry
in the corresponding recorder.