ρρρ ι ι ι ι
The first condition requires the target to be defined whenever
the source is defined. The second condition requires the tar-
get to be poison-free whenever the source is defined and poi-
son-free. The third condition requires the source and target
to compute the same result when the source is defined and
poison-free. The constraints are only required to hold if the
target represents a set of values which must all be refine-
ments of the source. Poison values are handled by ensur-
ing that an instruction in the target template will not yield
a poison value when the source instruction did not, for
any specific choice of input values. In summary, we check
correctness by checking ( 1) the target is defined when the
source is defined, ( 2) the target is poison-free when the
source is poison-free, and ( 3) the target produces a subset
of the values produced by the source when the source is
defined and poison-free.
= ∧∧ ÷ ×=
a b a ub b a
where ÷u is unsigned bitvector division. The unsigned division requires the second argument to be non-zero, and the
exact attribute requires %a to be divisible by %b.
To determine whether these conditions hold, we ask
an SMT solver to find cases where they are violated. When
found, these counter-examples are reported to the user, as
shown in Figure 3. Conversely, if the SMT solver can show
that no counter-example exists, then the conditions must
hold and the optimization is valid.
Encoding undef values. Undef values represent sets of possible values. The VC Gen encodes them as fresh SMT variables,
which are collected in a set U. In particular, each reference to
an undef value, direct or indirect, must receive a fresh SMT
variable. The sets collected for the source and target will then
be appropriately quantified over the correctness conditions.
3. 1. Verification condition generation
Encoding preconditions. Alive’s precondition sublan-guage provides comparison operators and a set of named
predicates, along with conjunction, disjunction, and negation. Aside from the predicates, these have a straightforward
encoding in SMT.
Alive’s Verification Condition Generator (VC Gen) encodes
the values, instructions, and expressions in a transformation into SMT expressions using the theory of bitvectors.
The correspondence between LLVM operations and bitvector
logic is very close, which makes the encoding straightforward.
For each instruction, the interpreter computes 3 SMT expressions: ( 1) an expression ι for the result of the instruction, ( 2)
an expression δ indicating whether the instruction is defined,
and ( 3) an expression ρ indicating whether the result is free of
poison. The first has a type corresponding to the return type of
the instruction. The others are Boolean predicates. All 3 may
contain free variables, corresponding to the uninterpreted
input variables and symbolic constants in the optimization.
The encoding of named predicates depends on whether
the underlying analysis is precise or is an over- or under-approximation. For example, the predicate isPower2 is
implemented in LLVM with a must-analysis, that is, when
isPower2(%a) is true, we know for sure that %a is a power
of 2; when it is false, no inference can be made. The VC Gen
encodes the result of isPower2(%a) using a fresh Boolean
variable p, and a side constraint p ⇒ a ≠ 0 ∧ a & (a − 1) = 0.
Typically, an instruction’s result is encoded by applying
the corresponding bitvector operation to the encoding of
its arguments. Its definedness and poison-free conditions
are the conjunction of the definedness and poison-free
conditions, respectively, of its arguments along with any
specific requirements for that instruction.
The encoding of may-analyses is similar. The VC Gen creates a fresh variable p to represent the result of the analysis
and a side constraint of the form s ⇒ p where s is an expression summarizing the may-analysis based on the inputs.
For example, a simplified encoding of mayAlias(%a,%b)
is a = b ⇒ p.
Most analyses in LLVM are precise when their inputs are
compile-time constants. Therefore, we encode the result of
these analyses precisely when we detect such cases (done
statically by the VC Gen).
For example, consider the instructionudiv exact
3. 2. Correctness criteria
%a,%b, which is encoded as follows,
Figure 3. Alive’s counterexample for the incorrect transformation
reported as LLVM PR21245.
Let φ be the encoding of the precondition, let ιS and ι T be the
values computed by the source and target, respectively, and
similarly let δS, δ T, ρS, and ρ T be the definedness and poison-free conditions.
Pre: C2 (1 << C1) == 0
%s = shl nsw %X, C1
%r = sdiv %s, C2
%r = sdiv %X, C2 / (1 << C1)
Let I be the set of input variables, P be the Boolean variables used to encode analyses, and US and U T be the sets of
variables used to encode undef values in the source and target, respectively.
An Alive optimization is correct if and only if the following conditions hold for every feasible type assignment:
ERROR: Mismatch in values of i4 %r
%X i4 = 0xF (15, –1)
C1 i4 = 0x3 (3)
C2 i4 = 0x8 (8, -8)
%s i4 = 0x8 (8, -8)
Source value: 0x1 (1)
Target value: 0xF (15, –1)