Most likely, some middle ground
will be identified that allows compiler
optimizations but doesn’t eliminate all
guarantees for the programmer. One
possibility is the introduction of a
wobbly value that would allow uninitialized
objects to change values without requiring this to be undefined behavior.
Trap representations are an oddity, because they were introduced to
help diagnose uninitialized reads
but are now viewed with suspicion by
the safety and security communities,
which are wary that the undefined behavior associated with reading a trap
value is being imparted to reads of indeterminate values.
Related articles on queue.acm.org

Passing a Language through the Eye of a Needle
Roberto Ierusalimschy et al.
http://queue.acm.org/detail.cfm?id=1983083

The Challenge of Cross-Language Interoperability
David Chisnall
http://queue.acm.org/detail.cfm?id=2543971
References
1. Debian Security Advisory. DSA-1571-1 OpenSSL—Predictable random number generator, 2008; http://www.debian.org/security/2008/dsa-1571.
2. IEC. Binary floating-point arithmetic for microprocessor systems (60559:1989).
3. ISO/IEC. Programming languages—C, 3rd ed. (ISO/IEC 9899:2011). Geneva, Switzerland.
4. Krebbers, R. and Wiedijk, F. N1793: Stability of indeterminate values in C11; http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1793.pdf.
5. Memarian, K. and Sewell, P. Clarifying the C memory object model, 2016 (revised version of WG14 N2012). University of Cambridge; http://www.cl.cam.ac.uk/~pes20/cerberus/notes64-wg14.html#clarifying-the-c-memory-object-model-uninitialised-values.
6. Memarian, K. and Sewell, P. What is C in practice? 2015 (updated 2016). (Cerberus survey v2): Analysis of responses (2014)—with comments; https://www.cl.cam.ac.uk/~pes20/cerberus/notes50-survey-discussion.html.
7. Open Standards. Optional support for signaling NaNs, 2003; http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1011.htm.
8. Peterson, R. Defect report #338. C99 seems to exclude indeterminate value from being an uninitialized register. Open Standards, 2007; http://www.open-std.org/jtc1/sc22/wg14/www/docs/dr_338.htm.
9. Seacord, R.C. Clarification of unspecified value. Open Standards, 2016; http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2042.pdf.
10. Wang, X. More randomness or less; http://kqueue.org/blog/2012/06/25/more-randomness-or-less/.
11. Wiedijk, F. and Krebbers, R. Defect report #451. Instability of uninitialized automatic variables. Open Standards, 2013; http://www.open-std.org/jtc1/sc22/wg14/www/docs/dr_451.htm.
Robert C. Seacord is a Principal Security Consultant with
NCC Group, where he works with software developers
and software development organizations to eliminate
vulnerabilities resulting from coding errors before they
are deployed.
Copyright held by owner/author.
Publication rights licensed to ACM. $15.00
undefined behavior on all implementations. This undefined behavior applies even to direct reads of objects of
type unsigned char. The unsigned
char type normally has a special status
in the standard in that values stored in
non-bit-field objects may be copied into
an object of type unsigned char [n]
(for example, by memcpy), where n is
the size of an object of that type.
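The copying guarantee described here can be sketched in a few lines. This is an illustrative example only: the helper names int_to_bytes and bytes_to_int are hypothetical, and int is chosen arbitrarily as the source type. It copies an object's representation into an unsigned char array of exactly sizeof(int) bytes and back again.

```c
#include <string.h>

/* Hypothetical helper: copy the object representation of an int into
   an unsigned char array of exactly sizeof(int) bytes. */
void int_to_bytes(int value, unsigned char out[sizeof(int)]) {
    memcpy(out, &value, sizeof value);
}

/* Hypothetical helper: rebuild the int from the copied bytes. */
int bytes_to_int(const unsigned char in[sizeof(int)]) {
    int value;
    memcpy(&value, in, sizeof value);
    return value;
}
```

Because the bytes are copied unchanged in both directions, the round trip yields the original value on any implementation, whatever the byte order or padding of int.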
Sample Programs
The preceding review of trap representations makes it clear that the unsigned
char type is the most interesting case.
Consider the following code:
unsigned char f(unsigned char y) {
    unsigned char x[1]; /* uninit */
    if (x[0] > 10)      /* reads an indeterminate value */
        return y / x[0];
    else
        return 10;
}
The unsigned char array x has
automatic storage duration and is consequently uninitialized. Because it is
declared as an array, the address of x is
taken, meaning that the read is defined
behavior. While the compiler could
avoid taking the address, it cannot
change the semantics of the code from
unspecified value to undefined behavior. Consequently, the compiler is not
allowed to translate this code into instructions that might perform a trap.
Objects of unsigned char type are
guaranteed not to have trap values. The
read in this example is defined because
it is from an object of type unsigned
char that is known to be backed by
memory. It is unclear, however, which
value is read and whether this value is stable.
From this perspective, it could be argued that this behavior is implicitly
undefined. At a minimum, the standard is
unclear and possibly contradictory.
Defect Report #451 [11] deals with the
instability of uninitialized automatic
variables. The proposed committee response
to this defect report states that
any operation performed on indeterminate
values will have an indeterminate
value as a result. Library functions
will exhibit undefined behavior when
used on indeterminate values. It is
unclear, however, whether y/x[0] can
result in a trap. Based on the proposed
committee response to Defect Report
#451, for all types that do not have trap
representations, an uninitialized value
can appear to change its value, allowing
a conforming implementation to
print two different values.
Consider the following code:
#include <stdio.h>

void f(void) {
    unsigned char x[1]; /* uninit */
    x[0] ^= x[0];       /* result is indeterminate, not necessarily 0 */
    printf("%d\n", x[0]);
    printf("%d\n", x[0]);
    return;
}
In this example, the unsigned
char array x is intentionally uninitialized but cannot contain a trap representation because it has a character type.
Consequently, the value is both indeterminate and an unspecified value. The
bitwise exclusive OR operation, which
would produce a zero on an initialized
value, will produce an indeterminate
result, which may or may not be zero.
An optimizing compiler has the license
to remove this code because it has undefined behavior. The two printf calls
exhibit undefined behavior and, consequently, might do anything, including
printing two different values for x[0].
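For contrast, the following minimal sketch shows the well-defined counterpart of this example. The function name xor_self and its parameter start are hypothetical and introduced only for illustration. Once the array element is initialized, every read is defined and the exclusive OR of the value with itself is always zero.

```c
/* Hypothetical variant of f: x is initialized from the argument,
   so no read involves an indeterminate value. */
unsigned char xor_self(unsigned char start) {
    unsigned char x[1] = { start }; /* initialized */
    x[0] ^= x[0];                   /* always yields 0 */
    return x[0];
}
```

Printing the returned value twice would show the same number both times, because the value is stable after initialization.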
Uninitialized memory has been used
as a source of entropy to seed random
number generators in OpenSSL, DragonFly BSD, OpenBSD, and elsewhere [10].
If accessing an indeterminate value is
undefined behavior, however, compilers may optimize out these expressions,
resulting in predictable values [1].
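One way to keep a compiler from discarding a seed computation is to make sure no indeterminate value takes part in it. The sketch below is an assumption-laden illustration, not the code used by any of the projects named above. The function mix_seed, its parameters time_bits and pid_bits, and the mixing constant are all hypothetical. Inputs such as a timestamp and a process ID carry little entropy, so this shows only the "all inputs are defined" property, not a secure generator.

```c
#include <stdint.h>

/* Hypothetical seed mixer: every input is a defined value, so no step
   depends on an indeterminate read that a compiler could reason away.
   The constant is an arbitrary odd multiplier chosen for illustration. */
uint32_t mix_seed(uint32_t time_bits, uint32_t pid_bits) {
    uint32_t seed = time_bits;
    seed ^= pid_bits * 2654435761u; /* unsigned arithmetic wraps, no overflow UB */
    seed ^= seed >> 16;             /* fold the high bits into the low bits */
    return seed;
}
```

Because the result depends only on its arguments, the same inputs always give the same seed, and the expression cannot be removed on the grounds that it reads uninitialized storage.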
Conclusion
The behavior associated with uninitialized reads is an unsettled issue that
the C Standards Committee needs to
address in the next revision of the standard (C2X). One simple solution would
be to eliminate trap representations
altogether and simply state that reads
of indeterminate values are undefined
behavior. This would greatly simplify
the standard (which itself is of value)
and provide compiler developers with
all the latitude they want to optimize
code. The diametrically opposed solution is to define fully concrete semantics for uninitialized reads in which
such a read is guaranteed to give the
actual contents of memory.