not reject has been certified as “C code”
no matter how blatantly illegal its contents may be to a language scholar. Fed
this illegal not-C code, a tool’s C front-end will reject it. This problem is the
tool’s problem.
Compounding it (and others), the
person responsible for running the
tool is often not the one punished if the
checked code breaks. (This person also
often doesn’t understand the checked
code or how the tool works.) In particular, since our tool often runs as part of
the nightly build, the build engineer
managing this process is often in charge
of ensuring the tool runs correctly.
Many build engineers have a single concrete metric of success: that all tools terminate with successful exit codes. They
see Coverity’s tool as just another speed
bump in the list of things they must get
through. Guess how receptive they are
to fixing code the “official” compiler accepted but the tool rejected with a parse
error? This lack of interest generally extends to any aspect of the tool for which
they are responsible.
Many (all?) compilers diverge from
the standard. Compilers have bugs. Or
are very old. Written by people who misunderstand the specification (not just
for C++). Or have numerous extensions.
The mere presence of these divergences
causes the code they allow to appear.
If a compiler accepts construct X, then
given enough programmers and code,
eventually X is typed, not rejected, then
encased in the code base, where the
static tool will, not helpfully, flag it as a
parse error.
The tool can’t simply ignore divergent code, since significant markets
are awash in it. For example, one enormous software company once viewed
conformance as a competitive disadvantage, since it would let others make
tools usable in lieu of its own. Embedded software companies make great
tool customers, given the bug aversion
of their customers; users don’t like it if
their cars (or even their toasters) crash.
Unfortunately, the space constraints in
such systems and their tight coupling
to hardware have led to an astonishing
oeuvre of enthusiastically used compiler extensions.
Finally, in safety-critical software
systems, changing the compiler often
requires costly re-certification. Thus,
we routinely see the use of decades-old
compilers. While the languages
these compilers accept have interesting
features, strong concordance with
a modern language standard is not one
of them. Age begets new problems.
Realistically, diagnosing a compiler’s
divergences requires having a copy of
the compiler. How do you purchase a
license for a compiler 20 versions old?
Or whose company has gone out of
business? Not through normal channels.
We have literally resorted to buying
copies off eBay.
// “redefinition of parameter ’a’”
void foo(int a, int a);
The programmer names foo’s first
formal parameter a and, in a form of
lexical locality, the second as well.
Harmless. But any conformant compiler will reject this code. Our tool certainly did. This is not helpful; compiling no files means finding no bugs, and
people don’t need your tool for that.
And, because its compiler accepted it,
the potential customer blamed us.
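One way a front-end might cope with this case is to downgrade the diagnostic rather than abandon the file. The following is a minimal sketch under our own assumptions, not a description of any real tool: the names `diag_severity`, `first_duplicate_param`, and `check_param_names`, and the notion of a "lenient" mode, are all hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical severity levels; the names are illustrative only. */
enum diag_severity { DIAG_NONE, DIAG_WARNING, DIAG_ERROR };

/* Return the index of the first parameter whose name repeats an earlier
 * one, or -1 if all names are distinct. Unnamed parameters (NULL) are
 * ignored. */
static int first_duplicate_param(const char *const names[], size_t count)
{
    for (size_t i = 1; i < count; i++) {
        if (names[i] == NULL)
            continue;
        for (size_t j = 0; j < i; j++) {
            if (names[j] != NULL && strcmp(names[i], names[j]) == 0)
                return (int)i;
        }
    }
    return -1;
}

/* Decide how loudly to complain. In strict mode a duplicate is an error,
 * as a conformant compiler would report; in lenient mode (an assumption
 * of this sketch) it is only a warning, so the file can still be
 * analyzed. */
static enum diag_severity check_param_names(const char *const names[],
                                            size_t count, int lenient)
{
    if (first_duplicate_param(names, count) < 0)
        return DIAG_NONE;
    return lenient ? DIAG_WARNING : DIAG_ERROR;
}
```

For `void foo(int a, int a);` the name list is `{"a", "a"}`: strict mode yields `DIAG_ERROR`, lenient mode yields `DIAG_WARNING`, and the rest of the file remains available for checking.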
Here’s an opposite, less-harmless
case where the programmer is trying to
make two different things the same:
typedef char int;
(“Useless type name in empty declaration.”)
And one where readability trumps
the language spec:
unsigned x = 0xdead_beef;
(“Invalid suffix ‘_beef’ on integer
constant.”)
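A scanner that wanted to accept such literals instead of rejecting the file could treat the underscore as a digit separator. This is only a sketch under stated assumptions: `parse_hex_lenient` is a hypothetical helper, the 32-bit limit is our choice, and we do not know which separator rules the original compiler actually applied.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stdint.h>

/* Parse a hexadecimal literal such as "0xdead_beef", treating '_' between
 * digits as a separator (an assumed extension, not standard C).
 * Returns true and stores the value on success; returns false on
 * malformed input or on more than eight hex digits. */
static bool parse_hex_lenient(const char *text, uint32_t *value)
{
    if (text[0] != '0' || (text[1] != 'x' && text[1] != 'X'))
        return false;

    uint32_t result = 0;
    int digits = 0;
    bool prev_was_digit = false;

    for (const char *p = text + 2; *p != '\0'; p++) {
        unsigned char c = (unsigned char)*p;
        if (c == '_') {
            /* A separator must directly follow a digit. */
            if (!prev_was_digit)
                return false;
            prev_was_digit = false;
            continue;
        }
        if (!isxdigit(c))
            return false;
        if (++digits > 8)
            return false;   /* too many digits for a 32-bit value */
        unsigned digit = isdigit(c) ? (unsigned)(c - '0')
                                    : (unsigned)(tolower(c) - 'a') + 10u;
        result = (result << 4) | digit;
        prev_was_digit = true;
    }
    /* Reject a bare "0x" and a trailing separator. */
    if (digits == 0 || !prev_was_digit)
        return false;
    *value = result;
    return true;
}
```

With this helper, `0xdead_beef` parses to the same value as `0xdeadbeef`, while inputs such as `0x_beef` or `0xdead_` are still rejected.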
From the embedded space, creating
a label that takes no space:
void x;
(“Storage size of ‘x’ is not known.”)
Another embedded example that
controls where the space comes from:
unsigned x @ "text";
(“Stray ‘@’ in program.”)
A more advanced case of a nonstandard construct is:
Int16 ErrSetJump(ErrJumpBuf buf)
= { 0x4E40 + 15, 0xA085; }
It treats the hexadecimal values of
machine-code instructions as program
source.
The award for most widely used extension should, perhaps, go to Microsoft support for precompiled headers.
Among the most nettlesome troubles
is that the compiler skips all the text
before an inclusion of a precompiled
header. The implication of this behavior is that the following code can be
compiled without complaint:
I can put whatever I want here.
It doesn’t have to compile.
If your compiler gives an error,
it sucks.
#include <some-precompiled-header.h>
Microsoft’s on-the-fly header fabrication makes things worse.
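A front-end that wants to mimic the skip-everything-before-the-include behavior described above has to locate the precompiled header's inclusion and start parsing there. The sketch below is a rough illustration under our own assumptions: `skip_to_pch_include` is a hypothetical helper, the caller is assumed to know the header name already, and it does only simple line-based matching with no handling of comments, macros, or line continuations.

```c
#include <stddef.h>
#include <string.h>

/* Return a pointer to the start of the first line of the form
 *     #include <header>   or   #include "header"
 * in `source`, or NULL if there is no such line. Everything before the
 * returned pointer is text the front-end would skip unparsed. */
static const char *skip_to_pch_include(const char *source, const char *header)
{
    size_t header_len = strlen(header);
    const char *line = source;

    while (line != NULL && *line != '\0') {
        const char *p = line;
        while (*p == ' ' || *p == '\t')
            p++;
        if (*p == '#') {
            p++;
            while (*p == ' ' || *p == '\t')
                p++;
            if (strncmp(p, "include", 7) == 0) {
                p += 7;
                while (*p == ' ' || *p == '\t')
                    p++;
                if ((*p == '<' || *p == '"') &&
                    strncmp(p + 1, header, header_len) == 0) {
                    char close = (*p == '<') ? '>' : '"';
                    if (p[1 + header_len] == close)
                        return line;
                }
            }
        }
        /* Advance to the character after the next newline, if any. */
        const char *newline = strchr(line, '\n');
        line = (newline != NULL) ? newline + 1 : NULL;
    }
    return NULL;
}
```

Applied to the example above, the function returns a pointer to the `#include` line, so the free-form text before it never reaches the parser.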
Assembly is the most consistently
troublesome construct. It’s already
non-portable, so compilers seem to
almost deliberately use weird syntax, making it difficult to handle in a
general way. Unfortunately, if a programmer uses assembly it’s probably
to write a widely used function, and
if the programmer does it, the most
likely place to put it is in a widely used