Unfortunately, the software had been
designed to meet a requirement that
reverse thrust be disabled unless wheel
pulses were being received (indicating
that the wheels were turning and thus
in contact with the ground). Because of
rain on the runway, the aircraft aquaplaned when it touched down, and the
wheels did not turn, so the software
dutifully disabled reverse thrust and
the aircraft overran the runway. Had
the critical property been expressed
in terms of being on the ground rather
than receiving wheel pulses, the invalid assumption that they were equivalent might have been scrutinized more carefully and the flaw detected. (This account is simplified; for the full incident report, see Ladkin26.)
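The contrast between the two formulations can be sketched in C. All of the names below are invented for illustration and do not come from the actual avionics code; the point is only that the first guard encodes wheel rotation, a proxy for the real critical property of being on the ground:

```c
#include <stdbool.h>

/* Hypothetical sensor inputs (names invented for illustration). */
struct sensors {
    bool wheel_pulses;      /* wheels spinning above a threshold */
    bool weight_on_wheels;  /* landing-gear struts compressed    */
    bool radio_alt_low;     /* radio altimeter reads near zero   */
};

/* As required: reverse thrust enabled only on wheel pulses.
   On a wet runway the wheels aquaplane, no pulses arrive, and
   the interlock wrongly concludes the aircraft is airborne. */
bool reverse_thrust_allowed_v1(const struct sensors *s) {
    return s->wheel_pulses;
}

/* Property stated in terms of being on the ground: combining
   independent evidence exposes, rather than hides, the
   assumption that wheel pulses imply ground contact. */
bool reverse_thrust_allowed_v2(const struct sensors *s) {
    return s->wheel_pulses
        || (s->weight_on_wheels && s->radio_alt_low);
}
```

In the aquaplaning scenario—wheels stationary but weight on wheels and altitude zero—the first guard denies reverse thrust while the second permits it.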
This view of requirements is due
to Michael Jackson23 and has been adopted by Praxis in its REVEAL requirements engineering method.
Specialized variants of the idea have
appeared before, most notably in David Parnas’s Four Variable Model.
The dependability case. The evidence
for dependability takes the form of
a dependability case—an argument
that the software, in concert with other
components, establishes the critical
properties. What exactly constitutes the case—how detailed it should be, and what mix of formal and informal arguments is appropriate—will vary between developments, but certain features are essential.
First, the case should be auditable
so that it can be evaluated by a third-party certifier, independent of both developer and customer. The effort of
checking that the case is sound should
be much less than the effort of building the case in the first place. In this
respect, a dependability case may be
like a formal proof: hard to construct
but easy to check. To evaluate a case,
a certifier should not need any expert
knowledge of the developed system or
of the particular application, although
it would be reasonable to assume expertise in software engineering and familiarity with the domain area.
Second, the case should be complete.
This means that the argument that the
critical properties apply should contain no holes to be filled by the certifier. Any assumptions that are not justified should be noted so that it is clear
to the certifier who will be responsible
for discharging them. For example, the
dependability case may assume that a
compiler generates code correctly; or
that an operating system or middleware platform transports messages
reliably, relying on representations by
the producers of these components
that they provide the required properties; or that users obey some protocol,
relying on the organization that fields
the system to train them appropriately.
For a product that is not designed with
a particular customer in mind, the assumptions become disclaimers, for
example, that an infusion pump may
fail under water or that a file synchronizer will work only if applications do
not subvert file modification dates. Assumptions made merely to simplify the case, and that others can substantiate no more easily, are suspect. Suppose
an analysis of a program written in C,
for example, contains an assumption
that array accesses are within bounds.
If this assumption cannot readily be
checked, the results of the analysis
cannot be trusted.
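To make the point concrete, here is a contrived C sketch (the function names are invented): an analyzer might certify `sum` under the unchecked assumption that `n` never exceeds the array's allocated length; unless every caller can be shown to respect that bound, the certified result carries little weight.

```c
#include <stddef.h>

/* Assumed by the analysis, but not checked here: n does not
   exceed the length of the array a. If a caller violates this,
   a[i] reads out of bounds and the analysis result is void. */
int sum(const int *a, size_t n) {
    int total = 0;
    for (size_t i = 0; i < n; i++)
        total += a[i];   /* in bounds only if the assumption holds */
    return total;
}

/* A defensive variant makes the assumption checkable at the
   interface by carrying the capacity explicitly. */
int sum_checked(const int *a, size_t capacity, size_t n) {
    if (n > capacity)
        return -1;       /* report the violated assumption */
    return sum(a, n);
}
```

The second variant turns an implicit assumption of the analysis into an explicit, dischargeable obligation at the interface—exactly the bookkeeping a dependability case demands.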
Third, the case should be sound. It
should not, for example, claim full correctness of a procedure on the basis of
nonexhaustive testing; or make unwarranted assumptions that certain components fail independently; or reason, in a program written in a language with a weak memory model, that the value read from a shared variable is the value that was last written to it.
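The memory-model pitfall can be sketched with C11 atomics (a minimal illustration with invented names). With plain variables, nothing orders the two stores, so a reader may observe the flag set yet still read stale data; a release store paired with an acquire load restores the guarantee:

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Unsound pattern: with plain (non-atomic) variables, a reader
   that observes plain_ready == true may still see a stale value
   in plain_data, because nothing orders the two stores. */
int  plain_data;
bool plain_ready;

/* Sound pattern: the release store to ready_flag orders the
   preceding store to shared_data; an acquire load that sees
   true is guaranteed to see the written data. */
int         shared_data;
atomic_bool ready_flag;

void publisher(int value) {
    shared_data = value;
    atomic_store_explicit(&ready_flag, true, memory_order_release);
}

/* Returns true and fills *out only once the data is published. */
bool consumer(int *out) {
    if (atomic_load_explicit(&ready_flag, memory_order_acquire)) {
        *out = shared_data;
        return true;
    }
    return false;
}
```

A dependability case that silently assumed the "plain" version behaves like the atomic one would be unsound in exactly the sense described above.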
On the face of it, these recommendations—that developers express the
critical properties and make an explicit
argument that the system satisfies
them—are hardly remarkable. If followed, however, they would have profound implications for how software is
procured, developed, and certified.
Dependability case as product. In
theory, one could construct a dependability case ex post facto, after the entire development has been completed. In practice, however, this would be near impossible and, in any case, undesirable. Constructing the case is easier and more effective if done hand in hand with other development activities,
when the rationale for development
decisions is fresh and readily available.
But there is a far more important reason to consider the dependability case
from the very outset of development.
By focusing on the case, the developer
can make decisions that ease its construction, most notably by designing
the system so that critical properties
are easier to establish. Decoupling and
simplicity, discussed later, offer perhaps the greatest opportunities here.
This is the key respect in which
the direct approach to dependability demands a sea change in attitude.
Rather than just setting in place some
practices or disciplines that are intended to improve dependability, the
developers are called upon, every step
of the way, to consider their decisions
in the light of the system’s dependability and to view the evidence that
these decisions are sound as a work
product that is as integral to the final
system as the code itself.
Procurement. A change to a direct approach affects not only developers but
also procurers, and the goals set at the start must be realistic both to achieve and to demonstrate.
The Federal Aviation Administration
specified three seconds of downtime
per year for the infamous Advanced Automation System for air-traffic control
(which was ultimately canceled after
an expenditure of several billion dollars), even though it would have taken
10 years just to obtain the data to substantiate such a requirement. The requirement was later revised to five minutes of downtime per year.
More fundamentally, however, our
society as a whole needs to recognize
that the enormous benefits of software inevitably bring risks and that
functionality and dependability are
inherently in conflict. If we want more
dependable software, we will need to
stop evaluating software on the basis
of its feature set alone. At the same
time, we should be more demanding,
and less tolerant of poor-quality software. Too often, the users of software
have been taught to blame themselves
for its failures and to absorb the costs
After the failure of the USS
Yorktown’s onboard computer system, in
which the ship’s entire control and
navigation network went down after an
officer calibrating a fuel valve entered
a zero into a database application (in
an attempt to overwrite a bad value
that the system had produced), blame
was initially placed on software. After
April 2009 | Vol. 52 | No. 4 | Communications of the ACM