pens. We might simply “monkey test”
by firing all manner of random data at
the system. In all these cases we do not
know a priori what will happen. We are
looking for something, but we do not
quite know what it is.
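As a rough illustration of the "monkey test" idea, here is a minimal sketch that fires seeded random strings at a system and records whatever raises. The names `random_input`, `monkey_test`, and `parse_age` are hypothetical, invented for this example; `parse_age` merely stands in for a real system under test.

```python
import random
import string


def random_input(rng, max_len=20):
    """Build one random printable string (possibly empty)."""
    length = rng.randint(0, max_len)
    return "".join(rng.choice(string.printable) for _ in range(length))


def monkey_test(system, trials=1000, seed=0):
    """Fire random strings at `system` and collect every input that raises.

    Returns a list of (input, exception) pairs. An empty list means this
    run exposed nothing, not that the system is defect-free.
    """
    rng = random.Random(seed)  # seeded so a surprising failure can be replayed
    surprises = []
    for _ in range(trials):
        data = random_input(rng)
        try:
            system(data)
        except Exception as exc:  # we do not know in advance what to expect
            surprises.append((data, exc))
    return surprises


def parse_age(text):
    """Hypothetical system under test: parse an integer age from text."""
    return int(text)  # raises ValueError on most random input
```

Calling `monkey_test(parse_age, trials=200, seed=1)` returns the inputs that raised along with their exceptions. Deciding which of those surprises are real defects is still left to the tester.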
Not Testing for 1OI
First Order Ignorance (1OI) occurs
when we do not know something but we
are fully aware of what we do not know.
We should never test for 1OI. If we truly
knew in advance what we did not know
and where the limitations of our system’s knowledge lie, we would resolve
our ignorance first, incorporate what we
have learned into the system, and then
conduct a 0OI clean test to prove it.
A Successful Test
Here we see the dichotomy of testing:
for 0OI a “successful test” does not
expose any new knowledge; it simply
proves our existing knowledge is correct. On the other hand, a 2OI test does
expose new knowledge, but usually
only if it “fails.” The two yardsticks for
success in testing, passing tests and failing
tests, are focused on these two different
targets. This is partly where the tension
between exposing and not exposing defects comes from. While having defects
in our system is clearly a bad thing,
finding them in testing (versus not
finding them) is equally clearly a good
thing. As long as there aren’t too many.
How Much of a Good Thing?
For our 0OI testing, 100% passing is the
goal. Any test that “fails” indicates the
presence of 2OI in the system (or in the
test setup, or possibly sloppiness in
testing, which is a different, process-related kind of failure: Third Order Ignorance).
For 0OI testing, the ideal situation is
that every bit of knowledge we baked
into our system is correct and the successful tests simply prove it.
But what about the 2OI tests? Logic
would suggest that a set of 2OI tests
that exposed no defects at all would not
be a good test run since nothing new
would be learned.[a]
Footnote a: Information theory does assert that the knowledge content of a system is increased by a 2OI test that “passes”; specifically, it assures that the system will not throw an error under the specific conditions the test employed and provides some assurance for certain related tests.
It is possible that a test run that exposed no defects at
all shows the system is very, very good
and there are no lapses in the knowledge that we built into the system. But
this is unusual and most testers would
be very suspicious if they saw no defects at all, especially early in a testing
cycle. Logic aside, emotion would suggest that a set of 2OI tests that exposed
100% errors would also not be a “good”
test run. While finding such a huge
number of errors might be better than
not finding them, it indicates either a
criminally poor system or a criminally
poor test setup. In the second case, the
knowledge we acquire by testing relates
to how we conduct tests, which might be
easily learned and fixed. In the poor system case our ignorance is in the system
being tested and it may indicate an awful lot of rework in the requirements,
design, and coding of the system. This
is not an effective use of testing. Indeed, it might be that we are actually using testing as a (very inefficient) way to
gather requirements, design, and code
the system since the original processes
clearly did not work.
So if 0% defect detection is too low,
and 100% defect detection is too high,
what is the right number? Well, it
would be somewhere between 0% and
100%, right? To find where this sweet
spot of defect detection is, we need to
look back to 1928.
Transmission of Information
In 1928, Ralph Hartley, a telephone
engineer at Bell Labs, identified a
mechanism to calculate the theoretical
information content of a signal.[3] If we
think of the results of a test run as signals
containing information about the system that are
transmitted from the system under test to the tester,
at what point is this information maximized?
Hartley showed the information content of a signal is
proportional to the logarithm of the probability that
the event occurs. Viewing a test result as a simple
binary, where a test throws a defect (“failure”) or a
test does not throw a defect (“success”), the
information content returned is given by the equation
shown in Figure 1, where Pf = probability of failure
(error is detected); Ps = probability of success
(error is not detected).[4]
Footnote a (continued): However, since the possible 2OI test set is functionally infinite, this assurance is not strong.
The graph of this function is shown in Figure 2. In the simple binary view the
maximum amount of information is
returned when tests have a 50% probability of success (no error thrown). At
that point, of course, they also have a
50% probability of failure.
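Figure 1 is not reproduced in this text, so the following is only a sketch under an assumption: that the equation has the standard two-outcome information (entropy) form, I = -(Pf log2 Pf + Ps log2 Ps) with Ps = 1 - Pf. The function name `information_content` is invented for illustration. Under that assumption the result is measured in bits, with the negative sign making it non-negative, and the peak at a 50% failure rate can be checked numerically.

```python
import math


def information_content(p_fail):
    """Expected information (in bits) from one pass/fail test result.

    Assumed form: I = -(Pf*log2(Pf) + Ps*log2(Ps)), with Ps = 1 - Pf.
    """
    if not 0.0 <= p_fail <= 1.0:
        raise ValueError("p_fail must be a probability between 0 and 1")
    p_success = 1.0 - p_fail
    total = 0.0
    for p in (p_fail, p_success):
        if p > 0.0:  # the p*log2(p) term tends to 0 as p -> 0
            total -= p * math.log2(p)
    return total


if __name__ == "__main__":
    # Tabulate the curve; it rises to a single peak and is symmetric around 0.5.
    for p_fail in (0.0, 0.1, 0.25, 0.5, 0.75, 0.9, 1.0):
        bits = information_content(p_fail)
        print(f"Pf = {p_fail:4.2f}  information = {bits:.3f} bits")
```

At Pf = 0.5 the function returns exactly 1 bit per test, and at Pf = 0 or Pf = 1 it returns 0. This matches the argument that a test run exposing no defects, or nothing but defects, tells us little.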
Managing Test Complexity
This gives testers a metric by which to
design test suites. If a set of tests does
not return enough defects we should
increase the test complexity until we
see about half the tests throw errors. We
would generally do this by increasing
the number of variables tested between
tests and across the test set. Contrariwise, if we see that more than 50% of
the test cases expose defects, we should
back off the complexity until the failure
rate drops to the optimal level.
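The adjustment rule above could be sketched roughly as follows. The function names, the result encoding (True for a test that exposed a defect), and the `tolerance` dead band are all assumptions made for illustration; the text specifies only the 50% target.

```python
def failure_rate(results):
    """Fraction of tests that exposed a defect; `results` holds True per failure."""
    if not results:
        raise ValueError("need at least one test result")
    return sum(1 for failed in results if failed) / len(results)


def recommend_adjustment(results, target=0.5, tolerance=0.1):
    """Suggest how to change test complexity for the next run.

    `target` is the 50% failure rate argued for in the text; `tolerance`
    is an assumed dead band so small fluctuations do not trigger changes.
    """
    rate = failure_rate(results)
    if rate < target - tolerance:
        return "increase complexity"  # too few defects exposed
    if rate > target + tolerance:
        return "decrease complexity"  # too many defects exposed
    return "hold complexity"
```

For example, a run with one failure in ten tests yields "increase complexity", while eight failures in ten yields "decrease complexity". How much to change the number of variables per test is a judgment this sketch leaves open.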
This optimization is ideal for
knowledge-acquiring (2OI) tests. For knowledge-proving
(0OI) tests, the ideal is a 100%
pass rate. The problem is, we do not
know in advance that (what we think
is) a 0OI test won’t expose something
we were not expecting. And sometimes
what is exposed in a 0OI test is really
important, especially since we weren’t
expecting it. Still, as we migrate testing from discovery to proof we should
expect that the pass rate will shift
from 50% to 100%. How this should
happen is a story for another day.
I Knew That
I showed this concept to a tester friend
of mine who has spent decades testing
systems. His response: “I knew that,”
he said. “I mean, if no errors is bad
and all errors is bad, of course a good
answer is some errors in the middle
and 50% is in the middle, right? I don’t
need an 80-year-old logarithmic formula
derived from telegraphy information
theory to tell me that.”
Hmm, it seems the unconscious art
of software testing is alive and well.
References
1. Armour, P.G. The Laws of Software Process. Auerbach
Publishers, Boca Raton, FL, 2004, 7–10.
2. Armour, P.G. The unconscious art of software testing.
Commun. ACM 48, 1 (Jan. 2004).
3. Hartley, R.V.L. Transmission of information. Bell
Systems Technical Journal, 1928.
4. Reinertsen, D.G. The Principles of Product
Development Flow. Celeritas Publishing, Redondo
Beach, CA, 2009, 93.
Phillip G. Armour (armour@corvusintl.com) is a senior
consultant at Corvus International Inc., Deer Park, IL,
and a consultant at QSM Inc., McLean, VA.