maintenance: traditional, never, discrete, and continuous—or, perhaps,
war, famine, plague, and death. In any
case, 3.5 of them are terrible ideas.
Traditional (or “everyone’s first
project”). This one is easy: don’t even
think about the possibility of maintenance. Hard-code constants, avoid
subroutines, use all global variables,
use short and non-meaningful variable names. In other words, make it
difficult to change any one thing without changing everything. Everyone
knows examples of this approach—
and the PHBs who thoughtlessly push
you into it, usually because of schedule pressures.
Trying to maintain this kind of software is like fighting a war. The enemy
fights back! It particularly fights back
when you have to change interfaces,
and you find you’ve only changed
some of the copies.
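A hypothetical fragment in this style (the names, sizes, and offsets below are invented for illustration) shows why: the record layout is repeated, as bare numbers, at every point of use, so no single change is ever local.

#include <stdio.h>
#include <string.h>

/* Hypothetical "everyone's first project" code: everything global,
   every size a magic number, every function repeating the layout. */
char b[80][134];            /* 134 bytes per record; many functions "know" this */
int n;                      /* how many records of b are in use */

void pr(int i)
{
    char c[9];
    memcpy(c, b[i] + 4, 8); /* "city" lives at offset 4, length 8 -- again, everyone knows */
    c[8] = '\0';
    printf("%.4s %s\n", b[i], c);   /* the first 4 bytes are the "state" */
}

Change the record layout and every such offset and size has to be found and fixed by hand.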
Never. The second approach is to
decide upfront that maintenance will
never occur. You simply write wonderful programs right from the start. This
is actually credible in some embedded
systems, which will be burned to ROM
and never changed. Toasters, video
games, and cruise missiles come to
mind.
All you have to do is design perfect specifications and interfaces,
and never change them. Change only
the implementation, and then only
for bug fixes before the product is
released. The code quality is wildly
better than it is for the traditional approach, but never quite good enough
to avoid change completely.
Even for very simple embedded systems, the specification and designs
aren’t quite good enough, so in practice the specification is frozen while
it’s still faulty. This is often because it
cannot be validated, so you can’t tell if
it’s faulty until too late. Then the specification is not adhered to when code
is written, so you can’t prove the program follows the specification, much
less prove it’s correct. So, you test until the program is late, and then ship.
Some months later you replace it as a
complete entity, by sending out new
ROMs. This is the typical history of
video games, washing machines, and
embedded systems from the U.S. Department of Defense.
Discrete. The discrete change approach is the current state of practice: define hard-and-fast, highly
configuration-controlled interfaces
to elements of software, and regularly
carry out massive all-at-once changes.
Next, ship an entire new copy of the
program, or a “patch” that silently
replaces entire executables and libraries. (As we write this, a new copy
of Open Office is asking us please to
download it.)
In theory, the process accepts (reluctantly) the fact of change, keeps a
parts list and tools list on every item,
allows only preauthorized changes
under strict configuration control,
and forces all servers’/users’ changes
to take place in one discrete step. In
practice, the program is running multiple places, and each must kick off
its users, do the upgrade, and then let
them back on again. Change happens
more often and in more places than
predicted, all the components of an item are not recorded, and patching is alive (and, unfortunately, thriving) because of the time lag for authorization and the rebuild time for the system.

Real-world structure for managing interface changes.

struct item_loc_t {
    struct {
        unsigned short major;   /* = 1 */
        unsigned short minor;   /* = 0 */
    } version;
    unsigned part_no;
    unsigned quantity;
    struct location_t {
        char state[4];
        char city[8];
        unsigned warehouse;
        short area;
        short pigeonhole;
    } location;
    /* ... */
};
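The version fields at the top of the structure in the listing are what make incremental change survivable: code that receives a record can check them before assuming a particular layout. A minimal sketch of such a check (the function name and the version policy here are our assumptions, not part of the original listing):

/* Hypothetical reader that refuses to guess about unknown layouts. */
int read_item(const struct item_loc_t *item)
{
    if (item->version.major != 1)
        return -1;  /* incompatible layout: reject or hand off to other code */
    /* minor bumps are assumed to only append fields, so the ones below are safe */
    /* ... use item->part_no, item->quantity, item->location ... */
    return 0;
}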
Furthermore, while official interfaces are controlled, unofficial interfaces proliferate; and with C and
older languages, data structures are
so available that even when change is
desired, too many functions “know”
that the structure has a particular
layout. When you change the data
structure, some program or library
that you didn’t even know existed
starts to crash or return ENOTSUP.
A mismatch between an older Linux
kernel and newer glibc once had
getuid returning “Operation not
supported,” much to the surprise of
the recipients.
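The failure mode is mechanical and easy to reproduce in miniature. In the hypothetical fragment below (the structures and field names are invented for illustration), inserting a field moves every later field, so any program still compiled against the old header quietly reads the wrong bytes:

#include <stddef.h>
#include <stdio.h>

/* The layout a library shipped last year... */
struct loc_v1 { unsigned warehouse; short area; short pigeonhole; };

/* ...and the layout after someone "improved" it. */
struct loc_v2 { unsigned warehouse; short aisle; short area; short pigeonhole; };

int main(void)
{
    /* Code built against v1 still reads 'area' at its old offset;
       in the new layout those bytes now hold 'aisle'. */
    printf("area: old offset %zu, new offset %zu\n",
           offsetof(struct loc_v1, area), offsetof(struct loc_v2, area));
    return 0;
}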
Experience shows that it is completely unrealistic to expect that all users to whom an interface is visible will be able to change at the same time. The
result is that single-step changes cannot happen: multiple change interrelationships conflict, networks mean
multiple versions are simultaneously
current, and owners/users want to
control change dates.
Vendors try to force discrete changes, but the changes actually spread
through a population of computers
in a wave over time. This is often likened to a plague, and is every bit as
popular.
Customers use a variant of the
“never” approach to software maintenance against the vendors of these
plagues: they build a known working configuration, then “freeze and
forget.” When an update is required,
they build a completely new system
from the ground up and freeze it. This
works unless you get an urgent security patch, at which time you either
ignore it or start a large unscheduled
rebuild project.
Continuous change. At first, this approach to maintenance sounds like
just running new code willy-nilly and
watching what happens. We know at
least one company that does just that:
a newly logged-on user will unknowingly be running different code from
everyone else. If it doesn’t work, the user will either crash or be kicked off by the sysadmin, and will then have to log back on and repeat the work using the previous version.