the application already has a substrate
that performs heartbeat protocols (or
any other mechanism that notifies the
application or system when a component fails), fail-over, and redundant
communication channels, then you
will want to exclude those components
from the database management system and hook into the existing functionality. Monolithic systems do not allow this, whereas a component-based,
modular architecture does.
In addition to providing smaller,
simpler applications, components with
well-defined, clean, exposed interfaces
provide for a degree of extensibility that
is simply not possible in a monolithic
system. For example, consider the basic set of components needed to construct a transactional system: a transaction manager, a lock manager, and a
log manager. If these modules are open
and extensible, then the developer can
build systems that incorporate items
that are not managed by the database
system into transactions. Consider, for
example, a network switch: the state of
the configuration database depends on
the state of hardware inside the device,
and vice versa. If the electrical control
over chips and boards can be incorporated into transactions, by allowing the
programmer to extend the locking and
logging system to communicate with
them, then operations such as “power
up the backup network interface card”
can be made transactional.
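
As a minimal sketch of this idea, suppose every participant in a transaction exposes the same small interface, so that a piece of hardware can be logged and rolled back just as a table can. All names here (`Resource`, `ConfigTable`, `NetworkCard`, `Transaction`, `activate_backup`) are hypothetical and chosen for illustration; the sketch covers only the undo-logging side and omits locking entirely.

```python
class Resource:
    """Hypothetical interface for anything that can take part in a transaction."""

    def apply(self, op):
        """Perform op and return an undo record."""
        raise NotImplementedError

    def undo(self, record):
        """Reverse a previously applied op using its undo record."""
        raise NotImplementedError


class ConfigTable(Resource):
    """Stands in for the configuration database."""

    def __init__(self):
        self.rows = {}

    def apply(self, op):
        key, new_value = op
        old_value = self.rows.get(key)
        self.rows[key] = new_value
        return (key, old_value)

    def undo(self, record):
        key, old_value = record
        if old_value is None:
            self.rows.pop(key, None)
        else:
            self.rows[key] = old_value


class NetworkCard(Resource):
    """Stands in for hardware that the database does not manage."""

    def __init__(self, fail_on_power_up=False):
        self.powered = False
        self.fail_on_power_up = fail_on_power_up

    def apply(self, op):
        if op == "power_up" and self.fail_on_power_up:
            raise RuntimeError("card did not respond")
        was_powered = self.powered
        self.powered = (op == "power_up")
        return was_powered

    def undo(self, record):
        self.powered = record


class Transaction:
    """Keeps an undo log that spans every resource it touches."""

    def __init__(self):
        self.log = []

    def do(self, resource, op):
        self.log.append((resource, resource.apply(op)))

    def commit(self):
        self.log.clear()

    def abort(self):
        # Undo in reverse order so later changes are rolled back first.
        for resource, record in reversed(self.log):
            resource.undo(record)
        self.log.clear()


def activate_backup(table, card):
    """Update the configuration and power up the card as one unit."""
    txn = Transaction()
    try:
        txn.do(table, ("active_nic", "backup"))
        txn.do(card, "power_up")
        txn.commit()
        return True
    except RuntimeError:
        txn.abort()
        return False
```

If the card fails to power up, the configuration change is rolled back as well, so the database and the hardware never disagree about which interface is active.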
Modularity is a powerful tool for
managing size and complexity of applications and systems while also enabling
the application and data management
capabilities to seamlessly interact.
Thus, we have proposed an architecture that enables developers to exclude
functionality they do not need and to include functionality that they do need but that is
not provided by the database vendor.
Configurability
Illustration by Celia Johnson
The second property of a flexible data
management system is configurability.
Whereas modularity is an architectural
mechanism, configuration is mostly a
runtime mechanism. With a component-based architecture, build-time
configuration consists of selecting the
appropriate components. A single collection of components may still run on
a range of systems with wildly different
capabilities. For example, just because
two applications both want transactions and B-trees, this does not mean
that both can support a multi-gigabyte
in-memory cache. The ability to adapt
to radically different circumstances is
critical. Configurability refers to how
well a system can be matched to its environment and application needs. In
this article we discuss configurability
with respect to the hardware, the environment in which the application runs
(for example, the operating system),
the application’s software architecture,
and the “natural” data format of the application.
Hardware environments introduce
variability in CPU speed, memory size,
and persistent storage capabilities.
Variability in CPU speed and persistent storage introduces the possibility
of trading computation for disk bandwidth. On a fast processor, it may be
beneficial to compress data, consuming CPU cycles, in order to save I/O;
on a PDA, where CPU cycles are scarce
and persistent I/O is fast, compression
might not be the right trade-off.
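
One way to expose this trade-off is as a single configuration switch on the storage layer. The sketch below is illustrative only: `PageStore` and its `compress` flag are hypothetical names, the "disk" is a dictionary, and `bytes_written` simply counts the bytes that would have gone to storage.

```python
import zlib


class PageStore:
    """Toy page store that optionally trades CPU cycles for I/O volume."""

    def __init__(self, compress):
        self.compress = compress   # chosen by the developer per platform
        self.pages = {}            # stands in for persistent storage
        self.bytes_written = 0     # I/O volume the store would generate

    def put(self, page_id, data):
        # Compression costs CPU time here but shrinks what is written.
        payload = zlib.compress(data) if self.compress else data
        self.pages[page_id] = payload
        self.bytes_written += len(payload)

    def get(self, page_id):
        payload = self.pages[page_id]
        return zlib.decompress(payload) if self.compress else payload
```

A server build might construct `PageStore(compress=True)`, while a build for a slow handheld processor might pass `compress=False`; the calling code is the same in both cases.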
In a world where resource-constrained devices require potentially sophisticated data management, developers must have control over the memory
and disk consumption policies of the
database. In different environments,
applications may need control over the
maximum size of in-memory data structures, the maximum size of persistent
data, and the space consumed by transactional logs. Policies for consumption of these resources must be set by
the application developer, not the end
user, since the developer is more likely
to have the technical savvy necessary to
make the right decisions.
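
A sketch of what developer-set limits might look like follows. The names `ResourceLimits` and `BoundedCache` are hypothetical, and for brevity only the in-memory limit is enforced (with least-recently-used eviction, which is an assumption of this sketch); the other two fields are shown only to indicate where limits on persistent data and log space would be declared.

```python
from collections import OrderedDict
from dataclasses import dataclass


@dataclass(frozen=True)
class ResourceLimits:
    """Limits fixed by the application developer, not the end user."""
    max_cache_bytes: int   # in-memory data structures
    max_data_bytes: int    # persistent data (not enforced in this sketch)
    max_log_bytes: int     # transactional logs (not enforced in this sketch)


class BoundedCache:
    """Cache that never holds more than max_cache_bytes of values."""

    def __init__(self, limits):
        self.limits = limits
        self.entries = OrderedDict()   # oldest entry first
        self.used = 0

    def put(self, key, value):
        if len(value) > self.limits.max_cache_bytes:
            raise ValueError("value larger than the whole cache")
        if key in self.entries:
            self.used -= len(self.entries.pop(key))
        # Evict least recently used entries until the new value fits.
        while self.used + len(value) > self.limits.max_cache_bytes:
            _, evicted = self.entries.popitem(last=False)
            self.used -= len(evicted)
        self.entries[key] = value
        self.used += len(value)

    def get(self, key):
        value = self.entries.get(key)
        if value is not None:
            self.entries.move_to_end(key)   # mark as recently used
        return value
```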
Variability in persistent storage
technologies places new demands
on the database engine as well. Not
only must it work well in the presence
of spinning, magnetic storage, but it
should also run well on other media
(for example, flash) with constraints on
behaviors (such as the number of writes
to a particular memory location), and it
may need to run in the absence of any
persistent storage. For example, some
applications want to manage data entirely in main memory, with no persistence; some want to manage data
with full synchronous transactional
guarantees on updates; and some need
something in the middle. Each of these
policies should be implemented by
the same transactional component,
but the database should allow the programmer to control whether or not data
persists across power-down events and
the strictness of any transactional assurances that the system makes to the
end user.
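
The range of policies described above can be sketched as a single commit path whose strictness is a parameter. `Durability` and `CommitLog` are hypothetical names; the three levels shown (memory only, written to the operating system, forced to storage with `os.fsync`) are one plausible reading of "none, full, and something in the middle", not a description of any particular product.

```python
import enum
import os


class Durability(enum.Enum):
    NONE = "none"          # data lives in main memory only
    BUFFERED = "buffered"  # handed to the OS, may be lost on power-down
    SYNC = "sync"          # forced to storage before commit returns


class CommitLog:
    """One commit path serving all three durability policies."""

    def __init__(self, durability, path=None):
        self.durability = durability
        self.records = []      # in-memory copy, kept under every policy
        self.file = None
        if durability is not Durability.NONE:
            if path is None:
                raise ValueError("a log path is required for persistent modes")
            self.file = open(path, "ab")

    def commit(self, record):
        self.records.append(record)
        if self.file is None:
            return
        self.file.write(record + b"\n")
        self.file.flush()                   # push to the operating system
        if self.durability is Durability.SYNC:
            os.fsync(self.file.fileno())    # wait for stable storage

    def close(self):
        if self.file is not None:
            self.file.close()
```

The application selects the policy once, when it opens the log, and the rest of its code calls `commit` in the same way regardless of the choice.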
Although many embedded systems
are now able to use commodity off-the-shelf hardware platforms, many proprietary devices still exist. The ubiquitous data management solution will be
portable to these special-purpose hardware devices. It will also be portable to a
variety of operating systems; the
services available from the operating
system on a mobile telephone handset
are different from those available on a
64-way multiprocessor with gigabytes
of RAM, even if both are running Linux.
If the data management system is to
run everywhere, then it must rely only
on the services common to most oper-