cultural assumptions that may be as inscrutable as those posed to the parents
of teenagers. Similar challenges pervade message and document schema.)
Who’s the dog and who’s the tail?
When two organizations try to communicate, there is always an economic dog and an economic tail. The dog
wags the tail and the tail moves. In
messaging and/or document definition, it is the economic dog that defines the semantics. If there is any
ambiguity, the onus remains on the
economic tail to work it out.
Walmart, for example, is a dominant
force in retailing and dictates many
things to manufacturers that want to
sell through it. In addition to packaging and labeling standards, Walmart
imposes messaging standards for communication with the manufacturers.
Walmart prescribes the meaning, and
the manufacturers adapt.
Schema versus name/value.
Increasingly, schema definition is captured in the “name” of a name/value
pair. This can be seen in the move
from a SQL DDL (which is intended to
be a tight and prescriptive statement
of the meaning of the data) to XML
(which is intended more as the
author’s description of what was written in the message or document).
Name/value pairs (and their hierarchical cousins in XML, JSON, and the
like) are becoming the standards for
data interchange.
We are devolving from schema to
name/value pairs. The transition away
from strict and formal typing is causing a loss of correctness. Bugs that
would have been caught by a stricter
description can now squeeze through.
On the other hand, we are evolving
from tightly defined prescriptive schema to the more adaptable, flexible, and
extensible name/value pairs. In very
large, loosely coupled systems, adaptability and flexibility seem to offer
more value than crispness and clarity.
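The trade-off can be made concrete with a small sketch. The `OrderLine` type, its field names, and the two totaling functions are hypothetical and chosen only for illustration. The first reader insists on a fixed, typed shape; the second accepts name/value pairs and trusts whatever arrives under the name "quantity".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OrderLine:
    """Prescriptive shape: field names and types are fixed up front."""
    sku: str
    quantity: int

    def __post_init__(self):
        # Dataclasses do not enforce annotations, so check them explicitly.
        if not isinstance(self.sku, str):
            raise TypeError("sku must be a str")
        if not isinstance(self.quantity, int):
            raise TypeError("quantity must be an int")


def total_units_strict(lines):
    """Sum quantities over records that already passed the strict checks."""
    return sum(line.quantity for line in lines)


def total_units_loose(messages):
    """Sum quantities over name/value pairs, skipping messages without the name."""
    return sum(m.get("quantity", 0) for m in messages)
```

Given a message whose author misspelled the name as "quantitiy", the loose reader quietly counts zero units for it, while constructing an `OrderLine` from the same pairs raises a `TypeError`. The loose reader keeps working when authors add or rename things; the strict one catches the bug.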
Extensibility: Scribbling in the
margins. Extensibility is the addition
of stuff that was not specified in the
schema. By definition, it is data the
reader did not expect but the sender
wanted to add, anyway. This is much
like scribbling additional instructions
in the margins of a paper form that was
not designed for such additions. Sometimes the person reading the form will
notice these additional instructions,
but sometimes not.
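As a minimal sketch of scribbling in the margins, assume a reader that expects only two hypothetical fields, "name" and "address". Anything else the sender adds is separated out; whether the reader then acts on it is up to the reader.

```python
# Hypothetical set of fields this particular reader was designed for.
KNOWN_FIELDS = {"name", "address"}


def read_form(message):
    """Split a message into the expected fields and the 'margin notes'."""
    expected = {k: v for k, v in message.items() if k in KNOWN_FIELDS}
    margin = {k: v for k, v in message.items() if k not in KNOWN_FIELDS}
    return expected, margin
```

A reader that discards `margin` behaves like the clerk who never looks at the edge of the page; one that logs or forwards it at least gives the extra instructions a chance of being noticed.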
Stereotypes are in the eye of the beholder. A person dresses in a style usually intended to provide information
to strangers. When people began to
transition from living in small villages
(where everyone knew you and knew
what to expect from you) to living in
large cities (where most of the folks you
encounter are strangers), it became important to signal some information to
others by dressing in a particular way.
People dynamically adapt and
evolve their dress to identify their stereotype and community. Some groups
change quickly to maintain elitism
(for example, grunge); others change
slowly to encourage conformity (for example, bankers).
Dynamic and loose typing allows
for adaptability. Schema-less interoperability is not as crisp and correct as
tightly defined schema. It presents
more opportunities for confusion.
When interpreting a message or
document, you must look for patterns
and infer the role of the data. This
works for humans when they examine a stranger’s stereotype and style.
It allows flexibility in data sharing (at the cost of occasional
mistakes).
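Inferring the role of data from its appearance might look something like the sketch below. The role names and regular expressions are assumptions made up for this example, not a recommended set.

```python
import re

# Hypothetical patterns, tried in order; the first full match wins.
PATTERNS = [
    ("date", re.compile(r"\d{4}-\d{2}-\d{2}")),
    ("integer", re.compile(r"-?\d+")),
    ("email", re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+")),
]


def guess_role(value):
    """Guess what a string value is meant to be from its appearance alone."""
    for role, pattern in PATTERNS:
        if pattern.fullmatch(value):
            return role
    return "text"
```

The guess is fast and usually right, but not always: an author who writes a date as "20120501" will have it taken for an integer. That is the cost of judging by appearance.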
Sure and certain knowledge of a
person (or schema) has advantages. Scaling to an infinite number of
friends (or schemas) isn’t possible.
The emerging adaptive schemas for
data are like stereotypes in people. You
can learn a lot quickly (though not
perfectly), and the approach scales to very large numbers of interactions.
Descriptive, not prescriptive schema. In very large and loosely coupled
systems, we see descriptive, not prescriptive schema:
• Descriptive schema. At the time the
data is written, the author describes
what is intended.
• Prescriptive schema. The data is
forced into a fixed format that is consistently shared by all authors.
The larger and more disconnected
the system, the more impractical it
is to maintain a prescriptive schema.
Over time, the attempt at consistency
becomes a fragility that breaks. Natural selection drives extremely large
systems away from consistency and
prescription.
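One way to picture the fragility is a sketch that contrasts a gatekeeper enforcing a shared format with a reader that takes what it understands. The `PRESCRIBED` format and both function names are hypothetical.

```python
# Hypothetical format that every author is supposed to share.
PRESCRIBED = {"sku": str, "quantity": int}


def accept_prescriptive(message):
    """Accept only messages that match the shared format exactly."""
    return (
        message.keys() == PRESCRIBED.keys()
        and all(isinstance(message[k], t) for k, t in PRESCRIBED.items())
    )


def read_descriptive(message, wanted):
    """Take the names this reader understands; leave the rest as described."""
    return {k: message[k] for k in wanted if k in message}
```

When one author among thousands starts sending an extra "gift_wrap" field, `accept_prescriptive` rejects the message until every party agrees on a new format, whereas `read_descriptive` keeps extracting "sku" and "quantity" as before. The more authors there are, the more often that agreement is out of reach.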