cial pieces of cardboard wedged into
every nook and cranny of the vacuum.
Man, that vacuum is well protected! I
suspect they have a factory just to create
the specialized pieces of cardboard.
I also suspect the savings from avoiding
damage are well worth it.
XML grew out of the document
markup world. It descended from
SGML (Standard Generalized Markup
Language), which was originally
intended to separate the text of a
document from its formatting. XML is very
strongly oriented around letting you
“do your own thing” with the format.
Yet, on top of the flexible “do your own
thing” approach, there are mechanisms
to impose rigor and constraints on XML
documents. XML Schema came into being
in the early 2000s as a means of
ensuring consistency for a set of messages.
A document is valid if it conforms to
an XML schema definition. In this way,
some usages of XML are constrained to
fit a particular shape and form.
One of the wonderful things about
XML and JSON is their flexibility. In
some applications, they support a
tightly prescribed schema much like
the cardboard protecting the vacuum
cleaner. In other applications, they allow
you to toss in all your family goods,
including the kitchen sink. Sometimes,
there is a tightly prescribed schema of
required data while the sender can toss
in extensions to its heart’s content.
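To make the “required data plus extensions” idea concrete, here is a minimal sketch in Python. The field names (order_id, quantity, gift_wrap) and the check_message helper are made up for illustration; they do not come from any real schema or library. The receiver insists on the required fields and simply carries along whatever else the sender tossed in.

```python
import json

# Hypothetical required fields for an order message (assumed for illustration).
REQUIRED_FIELDS = {"order_id": str, "quantity": int}


def check_message(raw):
    """Split a JSON message into (required, extensions).

    Raises ValueError if a required field is missing or has the wrong type.
    Fields the receiver does not know about are tolerated as extensions.
    """
    message = json.loads(raw)
    if not isinstance(message, dict):
        raise ValueError("message must be a JSON object")

    required = {}
    for name, expected_type in REQUIRED_FIELDS.items():
        if name not in message:
            raise ValueError(f"missing required field: {name}")
        if not isinstance(message[name], expected_type):
            raise ValueError(f"field {name} must be {expected_type.__name__}")
        required[name] = message[name]

    # Everything else is the sender's kitchen sink.
    extensions = {k: v for k, v in message.items() if k not in REQUIRED_FIELDS}
    return required, extensions


required, extensions = check_message(
    '{"order_id": "A1", "quantity": 2, "gift_wrap": true}'
)
```

The sender can add gift_wrap today and something else tomorrow without the receiver changing a line, as long as the required fields keep their shape.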
Crossing boundaries. In general,
semi-structured data is used to cross
boundaries in your computing
environment. Documents containing
human-readable stuff are kept on
websites. REST calls are made across
services that may or may not reside within
the same company.
The loose coupling of semi-structured
data allows the sending and
receiving services to evolve separately
with much lower friction. Changing
tightly coupled stuff requires
coordination that is just plain difficult.
Crossing boundaries with key-value
stores. Frequently, semi-structured
data in documents or files is stored in
a file system or a key-value store. It is
valuable to have readers and writers
of these docs/files decoupled in their
metadata. Having the shape and form
of the data described in the contents of
the docs and files makes it possible to
evolve the various users with less
friction than you would see if the metadata
were strict and rigid. This is why we see
the success of semi-structured
representations for stored stuff.
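Here is a small sketch of that decoupling, using a plain Python dict as a stand-in for a key-value store. The keys, field names, and the read_display_name helper are assumptions made up for this example. An older writer stores a single name field, a newer writer stores first_name and last_name plus an extra field, and one reader copes with both because each document describes its own shape.

```python
import json

# In-memory stand-in for a key-value store (an assumption for illustration).
store = {}


def put(key, doc):
    """Store a document as self-describing JSON text."""
    store[key] = json.dumps(doc)


def read_display_name(key):
    """Read a document and build a display name from whatever shape it has."""
    doc = json.loads(store[key])
    if "first_name" in doc and "last_name" in doc:
        # Newer writers split the name into two fields.
        return f'{doc["first_name"]} {doc["last_name"]}'
    # Older writers used a single field; unknown fields are ignored.
    return doc.get("name", "(unknown)")


# An older writer and a newer writer share the same store.
put("user:1", {"name": "Ada Lovelace"})
put("user:2", {"first_name": "Grace", "last_name": "Hopper", "rank": "Rear Admiral"})

names = [read_display_name("user:1"), read_display_name("user:2")]
```

Neither writer had to coordinate with the reader, and the extra rank field costs the reader nothing. With a rigid, shared record layout, the same change would have meant upgrading everybody at once.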
It’s not the size that counts. It turns
out the weight and size of the cardboard
are not that big of a deal. You have surely
had the experience of receiving some
small item such as a computer chip packaged
in a box that weighs a lot more than
the stuff being protected. It makes economic
sense to protect the tiny thing well.
Large e-commerce sites ship tens of
thousands of different things of
different sizes. Still, they find it more
efficient to use a relatively small number
of box sizes. Consequently, it’s common
to open the box and find a tiny
thing and a whole bunch of padding.
Similarly, you shouldn’t be too
worried about the bulkiness of your
files and documents. The embedded
metadata can take a lot of space. Lord
knows, an XML file has a lot of angle
brackets! Still, the value accrued
from the features of semi-structured
data is worth it. As long as the world
doesn’t run out of angle brackets, it
will be all right.
Gotta take care of your stuff! In cardboard,
the safety and care for stuff is
the important reason for its existence.
Similarly, in XML and JSON the safety
and care of the data, both in transit
and in storage, are why we bother.
Now, if only we could figure out efficient recycling for used angle brackets,
we would be good to go …
Pat Helland has been implementing transaction systems,
databases, application platforms, distributed systems,
fault-tolerant systems, and messaging systems since
1978. He currently works at Salesforce.
Copyright held by author/owner.
Publications rights licensed to ACM. $15.00.