Figure 10. sQL arrow from child to parent.
C
P
Figure 11. cosQL arrow from parent to
child.
C
P
structure of rows; we assumed only
that we could determine a primary or
foreign key to relate two rows.
In the typical relational model we
would further impose the constraint
that rows consist of flat sequences of
scalar values (the so-called First Normal Form, or 1-NF). If we dualize relations in 1-NF, then we get a key-value
model where values consist of either
scalars or keys or collections of scalars
or keys. Surprisingly, this is precisely
the Amazon SimpleDB data model (see
Figure 12).
If we assume that rows in the for-
eign-/primary-key model are simply
blobs and keys are strings, then the
dual key-value model is exactly the
HTML5 key-value storage model:
interface Storage {
readonly attribute unsigned long
length;
getter DOMString key(in unsigned
long index);
getter any getItem(in DOMString
key);
setter creator void setItem(in DOMString key, in any data);
deleter void removeItem(in DOMString key);
void clear();
}
a Little more Category theory
So far we have discussed the basic data
models for SQL and coSQL, but we
have not yet touched upon queries. By
applying a little more category theory
we can show how a single abstraction,
monads and monad comprehensions,
can be used as a unified query language
for both SQL and coSQL.
To talk about queries, we need to
be more precise about what we mean
by collections of values. Pure relation-
al algebra is based on sets of rows,
but actual relational databases use
multisets (bags) or ordered multisets
(permutations). To model collections
abstractly, we look at sets/bags/per-
mutations of rows and apply the cat-
egory theory dictum: “What is the in-
terface that these various collections
of rows implement?” and “How do we
generalize queries based on such an
interface?”
First, let us stick with simple set col-
lections. When we write a SQL query
such as
SELECT F(a,b)
FROM as AS a, bs AS b
WHERE P(a,b)
the SQL compiler translates that pretty
syntax into a relational-algebra expression in terms of selection, projection,
joins, and Cartesian product. As is the
custom in the relational world, the
various operations are denoted using
Greek symbols:
πF(σP(as×bs))
Consequences of the duality between sQL and cosQL.
sQL
Children point to parents
Closed world
Entities have identity (extensional)
necessarily strongly typed
Synchronous (ACId) updates
across multiple rows
Environment coordinates changes
(transactions)
value-based, strong reference
(referentially consistent)
not compositional
Query optimizer
cosQL
Parents point to children
Open world
Environment determines identity (intensional)
Potentially dynamically typed
Asynchronous (BASE) updates within
single values
Entities responsible to react to changes
(eventually consistent)
Computation-based, weak reference
(expect 404)
Compositional
developer/pattern
Figure 12. amazon simpleDB representation of Products collection.
There is no reason to restrict the relational algebra operators to work over
just sets (or bags, or permutations) of
rows. Instead, we can define similar
operators on any collection M<T> of
values of arbitrary type T.
The interface for such abstract collections M<T> is a straightforward generalization of that of sets. It allows us
to create an empty collection using the
constant ∅; create a singleton collection of type M<T> given some value of
type T using the function {_} T→M<T>
(the notation T→M<T> denotes a func-tion/closure/lambda expression that
maps an argument value of type T to
a result collection of type M<T>); and
combine two collections into a larger
collection using the binary operator ∪
(depending on the commutativity and
idempotence of ∪, we obtain the various sorts of collections such as lists,
permutations, bags, and sets):
Title
The Right Stuff
Author
Tom Wolfe
Year
Book
Keyword
hardcover
American
Rating
****
4 stars
∅ Î M<T>
{_} Î T → M<T>
∪ Î M<T>×M<T> → M<T>
Using these constructors, we can
generalize the traditional relational