The Communications Web site, http://cacm.acm.org,
features 13 bloggers in the BLOG@CACM community.
In each issue of Communications, we’ll publish excerpts
from selected posts, plus readers’ comments.
DOI: 10.1145/1562164.1562169
http://cacm.acm.org/blogs/blog-cacm
Saying Good-bye to
DBMSs, Designing
Effective Interfaces
Michael Stonebraker discusses the problems with relational database
management systems and possible solutions, and Jason Hong writes
about interfaces and usable privacy and security.
From Michael Stonebraker’s “The End of a DBMS Era (Might Be Upon Us)”
Relational database management systems (DBMSs) have been remarkably successful in capturing the DBMS marketplace. To a first approximation they are “the only game in town,” and the major
vendors (IBM, Oracle, and Microsoft)
enjoy an overwhelming market share.
They are selling “one size fits all”; i.e.,
a single relational engine appropriate
for all DBMS needs. Moreover, the code
line from all of the major vendors is
quite elderly, in all cases dating from
the 1980s. Hence, the major vendors sell
software that is a quarter century old,
and has been extended and morphed
to meet today’s needs. In my opinion,
these legacy systems are at the end of
their useful life. They deserve to be sent
to the “home for tired software.”
Here’s why.
If we examine the nontrivial-sized
DBMS markets, it turns out that current
relational DBMSs can be beaten
by approximately a factor of 50 in most
any market I can think of. What follows
are a few examples.
In the data warehouse market, a
column store beats a row store by approximately a factor of 50 on typical
business intelligence queries. The
reason is that column stores read
only the columns of interest to the
query, not all of them. In addition,
compression is more effective in a column store. Since the legacy systems
are all row stores, they are vulnerable
to competition from the newer column stores.
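
To make the row-versus-column argument concrete, here is a minimal sketch in Python. The table, its column names, and the "values read" counter are all hypothetical illustrations, not anything taken from a real DBMS, and the toy ratio it produces says nothing about the factor of 50 cited above. It only shows why a query that touches one column does less work against a columnar layout.

```python
# Hypothetical toy table; names and sizes are illustrative assumptions.
rows = [
    {"order_id": i, "customer": f"c{i % 10}",
     "region": "EU" if i % 2 else "US", "amount": float(i)}
    for i in range(1000)
]

def to_columns(rows):
    """Pivot a list of row dicts into one list per column."""
    return {name: [r[name] for r in rows] for name in rows[0]}

def sum_amount_row_store(rows):
    """Row layout: every field of each row is fetched to reach 'amount'."""
    total, values_read = 0.0, 0
    for r in rows:
        values_read += len(r)   # the whole row is read
        total += r["amount"]
    return total, values_read

def sum_amount_column_store(columns):
    """Column layout: only the 'amount' column is fetched."""
    col = columns["amount"]
    return sum(col), len(col)

columns = to_columns(rows)
row_total, row_read = sum_amount_row_store(rows)
col_total, col_read = sum_amount_column_store(columns)
print(row_total == col_total, row_read, col_read)
```

Both functions compute the same sum, but the row-oriented version touches every field of every row, while the column-oriented version touches only the one column the query asks for. The wider the table, the larger that gap becomes.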
In the online transaction processing (OLTP) market, a lightweight main
memory DBMS beats a row store by a
factor of 50. Leveraging main memory
and the fact that no DBMS application
will send a message to a human user
in the middle of a transaction allows
an OLTP DBMS to run transactions to
completion with no resource contention or locking overhead.
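
The run-to-completion idea can be sketched as follows. This is a toy under stated assumptions, not a description of any particular product: the `SerialEngine` class, the `transfer` transaction, and the snapshot-based rollback are all hypothetical names and choices made for illustration. Because all data is in memory and no transaction ever waits on a user, one thread can execute transactions one at a time, so no locks are needed.

```python
from collections import deque

class SerialEngine:
    """Hypothetical single-threaded, in-memory transaction executor."""

    def __init__(self, balances):
        self.balances = dict(balances)
        self.queue = deque()

    def submit(self, txn, *args):
        """Queue a transaction; nothing runs until run() is called."""
        self.queue.append((txn, args))

    def run(self):
        """Execute queued transactions strictly one after another."""
        results = []
        while self.queue:
            txn, args = self.queue.popleft()
            snapshot = dict(self.balances)   # toy undo log (an assumption)
            try:
                txn(self.balances, *args)    # runs to completion, no locks
                results.append(True)
            except ValueError:
                self.balances = snapshot     # abort: restore prior state
                results.append(False)
        return results

def transfer(balances, src, dst, amount):
    """Example transaction: move funds, aborting on insufficient balance."""
    if balances[src] < amount:
        raise ValueError("insufficient funds")
    balances[src] -= amount
    balances[dst] += amount

engine = SerialEngine({"a": 100, "b": 0})
engine.submit(transfer, "a", "b", 70)
engine.submit(transfer, "a", "b", 70)
print(engine.run(), engine.balances)
```

Since only one transaction is ever active, the second transfer sees the committed result of the first and aborts cleanly, with no latching or lock manager involved. A real system would need durability and a way to use multiple cores, which this sketch does not attempt.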
In the science DBMS market, users
have never liked relational DBMSs
and want a non-relational model and
query facility. (This was the topic of my
last CACM blog, “DBMSs for Science
Applications: A Possible Solution.”)
If you are storing Resource Description Framework (RDF) data, which
is popular in the bio community and
elsewhere, then column stores are
very good at certain RDF workloads. In
addition, other ideas, such as RDF-3X,
will beat conventional DBMSs in other
situations. Lastly, native RDF engines
(e.g., Virtuoso, Sesame, and Jena) may
well gain traction. The point is that
something else will beat conventional
row stores in this market.
Text applications have never used
relational DBMSs. This was pointed
out to me most clearly by Eric Brewer
nearly 15 years ago in the early days of
Inktomi. He wanted to use a relational
DBMS to store the results of Web crawling, but found relational DBMSs to be
two orders of magnitude slower than
a home-brew system. All the major
Web-search engines use home-brew
text software to serve us search results.
None use relational DBMSs.
Even in XML, where the current major vendors have spent a great deal of
energy extending their engines,
specialized engines, such
as Mark Logic or Tamino, run circles
around the major vendors, according
to a private communication from Dave
Kellogg.
In summary, one can leverage at
least the following ideas to get superior
performance:
A non-relational data model. If the