lenges scaling beyond a handful of
servers. Still, relational databases
have more than 30 years of investment
in applications, operations, and skills
development that will survive for
many years.
The emerging big-data stores as
represented by MapReduce and Hadoop offer complementary sets of advantages. Leveraging massive file systems with highly available replicated
data, these environments offer hundreds of petabytes of data that may be
addressed in a common namespace.
Recently, updatable key-value stores
have emerged that offer transaction-protected updates [7]. These enormous
systems are optimized for sharing with
multiple users accessing both computational and storage resources in a prioritized fashion.
Increasingly, these benefits of the
big-data environments will be applied
to copies of the relational database
data used within existing applications.
The integration of the line-of-business
relational data with the rest of the enterprise’s data will result in a common
backplane for data.
The line-of-business department in
an enterprise drives the development
of new applications. This is the department that needs the application and
the solutions it provides to meet a business requirement. The department
funds the application and, typically, is
not too concerned with how the application will fit into the rest of the enterprise’s computational work.
The IT department, on the other
hand, has to deal with and operate the
application once it is deployed. It wants
the application, database, and servers
to be on common ground. It needs to integrate the application into enterprise-wide monitoring and management.
This natural tension is similar to
that seen between property developers and the city planning commission.
Developers want to construct and sell
buildings and are not too fussy about
the quality. The city planners have
to ensure the developers consider issues such as neighborhood plans and
whether the sewage-treatment plant
has enough capacity.
As we move to cloud-computing environments
in which the application,
database management system, and
other computing resources are hosted
on a common collection of servers, we
will see higher standards and
expectations for how they tie
into the enterprise.
Conclusion
The constraints in an environment are
what empower the services. A predictable usage
pattern allows for supporting infrastructure and concierge services.
Shared buildings become successful by constraining and standardizing
their usage. Building designers know
how a building will be used, even if they
do not know who will be using it. Not
everyone can accept the constraints,
but for those who do, there are wonderful advantages and services.
The standardization of usage for
computational work will empower the
migration of work to the shared cloud.
With these usage patterns, supporting
services can dramatically lower the
barriers to developing and deploying
applications in the cloud. Lower-level
standards are emerging with VMs.
These support a broad range of applications with less flexibility for sharing. Higher-level PaaS solutions are
nascent but offer many advantages.
We must define and constrain the
usage models for important types of
cloud applications. This will permit enhanced
sharing of resources with important
supporting services. The new
PaaS offerings will bring tremendous
value to the computing world.
Related articles
on queue.acm.org
Commentary: A Trip Without a Roadmap
Peter Christy
http://queue.acm.org/detail.cfm?id=1515746
Fighting Physics: A Tough Battle
Jonathan M. Smith
http://queue.acm.org/detail.cfm?id=1530063
CTO Roundtable: Cloud Computing
January 10, 2009
http://queue.acm.org/detail.cfm?id=1551646
References
1. Amazon Web Services; http://aws.amazon.com/.
2. App Engine; http://code.google.com/appengine/.
3. Armbrust, M., Fox, A., Griffith, R., Joseph, A. D., Katz, R. H., Konwinski, A., Lee, G., Patterson, D. A., Rabkin, A., Stoica, I. and Zaharia, M. Above the clouds: A Berkeley view of cloud computing; http://www.eecs.berkeley.edu/Pubs/TechRpts/2009/EECS-2009-28.pdf.
4. DeCandia, G., Hastorun, D., Jampani, M., Kakulapati, G., Lakshman, A., Pilchin, A., Sivasubramanian, S., Vosshall, P. and Vogels, W. Dynamo: Amazon’s highly available key-value store. ACM Symposium on Operating Systems Principles; http://www.allthingsdistributed.com/files/amazon-dynamo-sosp2007.pdf.
5. Force.com; http://www.force.com/.
6. Lakshman, A. and Malik, P. Cassandra: A decentralized structured storage system. Large-scale Distributed Systems and Middleware; http://www.cs.cornell.edu/projects/ladis2009/papers/lakshman-ladis2009.pdf.
7. Peng, D. and Dabek, F. Large-scale incremental processing using distributed transactions and notifications. In Proceedings of the 9th Usenix Symposium on Operating Systems Design and Implementation (2010); http://research.google.com/pubs/pub36726.html.
8. Riak; http://basho.com/products/riak-overview/.
Pat Helland has worked in distributed systems,
transaction processing, databases, and similar areas since
1978. He was the chief architect of Tandem Computers’
TMF (Transaction Monitoring Facility), which provided
distributed transactions for the NonStop System. At
Microsoft, he served as chief architect for Microsoft
Transaction Server, SQL Service Broker, and a number
of features within Cosmos, the distributed computation
and storage system underlying Bing. He recently joined
Salesforce.com and is leading a number of new efforts
working with very large-scale data.
This paper was written before Helland joined Salesforce.com
and, while there are many similarities, this is not intended
to be a description of Salesforce’s architecture.
© 2013 ACM 0001-0782/13/01