By David Chiu
Take a second to consider all the essential services and utilities we consume and pay for on a usage basis: water, gas, electricity. Decades ago, some suggested that computing be treated under the same model as these other utilities.
The case could certainly be made. For instance, a company that supports its own computing infrastructure may suffer from the costs of equipment, labor, maintenance, and mounting energy bills. It would be more cost-effective if the company paid some third-party provider for its storage and processing requirements based on time and usage.
While it made perfect sense from the client’s perspective, the overhead of becoming a computing-as-a-utility provider was prohibitive until recently. Through advancements in virtualization and the ability to leverage existing supercomputing capacities, utility computing is finally being realized. Known to most as cloud computing, this model is already being offered to the mainstream by leaders such as Amazon Elastic Compute Cloud (EC2), Azure, Cloudera, and Google’s App Engine.
A simple but interesting property of utility models is elasticity: the ability to stretch and contract services directly according to the consumer’s needs.
Elasticity has become an essential expectation of all utility providers. When’s the last time you
plugged in a toaster oven and worried about it not working because the power company might have
run out of power? Sure, it’s one more device that sucks up power, but you’re willing to eat the cost.
Likewise, if you switch to using a more efficient refrigerator, you would expect the provider to charge
you less on your next billing cycle.
What elasticity means to cloud users is that they should design their applications to scale their
resource requirements up and down whenever possible. However, this is not as easy as plugging or
unplugging a toaster oven.
A Departure from Fixed Provisioning
Consider an imaginary application provided by my university, Ohio State. Over the course of a day, this application requires 100 servers during peak time but only a small fraction of that during down time. Without elasticity, Ohio State has two options: provision a fixed fleet of 100 servers, or provision fewer than 100.
While the former case, known as over-provisioning, can handle peak loads, it also wastes servers during down time. The latter case, under-provisioning, reduces the number of idle machines to some extent; however, its inability to handle peak loads may drive users away from the service.
By designing our applications to scale the number of servers according to the load, the cloud offers a departure from the fixed provisioning scheme.
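The savings can be made concrete with a little arithmetic. The sketch below uses a hypothetical daily load curve (the hourly server counts are invented for illustration, not taken from any real Ohio State workload) to compare the server-hours consumed by a fixed fleet of 100 machines against an elastic fleet that tracks demand:

```python
# Hypothetical number of servers actually needed each hour of a day:
# quiet overnight, a peak of 100 during business hours, moderate evenings.
hourly_demand = [5] * 8 + [100] * 8 + [20] * 8  # 24 hourly samples

# Over-provisioning: keep 100 servers running around the clock.
fixed_server_hours = 100 * 24

# Elastic provisioning: run only as many servers as the load requires.
elastic_server_hours = sum(hourly_demand)

print(fixed_server_hours)    # 2400 server-hours per day
print(elastic_server_hours)  # 1000 server-hours per day
```

Under this (made-up) load curve, elasticity cuts consumption by more than half, and the unused capacity returns to the provider’s pool rather than sitting idle in a single customer’s machine room.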
To provide an elastic model of computing, providers must be able to support the illusion of an unlimited pool of resources. Because computing resources are unequivocally finite, is elasticity a reality?
In the past several years, we have seen a new trend in processor development. CPUs are now shipped with multiple or even many cores per chip in an effort to continue the speed-ups predicted by Moore’s Law. However, these additional cores (and often even a single core) are underutilized or left completely idle.
Spring 2010/ Vol. 16, No. 3