have long required practitioners to comply with licensing and regulatory requirements, as well as strict auditing to
help assess damage in case of failure.
Similarly, aircraft (and, more recently,
car) manufacturers have been required
to include black boxes to collect data
for later investigation in case of malfunctions. Cloud providers have begun
wooing customers with enhanced security and compliance certifications,
underlining the increasing need for solutions for secure cloud computation.5
The rest of this article focuses on a key
underpinning of the cloud—the virtualization platform—discussing some
of the technical challenges and recent
progress in achieving trustworthy hosting environments.
Meanwhile in Palo Alto...
Friday, 15:47. Transmogrifica headquarters, Palo Alto…
An executive enters the boardroom
where Robin is already seated.
“Hello, Robin. I hear celebrations
are in order. How much time do we
have?”
“Hey Sasha, just who I was looking
for,” Robin says. “It’s going to be tight.
Andrea was just here, and we thought
we’d buy virtual machines in the cloud
to speed things up. Anything security
would be unhappy about?”
“Well...,” says Sasha, “it isn’t as secure
as in-house. We could be sharing
the system with anyone. Literally
anyone—who might love for us to fail.
Xanadu, for instance, which is sore
about not getting the contract? It’s unlikely,
but it could have nodes on the
same hosts we do and start attacking
us.”
“What would it be able to do?” says
Robin.
“In theory, nothing. The hypervisor
is supposed to protect against all such
attacks. And these guys take their security
seriously; they also have a good
record. Can’t think of anything offhand,
but it’s frustrating how opaque
everything is. We barely know what
system it’s running or if it’s hardened
in any way. Also, we’re completely in
the dark about the rest of the provider’s
security process. Makes it really
difficult to recommend anything one
way or the other.”
“That’s annoying. Anything else I
need to know?”
“Nothing I can think of, though let
me think it through a bit more.”
Trusted Computing Base
The set of hardware and software
components a system’s security depends on is called the system’s trusted
computing base, or TCB. Proponents
of virtualization have argued for the
security of hypervisors through the
“small is secure” argument: hypervisors present a tiny attack surface,
so they must have few bugs and be secure.23, 32, 33 Unfortunately, this argument ignores
the reality that the TCB actually contains
not just the hypervisor but the entire
virtualization platform.
Note the subtle but crucial distinction between “hypervisor” and
“virtualization platform.” Architecturally,
hypervisors form the base of the virtualization platform, responsible for at
least providing CPU multiplexing and
memory isolation and management.
Virtualization platforms as a whole
also provide the other functionality
needed to host virtual machines, including device drivers to interface with
physical hardware, device emulation
to expose virtual devices to VMs, and
a control toolstack to actuate and manage VMs. Some enterprise virtualization platforms (such as Hyper-V and
Xen) rely on a full-fledged commodity
OS running with special privileges for
the functionality, making both the hypervisor and the commodity OS part
of the TCB (see Figure 1). Other virtualization platforms, most notably KVM
and VMware ESXi, include all required
functionality within the hypervisor
itself. KVM is an extension to a full-fledged Linux installation, and ESXi is
a dedicated virtualization kernel that
includes device drivers. In each case,
this additional functionality means the
hypervisor is significantly larger than
the hypervisor component of either
Hyper-V or Xen. Regardless of the exact
architecture of the virtualization platform, it must be trusted in its entirety.
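The point that both architectures end up trusting the same functionality can be illustrated with a minimal sketch. The component names and the helpers `tcb` and `trusted_functionality` are assumptions made for illustration, not an inventory of any real product; the sketch encodes only the architectural split described above.

```python
# Illustrative model only: component names are assumptions, not a real inventory.
HYPERVISOR_CORE = {"cpu_multiplexing", "memory_isolation"}
PLATFORM_SERVICES = {"device_drivers", "device_emulation", "control_toolstack"}

# Which privileged layer hosts the platform services, following the text:
# Hyper-V and Xen rely on a commodity OS; KVM and ESXi fold them into the hypervisor.
SERVICE_HOST = {
    "Xen": "commodity_os",
    "Hyper-V": "commodity_os",
    "KVM": "hypervisor",
    "ESXi": "hypervisor",
}


def tcb(platform):
    """Map each privileged layer of a platform to the functionality it carries."""
    layers = {"hypervisor": set(HYPERVISOR_CORE)}
    host = SERVICE_HOST[platform]
    layers.setdefault(host, set()).update(PLATFORM_SERVICES)
    return layers


def trusted_functionality(platform):
    """Everything a tenant must trust, regardless of which layer hosts it."""
    return set().union(*tcb(platform).values())


if __name__ == "__main__":
    for name in SERVICE_HOST:
        print(name, sorted(tcb(name)), len(trusted_functionality(name)))
```

In this model, where the services live changes the number of privileged layers but not the set of functionality that has to be trusted.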
Figure 2 makes it clear that even the
smallest of the virtualization platforms, ESXi,a
is comparable in size to a stock Linux
kernel (200K LOC vs. 300K LOC). Given
that Linux has seen several privilege-escalation exploits over the years, justifying the security of the virtualization
platform strictly as a function of the
size of the TCB fails to hold up.
a. In 2009, Microsoft released a stripped-down
version of Windows Server 2008 called Server
Core23 for virtualized deployments; while figures concerning its size are still not available
to us, we do not anticipate the virtualization
platform being significantly smaller than ESXi.
A survey of existing attacks on virtualization platforms20, 27, 37, 38 reveals they,
like other large software systems, are
susceptible to exploits due to security
vulnerabilities; the sidebar “Anatomy
of an Attack” describes how an attacker can chain several existing vulnerabilities together into a privilege escalation exploit and bypass the isolation
between virtual machines provided by
the hypervisor.
Reduce Trusted Code?
One major concern with existing virtualization platforms is the size of the
TCB. Some systems reduce TCB size
by “de-privileging” the commodity OS
component; for example, driver-specific domains14 host device drivers in
isolated virtual machines, removing
them from the TCB. Similarly, stub domains30 remove the device emulation
stack from the TCB. Other approaches
completely remove the commodity OS
from the system’s TCB,10, 24 effectively
making the hypervisor the only persistently executing component of the
provider’s software stack a user needs
to trust. The system’s TCB becomes a
single, well-vetted component with significantly fewer moving parts.
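The effect of these de-privileging techniques can be sketched as set subtraction on the TCB. This is a simplified model, not any system's actual design: the component names, the `TECHNIQUES` table, and the `reduce_tcb` helper are all assumptions introduced for illustration.

```python
# Illustrative model only: component names are assumptions made for this sketch.
FULL_TCB = frozenset({
    "hypervisor",
    "commodity_os",
    "device_drivers",
    "device_emulation",
    "control_toolstack",
})

# What each technique named in the text moves out of the TCB.
TECHNIQUES = {
    "driver_domains": {"device_drivers"},
    "stub_domains": {"device_emulation"},
    # Simplification: removing the commodity OS also removes the services it hosts.
    "remove_commodity_os": {
        "commodity_os",
        "device_drivers",
        "device_emulation",
        "control_toolstack",
    },
}


def reduce_tcb(tcb, *techniques):
    """Return the components still trusted after applying the given techniques."""
    remaining = set(tcb)
    for name in techniques:
        remaining -= TECHNIQUES[name]
    return remaining


if __name__ == "__main__":
    print(sorted(reduce_tcb(FULL_TCB, "driver_domains", "stub_domains")))
    print(sorted(reduce_tcb(FULL_TCB, "remove_commodity_os")))
```

Under these assumptions, driver domains and stub domains each peel one component away, while removing the commodity OS leaves the hypervisor as the only trusted component.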
Boot code is one of the most complex and privileged pieces of software.
Not only is it error prone, it is also not
used for much processing once the system has booted. Many legacy devices
that commodity OSes support (such as the
ISA bus and serial ports) are not relevant in multi-tenant deployments like
cloud computing. Modifying the device-emulation stack to eliminate this
complex, privileged boot-time code25
once it has executed significantly reduces the size of the TCB, resulting in
a more trustworthy platform.
Prior to the 2006 introduction of
hardware support for virtualization,
all subsystems had to be virtualized
entirely through software. Virtualizing
the processor requires modification of
any hosted OS, either statically before
booting or dynamically through an on-the-fly process called “binary translation.” When a virtual machine is cre-