CryptDB: Processing Queries
on an Encrypted Database
Theft of private information is a significant problem for
online applications. For example, a recent investigation
found that at least eight million people’s medical records
were stolen as a result of data breaches between 2009
and 2011 [13], and in a recent attack on the Sony PlayStation
Network, attackers apparently gained access to about 77
million personal user profiles, some of which included
credit card information [20]. Such large-scale data thefts
make the popular press, but smaller-scale compromises
occur on a nearly daily basis, according to organizations
devoted to studying consumer and data privacy (e.g., the
Privacy Rights Clearinghouse).
Sensitive data can leak from online data repositories for a
variety of reasons: an adversary can exploit software vulnerabilities
to gain unauthorized access to servers [15]; curious or
malicious administrators at a hosting provider can snoop on
private data [3]; and attackers with physical access to servers
can steal data from disk and memory [11].
One approach to reduce the damage caused by server
compromises is to encrypt all sensitive data stored on the
servers. However, many important applications, including database-backed Web services that process SQL queries, as well as analytic applications that compute results
over large quantities of data, require servers to not just
store data, but also perform computations on the data.
One solution could be to store the data encrypted at
the server, but to perform all computation at a trusted
client on plaintext by downloading and decrypting all
needed data for every computation; this approach, however, is usually untenable because there might be too
much data to move around, or because clients may have
significantly fewer computation or storage resources than
the server.
An ideal way to satisfy the dual goals of protecting data confidentiality and running computations is to
enable a server to compute over encrypted data, without
the server ever decrypting the data to plaintext. The server
would produce results in an encrypted form, decryptable
only by a trusted client. This approach would preserve the
architecture of running much of the application’s computation at the server.
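As a rough illustration of this client/server split, the sketch below lets an untrusted server add up values it cannot read, using a toy version of the Paillier additively homomorphic scheme. The choice of Paillier, the tiny hard-coded primes, and the names keygen, encrypt, decrypt, and server_sum are assumptions made for this example only; they are not taken from the text above and are far too weak for real use.

```python
import math
import secrets

def keygen(p=1789, q=1861):
    """Toy Paillier key pair; these tiny primes are for illustration only."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p - 1, q - 1)
    mu = pow(lam, -1, n)  # valid because the generator is fixed to g = n + 1
    return n, (lam, mu)

def encrypt(n, m):
    """Trusted client: randomized encryption of an integer 0 <= m < n."""
    n2 = n * n
    while True:
        r = secrets.randbelow(n - 1) + 1
        if math.gcd(r, n) == 1:
            break
    # With g = n + 1, g^m mod n^2 equals 1 + m*n.
    return (1 + m * n) * pow(r, n, n2) % n2

def decrypt(n, secret, c):
    """Trusted client: recover the plaintext using the secret key."""
    lam, mu = secret
    u = pow(c, lam, n * n)
    return (u - 1) // n * mu % n

def server_sum(n, ciphertexts):
    """Untrusted server: multiplying ciphertexts adds the hidden plaintexts."""
    n2 = n * n
    total = 1  # 1 is a valid (unrandomized) encryption of 0
    for c in ciphertexts:
        total = total * c % n2
    return total

# The client encrypts, the server aggregates, the client decrypts the result.
n, secret = keygen()
salaries = [52000, 61000, 48000]  # hypothetical data; the sum must stay below n
encrypted_column = [encrypt(n, s) for s in salaries]
encrypted_total = server_sum(n, encrypted_column)
assert decrypt(n, secret, encrypted_total) == sum(salaries)
```

The server only ever multiplies ciphertexts modulo n^2, and only the holder of the secret key can read the total. This scheme supports addition only, so it falls well short of computing arbitrary functions.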
Theoretical approaches such as fully homomorphic
encryption [7] enable the server to compute arbitrary functions
over encrypted data, while providing excellent confidentiality guarantees. But despite good progress in recent years,
these schemes remain many orders of magnitude slower
than equivalent plaintext computations (e.g., computing
the decryption circuit for AES, the Advanced Encryption
Standard [6], is at least 10^9 times slower [8]).
We introduce CryptDB, a practical system that explores
an intermediate design point to provide confidentiality
for applications that use database management systems
(DBMSes). CryptDB is the first practical system that can
execute a wide range of SQL queries over encrypted data.
The key insight that makes our approach practical is that
most SQL queries use a small set of well-defined operators, each of which we are able to support efficiently over
encrypted data.
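To make the idea of supporting a single SQL operator over protected data concrete, here is a minimal sketch for one operator, equality. It assumes a keyed deterministic token (HMAC-SHA256 from Python's standard library) as a stand-in for a deterministic encryption scheme. Unlike real encryption, the token cannot be decrypted, so the sketch shows only how a server could evaluate `WHERE name = ?` without seeing plaintext. The names det_token and server_select_eq, the key, and the sample rows are hypothetical and are not CryptDB's actual construction.

```python
import hashlib
import hmac

def det_token(key: bytes, value: str) -> str:
    """Client side: deterministic keyed token; equal plaintexts give equal tokens."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()

def server_select_eq(rows, column, token):
    """Untrusted server: evaluates `WHERE column = ?` by comparing tokens only."""
    return [row for row in rows if hmac.compare_digest(row[column], token)]

# Hypothetical client-held key and table contents.
key = b"client-side secret key (hypothetical)"
names = ["alice", "bob", "alice"]

# The server stores only tokens, never the plaintext names.
table = [{"id": i, "name": det_token(key, name)} for i, name in enumerate(names)]

# The client rewrites the constant in the query before sending it.
matches = server_select_eq(table, "name", det_token(key, "alice"))
ids = [row["id"] for row in matches]
assert ids == [0, 2]
```

The trade-off in this sketch is that the server learns which rows hold equal values, even though it does not learn the values themselves.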
CryptDB addresses two threats, as illustrated in
Figure 1. The first threat is an adversary who gains access to
the DBMS server and tries to learn private data (e.g., health
records, financial statements, and personal information)
by snooping on the server. This threat might arise when an
attacker exploits some vulnerability to directly get to the
DB server, when the database is outsourced to an external
organization (e.g., a public “cloud”), or when the DBMS is
administered by a curious system or database administrator (DBA) who might not be trusted. CryptDB aims to prevent the adversary from learning private data in this case.
The second threat is an adversary who gains complete control of the application and the DBMS servers. In this case,
CryptDB protects the confidentiality only of data belonging to users who are logged out of the application during an
attack, but cannot provide any guarantees for logged-in
users. This paper focuses primarily on the solution to the
first threat; our SOSP paper [18] details the additional mechanisms that address the second threat.
CryptDB requires no changes to the internals of the
DBMS server, and should work with most standard SQL
DBMSes. Our implementation uses a MySQL back-end.
Our experiments show that the overhead of CryptDB is
modest: throughput drops by only 26% for queries from
the standard TPC-C benchmark, and by only 14.5% for
a multiuser bulletin board application (phpBB) [18],
compared to running them over MySQL without encryption.
We find that CryptDB supports most queries observed
in practice: an analysis of 126 million SQL queries from
an MIT MySQL service showed that CryptDB supports
operations over encrypted data for 99.5% of the 128,840
columns seen in the query trace.
2. THREAT MODEL AND OVERVIEW
In this section, we discuss CryptDB’s threat model and provide an overview of our approach.
A previous version of this paper was published in the
Proceedings of the 23rd ACM Symposium on Operating
Systems Principles, October 2011.