marginally sensitive (e.g., timestamps and counts of messages). This data demonstrates the importance of CryptDB’s
adjustable security: it provides a significant improvement
in confidentiality over revealing all encryption schemes to the server.
For the sql.mit.edu trace, approximately 6.6% of the columns were at OPE even with in-proxy processing; the other
encrypted columns remain at DET or above. Out of the
columns that were at OPE, ~60% are used in an ORDER BY
clause with a LIMIT, ~55% are used in an order comparison in a WHERE clause, and ~4% are used in a MIN or MAX
aggregate operator (some of the columns are counted in
more than one of these groups). It would be difficult to perform these computations in the proxy without substantially
increasing the amount of data sent to it.
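The data-transfer argument above can be illustrated with a minimal sketch. The two "encryption" functions are toy stand-ins invented for this example (they are not secure and are not CryptDB's schemes); they only model whether ciphertext order matches plaintext order. All function names are hypothetical.

```python
# Toy stand-ins (NOT secure, NOT CryptDB's schemes): they only model
# whether ciphertext order matches plaintext order.
def toy_ope_encrypt(x: int) -> int:
    return 7 * x + 3          # strictly increasing, so order is preserved


def toy_ope_decrypt(c: int) -> int:
    return (c - 3) // 7


def toy_det_encrypt(x: int) -> int:
    return x ^ 0x5A5A         # a bijection that scrambles order


def toy_det_decrypt(c: int) -> int:
    return c ^ 0x5A5A


def order_by_limit_at_server(ope_column, k):
    """The server sorts OPE ciphertexts itself and ships only k rows."""
    shipped = sorted(ope_column)[:k]
    return [toy_ope_decrypt(c) for c in shipped], len(shipped)


def order_by_limit_at_proxy(det_column, k):
    """The server cannot sort, so it ships every row; the proxy decrypts and sorts."""
    shipped = list(det_column)
    plain = sorted(toy_det_decrypt(c) for c in shipped)
    return plain[:k], len(shipped)


values = [50, 10, 40, 20, 30]
top_server, rows_server = order_by_limit_at_server(
    [toy_ope_encrypt(v) for v in values], 2)
top_proxy, rows_proxy = order_by_limit_at_proxy(
    [toy_det_encrypt(v) for v in values], 2)
```

Both paths return the same two smallest values, but the in-proxy path transfers every row of the column, which is the cost the paragraph above refers to for ORDER BY with a LIMIT.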
5.3. Performance evaluation
To evaluate the performance of CryptDB, we used a machine
with two 2.4GHz Intel Xeon E5620 4-core processors and
12 GB of RAM to run the MySQL 5.1.54 server, and a machine
with eight 2.4GHz AMD Opteron 8431 6-core processors
and 64 GB of RAM to run the CryptDB proxy and the clients.
The two machines were connected over a shared Gigabit
Ethernet network. The higher-provisioned client machine
ensures that the clients are not the bottleneck in any experiment. All workloads fit in the server’s RAM.
5.3.1. TPC-C
We compare the performance of a TPC-C query mix when
running on an unmodified MySQL server versus on a
CryptDB proxy in front of the MySQL server. We warmed up
CryptDB on the query set so that there are no onion adjustments during the TPC-C experiments. The server spends
100% of its CPU time processing queries.
We consider two important metrics: database server
throughput (number of queries per second that the server
can process) and latency (time interval from when the
application issues a query to when it receives the result).
Figure 5. Throughput of different types of SQL queries from the
TPC-C query mix running under MySQL and CryptDB. “upd. inc”
stands for UPDATE that increments a column, and “upd. set” stands
for UPDATE that sets columns to a constant. (Axis label: queries/sec.)
The throughput with CryptDB was 26% lower than that
with plain MySQL on TPC-C. We believe this overhead is
modest considering the gains in confidentiality. To understand the sources of CryptDB’s overhead, we measure the
server throughput for different types of SQL queries seen
in TPC-C, on the same server, but running with only one
core enabled. Figure 5 shows the results for MySQL and
CryptDB. The results show that CryptDB’s throughput penalty is the greatest for queries that involve a SUM (half the
throughput) and for incrementing UPDATE statements
(1.6× less throughput); these are the queries that involve
HOM additions at the server. For the other types of queries,
which form a larger part of the TPC-C mix, the throughput
penalty is lower.
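The throughput figures above are stated in two different forms (a percentage reduction and a slowdown factor). The following sketch converts between them so they can be compared. It assumes that "1.6× less throughput" means the MySQL throughput divided by the CryptDB throughput is 1.6; the function names are made up for this example.

```python
def fraction_lost(slowdown: float) -> float:
    """Share of throughput lost when throughput is `slowdown` times lower."""
    return 1.0 - 1.0 / slowdown


def slowdown_from_loss(lost: float) -> float:
    """Slowdown factor corresponding to losing the share `lost` of throughput."""
    return 1.0 / (1.0 - lost)


sum_loss = fraction_lost(2.0)            # SUM: half the throughput
upd_inc_loss = fraction_lost(1.6)        # incrementing UPDATE
tpcc_slowdown = slowdown_from_loss(0.26) # whole TPC-C mix, 26% lower
```

Under that reading, the incrementing UPDATE loses 37.5% of its throughput and SUM loses 50%, while the 26% reduction for the full mix corresponds to a slowdown of about 1.35×. The per-query numbers come from the single-core runs and the 26% from the full TPC-C run, so they are comparable only as rough magnitudes.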
To understand the latency introduced by CryptDB, we
measure the server and proxy processing times for the
same types of SQL queries as above. With CryptDB, the server latency is
0.12 ms, a 20% increase over the 0.10 ms latency
of plain MySQL; we consider this increase small. The proxy
adds an average of 0.60 ms to a query; of that time, 24% is
spent on mysql-proxy, 23% is spent on encryption and
decryption, and the remaining 53% is spent parsing and
processing queries. The cryptographic overhead is relatively small because most of our encryption schemes are
efficient. OPE and HOM are the slowest, but we performed
two optimizations: pre-computing randomness to speed
up encryption for HOM, and caching ciphertexts for OPE.
Without these optimizations, the proxy latency would have
been 10.7 ms on average in our experiments, roughly 18 times higher.
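To illustrate why pre-computing randomness helps, here is a minimal sketch that assumes a Paillier-style additively homomorphic scheme for HOM. The primes are toy values far too small for real use, and the pool structure and function names are hypothetical, not taken from CryptDB's implementation. The point is only that the modular exponentiation r^N mod N^2 does not depend on the plaintext, so it can be computed ahead of time.

```python
import math
import secrets

# Toy parameters: real deployments use primes of 512 bits or more.
P, Q = 1009, 1013
N = P * Q
N2 = N * N
LAM = (P - 1) * (Q - 1) // math.gcd(P - 1, Q - 1)  # lcm(P-1, Q-1)
MU = pow(LAM, -1, N)  # modular inverse (Python 3.8+)


def fresh_randomness() -> int:
    """The expensive, plaintext-independent part: r^N mod N^2 for a random unit r."""
    while True:
        r = secrets.randbelow(N - 1) + 1
        if math.gcd(r, N) == 1:
            return pow(r, N, N2)


def encrypt(m: int, rn: int) -> int:
    """With generator g = N + 1, g^m mod N^2 equals 1 + m*N, so one multiply remains."""
    return ((1 + m * N) * rn) % N2


def decrypt(c: int) -> int:
    return ((pow(c, LAM, N2) - 1) // N * MU) % N


# Fill the pool ahead of time (for example, while the proxy is idle).
pool = [fresh_randomness() for _ in range(8)]

c1 = encrypt(20, pool.pop())
c2 = encrypt(22, pool.pop())
total = decrypt((c1 * c2) % N2)  # multiplying ciphertexts adds plaintexts
```

Each pooled value must be used for one encryption only; reusing it would make two ciphertexts share randomness.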
5.3.2. phpBB
We also evaluated the performance of CryptDB on phpBB,
an open-source Web forum application. We measured the
HTTP request processing throughput of a phpBB server
using both CryptDB and a standard MySQL database. We
encrypted only the sensitive fields as shown in Figure 4. We
found that CryptDB reduced throughput by only 14.5%.
6. Related work
Search and queries over encrypted data. Cryptographic tools
for performing keyword search over encrypted data have
been proposed (e.g., Song et al. [22], which we use to implement
SEARCH). When applied to processing SQL on encrypted
data, these techniques suffer from some of the following limitations: certain basic queries are not supported or
are too inefficient (especially joins and order checks), they
require significant client-side query processing, users either
have to build and maintain indexes on the data at the server
or have to perform sequential scans for every selection/search,
and implementing these techniques requires unattractive changes to the innards of the DBMS.
Some researchers have developed prototype systems for
subsets of SQL, but they achieve lower security, require a significant DBMS rewrite, and rely on client-side processing. For
example, Hacıgümüş et al. [10] heuristically split the domain of
possible values for each column into partitions, storing the
partition number unencrypted for each data item, and rely
on extensive client-side filtering of query results.
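The partitioning approach just described can be sketched as follows. The partition boundaries, the toy cipher, and all names are hypothetical and chosen only for illustration; this is not the scheme from the cited work. The sketch shows the two properties mentioned above: the partition number is stored in the clear, and the client must filter out false positives after decryption.

```python
import bisect

# Hypothetical partition edges for values in 0..99: four partitions.
BOUNDARIES = [25, 50, 75]


def partition_of(value: int) -> int:
    return bisect.bisect_right(BOUNDARIES, value)


def toy_encrypt(value: int) -> int:
    return value ^ 0x5A5A  # placeholder cipher, NOT secure


def toy_decrypt(cipher: int) -> int:
    return cipher ^ 0x5A5A


def store(values):
    """Server-side rows: (unencrypted partition number, encrypted value)."""
    return [(partition_of(v), toy_encrypt(v)) for v in values]


def server_select(rows, wanted_partitions):
    """The server can filter only at partition granularity."""
    return [c for p, c in rows if p in wanted_partitions]


def client_range_query(rows, lo, hi):
    """The client asks for every overlapping partition, then filters locally."""
    wanted = set(range(partition_of(lo), partition_of(hi) + 1))
    candidates = server_select(rows, wanted)
    decrypted = [toy_decrypt(c) for c in candidates]
    matches = sorted(v for v in decrypted if lo <= v <= hi)
    return matches, len(candidates)


rows = store([5, 30, 45, 60, 80, 95])
matches, transferred = client_range_query(rows, 40, 65)
```

Here the server returns three candidate rows for a query that matches two; coarser partitions reveal less to the server but increase this client-side filtering work.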