must compute the polynomial h(x) such that t(x) × h(x) = P(x)
(Section 1). Since we store P(x) in terms of its evaluations
at the roots of the quadratic program (recall Figure 2), the
worker must first interpolate to find P(x) and then perform a
polynomial division to arrive at h(x).
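To illustrate the division step, here is a minimal Python sketch (not Pinocchio's actual code) of polynomial long division over a small prime field; the modulus q = 97 and the lowest-degree-first coefficient layout are assumptions chosen for the example:

```python
# Sketch of dividing p(x) by the target polynomial t(x) over GF(q)
# to recover h(x) = p(x) / t(x). Coefficient lists are lowest degree
# first; q = 97 is an illustrative field modulus, not from the paper.
q = 97

def poly_divmod(num, den, q):
    """Return (quotient, remainder) of num / den over GF(q)."""
    num = list(num)
    quot = [0] * (len(num) - len(den) + 1)
    inv_lead = pow(den[-1], -1, q)  # inverse of den's leading coefficient
    for i in range(len(quot) - 1, -1, -1):
        quot[i] = num[i + len(den) - 1] * inv_lead % q
        for j, d in enumerate(den):
            num[i + j] = (num[i + j] - quot[i] * d) % q
    return quot, num[:len(den) - 1]
```

For instance, with t(x) = (x - 1)(x - 2), i.e. coefficients [2, 94, 1] mod 97, and p(x) = t(x) * (x + 5), the quotient is [5, 1] and the remainder is zero, as it must be when p(x) is divisible by t(x).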
Note that all of these computations take place in a normal field, whereas all of the worker’s other steps involve
cryptographic operations, which are about three orders of
magnitude more expensive.
Thus, one might naïvely conclude, as we did, that simple polynomial algorithms, such as Lagrangian interpolation and “high-school” polynomial multiplication, suffice.
However, we quickly discovered that the O(n²) behavior of
these algorithms, at the scale required for verifiable computing, dwarfed the linear number of cryptographic operations (Section 5). Hence we implemented an FFT-based
O(n log n) polynomial multiplication library and used a
polynomial interpolation algorithm that builds a binary
tree of polynomials, giving total time O(n log² n). Even so
optimized, solving for h(x) is the second largest source of
worker overhead.
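The FFT-style multiplication can be illustrated with a number-theoretic transform, an FFT carried out over a prime field. This Python sketch is not Pinocchio's library; the NTT-friendly modulus 998244353 and generator 3 are standard choices assumed for the example:

```python
# Sketch of O(n log n) polynomial multiplication via a number-theoretic
# transform. MOD = 119 * 2**23 + 1 admits power-of-two transforms.
MOD = 998244353
G = 3  # primitive root modulo MOD

def ntt(a, invert=False):
    n = len(a)
    # Bit-reversal permutation.
    j = 0
    for i in range(1, n):
        bit = n >> 1
        while j & bit:
            j ^= bit
            bit >>= 1
        j |= bit
        if i < j:
            a[i], a[j] = a[j], a[i]
    # Iterative butterfly stages.
    length = 2
    while length <= n:
        w = pow(G, (MOD - 1) // length, MOD)
        if invert:
            w = pow(w, MOD - 2, MOD)
        for start in range(0, n, length):
            wn = 1
            for k in range(start, start + length // 2):
                u, v = a[k], a[k + length // 2] * wn % MOD
                a[k], a[k + length // 2] = (u + v) % MOD, (u - v) % MOD
                wn = wn * w % MOD
        length <<= 1
    if invert:
        inv_n = pow(n, MOD - 2, MOD)
        a = [x * inv_n % MOD for x in a]
    return a

def poly_mul(f, g):
    """Multiply coefficient lists (lowest degree first) over GF(MOD)."""
    n = 1
    while n < len(f) + len(g) - 1:
        n <<= 1
    fa = ntt(f + [0] * (n - len(f)))
    ga = ntt(g + [0] * (n - len(g)))
    prod = [x * y % MOD for x, y in zip(fa, ga)]
    return ntt(prod, invert=True)[:len(f) + len(g) - 1]
```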
Preparing for the Future; Learning from the Past. In our
implementation and evaluation, we assume a worst-case scenario in which the client decides, without any warning, to
outsource a new function, and similarly that the worker only
ever computes a single instance for a given client. In practice,
neither scenario is plausible. When the client first installs
Pinocchio, the program could build the single base exponent table discussed earlier. Further, it can choose a random s and begin computing powers of s in the background,
since these are entirely independent of the computation.
The worker can optimize similarly, given the client’s key.
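The idea behind the precomputed base exponent table can be sketched as follows; the modulus and base here are illustrative assumptions, not values from the protocol:

```python
# Sketch of fixed-base exponentiation with a precomputed table: since
# the base g is known in advance, a table of g^(2^i) built in the
# background turns each later exponentiation into a short product of
# table entries. P is an illustrative group modulus, not from the paper.
P = 2**127 - 1

def build_table(g, bits, p):
    """Precompute g^(2^i) mod p for i in 0..bits-1."""
    table = [g % p]
    for _ in range(1, bits):
        table.append(table[-1] * table[-1] % p)
    return table

def fixed_base_exp(table, e, p):
    """Compute g^e mod p from the table, one multiply per set bit of e."""
    result = 1
    i = 0
    while e:
        if e & 1:
            result = result * table[i] % p
        e >>= 1
        i += 1
    return result
```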
4.3. Applications
Pinocchio runs several applications; each can be instantiated with some static parameters, and then each instance
can be executed with dynamic inputs. While it may be possible to use custom verification checks for some of these
applications (e.g., matrix multiplication), we include them
to illustrate their performance within a general-purpose system like Pinocchio.
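For matrix multiplication, one well-known custom check of the kind alluded to above is Freivalds' randomized verification, sketched here in Python for illustration:

```python
import random

# Freivalds' check: verify a claimed product C = A x B in O(n^2) time
# per round, instead of recomputing the O(n^3) multiplication.
def freivalds(A, B, C, rounds=20):
    n = len(A)
    for _ in range(rounds):
        r = [random.randrange(2) for _ in range(n)]
        # Compare A(Br) with Cr; both are matrix-vector products.
        Br = [sum(B[i][j] * r[j] for j in range(n)) for i in range(n)]
        ABr = [sum(A[i][j] * Br[j] for j in range(n)) for i in range(n)]
        Cr = [sum(C[i][j] * r[j] for j in range(n)) for i in range(n)]
        if ABr != Cr:
            return False  # C is definitely not A x B
    return True  # correct with probability >= 1 - 2**(-rounds)
```

A wrong product slips past each round with probability at most 1/2, so twenty rounds give roughly one-in-a-million odds of a false accept.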
Fixed Matrix multiplies an n × n matrix parameter M by an
n-length input vector A, and outputs the resulting n-length
vector M × A. We choose five parameter settings that range
from |M| = 200 × 200 to |M| = 1000 × 1000.
Two Matrices has parameter n, takes as input two n × n
matrices M1 and M2, and outputs the n × n matrix M1 × M2.
Matrix operations are widely used, for example, in collaborative filtering (|M| = 30 × 30 to |M| = 110 × 110).
MultiVar Poly evaluates a k-variable, degree-m multivariate polynomial. The (m + 1)^k coefficients are parameters, the
k variables x1, ..., xk are the inputs, and the polynomial's
scalar value is the output (k = 5, m = 6, 16,807 coeff. to k = 5,
m = 10; 644,170 coeff.).
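A direct sketch of this evaluation (illustrative only; indexing the coefficients by exponent tuples is an assumption of the example, and (6 + 1)^5 = 16,807 matches the smallest setting above):

```python
from itertools import product

# Evaluate a k-variable polynomial of degree m in each variable.
# coeffs maps an exponent tuple (e1, ..., ek) to its coefficient,
# giving (m + 1)**k coefficients in total.
def eval_multivar(coeffs, xs, m):
    k = len(xs)
    total = 0
    for exps in product(range(m + 1), repeat=k):
        term = coeffs[exps]
        for x, e in zip(xs, exps):
            term *= x ** e
        total += term
    return total
```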
Image Matching is parameterized by an iw × ih rectangular
image and parameters kw, kh. It takes as input a kw × kh image
kernel, and outputs the minimum difference and the point
(x, y) in the image where it occurs (iw × ih = 25, kw × kh = 9 to
iw × ih = 2025, kw × kh = 9).
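The computation can be sketched as an exhaustive scan; the squared-difference metric below is an assumption for illustration, as the text does not pin down the exact difference measure:

```python
# Sketch of image matching: slide a kw x kh kernel over an iw x ih
# image (rows of pixel values) and return the minimum difference and
# the point (x, y) where it occurs.
def match(image, kernel):
    ih, iw = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    best = None
    for y in range(ih - kh + 1):
        for x in range(iw - kw + 1):
            diff = sum((image[y + dy][x + dx] - kernel[dy][dx]) ** 2
                       for dy in range(kh) for dx in range(kw))
            if best is None or diff < best[0]:
                best = (diff, x, y)
    return best  # (minimum difference, x, y)
```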
Shortest Paths implements the Floyd-Warshall O(n³) graph
algorithm, useful for network routing and matrix inversion.
Its parameter n specifies the number of vertices, its input is
an n × n edge matrix, and its output is an n × n matrix of all-pairs
shortest paths (n = 8, e = 64 to n = 24, e = 576).
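The Floyd-Warshall recurrence itself is short; this is a generic Python sketch of the standard algorithm, not Pinocchio's compiled circuit:

```python
# Floyd-Warshall all-pairs shortest paths on an n x n edge matrix.
# INF marks a missing edge; entry [i][j] of the result is the length
# of the shortest path from vertex i to vertex j.
INF = float('inf')

def floyd_warshall(dist):
    n = len(dist)
    d = [row[:] for row in dist]  # work on a copy
    for k in range(n):            # allow vertex k as an intermediate hop
        for i in range(n):
            for j in range(n):
                if d[i][k] + d[k][j] < d[i][j]:
                    d[i][j] = d[i][k] + d[k][j]
    return d
```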
LGCA is a Lattice-Gas Cellular Automata implementation
that converges to Navier-Stokes. It has parameter n, the fluid
lattice size, and k, the iteration count. It inputs one n-cell lattice and outputs another reflecting k steps (n = 294, k = 5 to
n = 294, k = 40).
SHA-1 has no parameters. Its input is a 13-word (416-bit)
string, and it outputs its 5-word (160-bit) SHA-1 hash.
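The stated sizes can be checked with a standard library, shown here purely to make the word counts concrete (the zero-filled input is an arbitrary example, not a test vector from the paper):

```python
import hashlib

# A 13-word (416-bit) input is 52 bytes; SHA-1 always yields a
# 5-word (160-bit, 20-byte) digest.
msg = b'\x00' * 52
digest = hashlib.sha1(msg).digest()
assert len(msg) * 8 == 416
assert len(digest) == 20
```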
We experiment on a Lenovo X201 ThinkPad. We run on a
single core of a 2.67 GHz Intel Core i7 with 8 GB of RAM.
Below, we focus on comparisons with previous work
and app-level performance. In the full paper, we present microbenchmarks to quantify the basic cost units
of our protocol. Our results show that the optimizations
described in Section 4.2.1 reduce costs by 2–3 orders of
magnitude for polynomial operations, and factors of 3–10
for exponentiations. At the macro level, relative to the original GGPR protocol, KeyGen and Compute are more than
twice as fast, and even verification is 24% faster. Pinocchio
also drastically reduces the size of the evaluation key and
even manages to reduce the size of GGPR's already svelte
9-element proof to 8 elements.
5.1. Comparison with related work
Figure 4 plots Pinocchio’s performance against that of
related systems. We use the multiplication of two matrices
as our test application since it has appeared in several prior
papers, though simpler, non-cryptographic verification
procedures exist. Since all of these prior schemes are designated verifier, we measure against Pinocchio's designated-verifier mode.
We compare against (1) a naïve version of a PCP-based
scheme;22 (2) GGP,9 an early scheme that defined VC, but
which relies on FHE; (3) Pepper,22 an optimized refinement of (1); (4) Ginger,23 a further refinement of Pepper;
and (5) Ginger with a batch of one million simultaneous
instances.
Figure 4. Performance Relative to Related Schemes. Pinocchio
reduces costs by orders of magnitude (note the log scale on the
y-axis). We graph the time necessary to (a) verify and (b) produce a
proof result for multiplying two N × N matrices.
[Plot panels: (a) Per-Instance Verification Latency and (b) Worker Latency, each plotted against matrix dimension N × N.]