values for the variables T1, T2, T3, R1, R2,
R3, R4, R5 from σ and pk, then computes
the value c as c ← H (m, T1, T2, T3, R1, R2,
R3, R4, R5) where ← denotes assignment
from right to left, and H is a hash function; for the security proofs, BBS model
H is a random oracle. 6, 8 The signing algorithm then outputs
σ ← (T1, T2, T3, c, sα, sβ, sχ, sδ1, sδ7) , ( 1)
where sα, sβ, sχ, sδ1, sδ7 are functions of c,
ν, and sk[i]. The BBS Verify algorithm,
on input a group public key pk, a message m, and a signature σ = (T1, T2, T3, c,
sα, sβ, sχ, sδ1, sδ7), derives R′ 1, R′ 2, R′ 3, R′ 4,
R′ 5 from pk and σ, computes c′ as c′ ← H
(m, T1, T2, T3, R′ 1, R′ 2, R′ 3, R′ 4, R′ 5), accept-ing the signature as valid exactly when
c = c′. None of Clue’s optimizations or
extensions modify the BBS KeyGen or
Open algorithms; we therefore do not
survey their details here.
Optimizations. The following optimizations exploit artifacts of the BBS
scheme itself, as well as properties of
network protocols and clients; some
of these optimizations may be of independent interest.
Precomputation (for sender). Clue is
able to take advantage of client workloads to improve the overhead of Sign
in the sending critical path. The Sign
operation has two components, computing the Tj and Rj values, independent of packet data, using these values
to sign a packet. The Tj and Rj computation step by far dominates the overhead of Sign. If Clue takes the Tj and
Rj computation out of the critical sending path by precomputing them, Clue
can greatly improve the throughput
of using Sign. Most client workloads
consist of applications with low average sending rates (such as email, Web
browsing, and remote login), allowing
signature precomputation to overlap
I/O. Indeed, over long time scales, the
CPU load of clients and servers alike
is dominated by idle time—an effect
further magnified by multicore processors. Thus, periods of idleness can be
exploited to buffer signature precursors for subsequent periods of activity.
Windowed verification (for receiver).
Clue takes advantage of the streaming
nature of network protocols like TCP
to amortize verification over multiple
packets of data to reduce the overhead
of Verify in the receive critical path.
Deriving the R′j values from pk and σ
creates a significant fixed overhead for
Verify independent of the amount of
signed data. When using Verify on
the receiver, the attribution layer can
accumulate a window of packets (such
as a flight of TCP segments) and verify them all together to amortize per-packet verification overhead. We stress
that the signer signs every window of
k packets, even overlapping windows,
and that the verifier has the option of
either verifying the packets individually or verifying any window of its choosing. However, this verification optimization slightly increases the length of a
signature.
To accommodate this scenario, we
modify the BBS scheme as follows: Using our modified scheme, a verifier can
choose to verify the signature on the
j-th packet Pj in isolation (such as when
no other packets are waiting to be verified or when there is packet loss) or verify in batch the signature on a window
of k packets Pj−k+ 1, . . . , Pj. Clue achieves
this goal by, on the signing side, first
hashing the initial k − 1 packets Pj−k+ 1,
. . ., Pj− 1 to a value h, then signing h⏐⏐Pj
as before finally including h in the resulting signature tuple; here ⏐⏐ denotes
string concatenation, and the hash
function to compute h is H′ ≠ H, and Pj
is implicitly prefixed with a fixed-width
length field. To avoid trivial hash collisions in h, when hashing the packets
Pj−k+ 1, . . . , Pj− 1, Clue also prepends each
packet with a 4B length field, then concatenates the resulting length fields
and packets together. Including h in
the signature allows the receiver to verify the signature over the j-th packet Pj
in isolation (by verifying the signature
over h⏐⏐Pj). To verify the signature over
the entire window Pj−k+ 1, . . . , Pj, the receiver first recomputes h.
In the Clue prototype the window
size k is a parameter provided to the
IP layer. We modified our TCP implementation to adaptively set k to match
the sender’s congestion window. This
setting maximizes performance, as it
reflects the largest number of packets
that can be amortized together without
expecting a packet loss (losing the benefit of amortized verification).
Asynchronous verification (for re-
ceiver). The Clue prototype can also
overlap computation with network
delay to reduce protocol serialization
with verification; for example, rather
than wait until Verify completes on
a TCP packet before sending an ACK,
TCP can first optimistically send an
ACK back to the sender to overlap the
ACK with the Verify computation.
Implementing this feature is inher-
ently a layer violation since the Clue
prototype allows TCP ACK processing
to proceed independent of IP layer
verification, but Clue prevents unveri-
fied data packets from being passed to
the application.
σ ← ( T1, T2, T3, c, sα, sβ, sχ, sδ1, sδ7, R1,
R2, R3, R4, R5).
We then revise the Verify algorithm to, on input a signature σ, set c′′
← H(m, T1, T2, T3, R1, R2, R3, R4, R5), and
immediately reject if c′′ ≠ c. The modified verification algorithm would then
derive the variables R′ 1, R′ 2, R′ 3, R′ 4, R′ 5
from pk and T1, T2, T3, c, sα, sβ, sχ, sδ1, sδ7
in random order, immediately rejecting if R′j ≠ Rj. Finally, the modified algorithm would accept the signature as
valid, since failure to reject implies c =
H(m, T1, T2, T3, R′ 1, R′ 2, R′ 3, R′ 4, R′ 5).
Other potential optimizations. A large
class of related optimizations relax security guarantees in exchange for performance; for example, the receiver
could randomly verify a packet with