NOVEMBER 2018 | VOL. 61 | NO. 11 | COMMUNICATIONS OF THE ACM 151
contains a pre-generation feature that maintains a pool of
nonces and DH keys that can be used in new IKE connections, reducing handshake latency. The pooling mechanism
is quite intricate and appears to be designed to ensure that
enough keys are always available while avoiding consuming
too much run time on the device.
Independent First In, First Out (FIFO) queues are maintained for nonces, for each supported finite field DH group
(MODP 768, MODP 1024, MODP 1536, and MODP 2048), and
(in version 6.3) for each supported elliptic curve group
(ECP 256 and ECP 384). The sizes of these queues depend
on the number of VPN configurations that have been
enabled for any given group. For instance, if a single configuration is enabled for a group, then that group will have
a queue size of 2. The size of the nonce queue is set to be
twice the aggregate size of all of the DH queues. At startup,
the system fills all queues to capacity. A background task
that runs once per second adds one entry to a queue that is
not full. If a nonce or a DH share is ever needed when the
corresponding queue is empty, a fresh value is generated
on the fly.
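The sizing rule just described can be written down as a short sketch. The function name `queue_sizes` and the group labels are hypothetical, and the text only states the single-configuration case (queue size 2), so the linear scaling with the number of configurations is an assumption.

```python
def queue_sizes(configs_per_group):
    """Return queue capacities given the number of enabled VPN
    configurations per DH group (hypothetical helper, not ScreenOS code)."""
    # Assumption: each enabled configuration contributes 2 slots to its group.
    dh = {group: 2 * n for group, n in configs_per_group.items() if n > 0}
    # From the text: the nonce queue is twice the aggregate DH queue size.
    return {"nonce": 2 * sum(dh.values()), **dh}


print(queue_sizes({"MODP 1024": 1}))
```

With a single configuration on one group, this gives a nonce queue of 4 and a DH queue of 2, which matches the queue lengths drawn in Figure 1.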
The queues are filled in priority order. Crucially, the
nonce queue is assigned the highest priority; it is followed by the groups in descending order of cryptographic
strength (ECP 384 down to MODP 768). This means that in
many (but not all) cases, the nonce for an IKE handshake
will have been drawn from the Dual EC output stream
earlier than the DH share for that handshake, making single-connection attacks feasible.
Figure 1 shows a (somewhat idealized) sequence of
generated values, with the numbers denoting the order
in which queue entries were generated, before and after
an IKE Phase 1 exchange. Figure 1a shows the situation
after startup: The first four values are used to fill the nonce
queue and the next two values are used to generate the DH
shares. Thus, when the exchange happens, it uses value 1
for the nonce and value 5 for the key, allowing the attacker
to derive the Dual EC state from value 1 and then compute
forward to find the DH share. After the Phase 1 exchange,
which consumes a DH share and a nonce, and after execution of the periodic, queue-refill task, the state is as shown
in Figure 1b, with the new values shaded.
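The walk-through of Figure 1 can be replayed with a small simulation. This is only a sketch of the behavior described in the text, not the ScreenOS implementation: the class `Pool` and its methods are hypothetical names, and an integer counter stands in for the position of each value in the PRNG output stream.

```python
from collections import deque
from itertools import count


class Pool:
    """Toy model of the pre-generation queues (hypothetical names)."""

    def __init__(self, capacities):
        # capacities: dict listed in priority order, nonce queue first.
        self.capacities = capacities
        self.queues = {name: deque() for name in capacities}
        self.counter = count(1)  # stands in for successive PRNG outputs
        while self.refill_one():  # startup: fill every queue to capacity
            pass

    def refill_one(self):
        """Periodic task: add one entry to the highest-priority non-full queue."""
        for name, cap in self.capacities.items():
            if len(self.queues[name]) < cap:
                self.queues[name].append(next(self.counter))
                return True
        return False

    def take(self, name):
        """Consume the oldest entry, or generate on the fly if the queue is empty."""
        queue = self.queues[name]
        return queue.popleft() if queue else next(self.counter)


pool = Pool({"nonce": 4, "MODP 1024": 2})
nonce, share = pool.take("nonce"), pool.take("MODP 1024")
pool.refill_one()
pool.refill_one()
print(nonce, share, list(pool.queues["nonce"]), list(pool.queues["MODP 1024"]))
```

In this model the handshake uses value 1 for the nonce and value 5 for the DH share, and two runs of the refill task leave the queues in the state of Figure 1b. Because the nonce index is smaller than the share index, the share lies ahead of the nonce in the output stream.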
Depending on configuration, the IKE Phase 2 exchange
would consume either a nonce and a DH share or just a
nonce. If the exchange uses both a nonce and a DH share,
However, while this is straightforward in principle, there
are a number of practical complexities and potential
implementation decisions that could make this attack easier or
more difficult (or even impractical), as described below.
4.2. Nonce size
For Dual EC state reconstruction to be possible, the attacker
needs more than just to see raw Dual EC output. She needs at
least 26B of the x-coordinate of a single elliptic-curve point
to recover the Dual EC state; fewer bytes would be insufficient (Section 2).
Luckily for the attacker, the first 30B of the 32B returned
by ScreenOS’s Dual EC implementation belong to the
x-coordinate of a single point, as we saw in Section 2. Luckily
again for the attacker, ScreenOS’s PRNG subsystem also
returns 32B when called, and these are the 32B returned
by a Dual EC invocation, as we saw in Section 3. Finally,
IKE nonces emitted by ScreenOS are 32B long and produced from a single PRNG invocation. To summarize: In
ScreenOS 6.2 and 6.3, IKE nonces always consist of 30B of
one point’s x-coordinate and 2B of the next point’s
x-coordinate, the best-case scenario for Shumow–Ferguson reconstruction.
It is worth expanding on this point. The IKE standards
allow any nonce length between 8 and 256B (Section 5;
Ref. 7). An Internet-wide scan of IKE responders by Adrian et
al. 3 found that a majority use 20B nonces. We are not aware
of any cryptographic advantage to nonces longer than 20B.
ScreenOS 6.1 sent 20B nonces and, as we noted in Section
3, its PRNG subsystem generated 20B per invocation. In
ScreenOS 6.2, Juniper introduced Dual EC, rewrote the
PRNG subsystem to produce 32B at a time, and modified
the IKE subsystem to send 32B nonces.
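The byte counts in this section can be checked with a few lines of arithmetic. The sketch below takes the 26B threshold and the 30B-per-output figure from the text. It assumes a 32B x-coordinate (as on a 256-bit curve) and, for the shorter case, a hypothetical device whose nonce is simply a prefix of one Dual EC output. All names are illustrative.

```python
X_COORD_BYTES = 32         # assumption: 256-bit curve, 32B x-coordinate
X_BYTES_PER_OUTPUT = 30    # from the text: first 30B come from one point
MIN_X_BYTES = 26           # from the text: minimum needed for recovery


def x_bytes_exposed(nonce_len):
    """Bytes of a single x-coordinate visible in a nonce of this length."""
    return min(nonce_len, X_BYTES_PER_OUTPUT)


def reconstruction_possible(nonce_len):
    """True if the nonce reveals enough of one x-coordinate."""
    return x_bytes_exposed(nonce_len) >= MIN_X_BYTES


def candidate_x_values(nonce_len):
    """Upper bound on x-coordinates consistent with the visible bytes."""
    return 2 ** (8 * (X_COORD_BYTES - x_bytes_exposed(nonce_len)))


for length in (20, 32):
    print(length, reconstruction_possible(length), candidate_x_values(length))
```

Under these assumptions a 32B nonce leaves only 2B of the x-coordinate unknown, whereas a 20B nonce would fall below the 26B threshold.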
4.3. Nonces and DH keys
An attacker who knows the d corresponding to Juniper’s
point Q and observes an IKE nonce generated by a
ScreenOS device can recompute the device’s Dual EC state
at nonce generation time. She can roll that state forward
to predict subsequent PRNG outputs, though not back to
recover earlier outputs. ScreenOS uses its PRNG to generate IKE Diffie–Hellman shares, so the attacker will be able
to predict DH private keys generated after the nonce she
saw and compute the session keys for the VPN connections established using those IKE handshakes.
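To make the roll-forward step concrete, here is a toy sketch on a 19-point textbook curve. It is not the real generator: there is no output truncation, the curve and constants are illustrative assumptions, and the relation P = dQ is assumed as the meaning of "the d corresponding to Q". All function names are hypothetical.

```python
# Toy illustration only: a tiny textbook curve, not P-256.
P_MOD, A, B = 17, 2, 2     # curve y^2 = x^3 + 2x + 2 over GF(17)
Q = (5, 1)                 # stand-in for the point Q
D = 7                      # stand-in for the secret d


def add(p1, p2):
    """Add two curve points; None represents the point at infinity."""
    if p1 is None:
        return p2
    if p2 is None:
        return p1
    (x1, y1), (x2, y2) = p1, p2
    if x1 == x2 and (y1 + y2) % P_MOD == 0:
        return None
    if p1 == p2:
        lam = (3 * x1 * x1 + A) * pow(2 * y1 % P_MOD, -1, P_MOD) % P_MOD
    else:
        lam = (y2 - y1) * pow((x2 - x1) % P_MOD, -1, P_MOD) % P_MOD
    x3 = (lam * lam - x1 - x2) % P_MOD
    return x3, (lam * (x1 - x3) - y1) % P_MOD


def mul(k, pt):
    """Double-and-add scalar multiplication."""
    acc = None
    while k:
        if k & 1:
            acc = add(acc, pt)
        pt = add(pt, pt)
        k >>= 1
    return acc


P = mul(D, Q)              # assumed relation P = d*Q


def step(state):
    """One generator step: update the state, then emit an output."""
    state = mul(state, P)[0]
    return state, mul(state, Q)[0]


def lift_x(x):
    """Return some curve point with the given x-coordinate (brute force)."""
    rhs = (x ** 3 + A * x + B) % P_MOD
    return next((x, y) for y in range(P_MOD) if y * y % P_MOD == rhs)


def predict(observed_output, d, n):
    """From one observed output, recover the next state and predict n outputs."""
    state = mul(d, lift_x(observed_output))[0]  # x(d*R) is the next state
    outputs = []
    for _ in range(n):
        outputs.append(mul(state, Q)[0])
        state = mul(state, P)[0]
    return outputs


s, o1 = step(3)            # seed 3 avoids the point at infinity in this toy
s, o2 = step(s)
s, o3 = step(s)
print(predict(o1, D, 2) == [o2, o3])
```

The sketch only rolls forward: nothing in `predict` recovers the state that preceded the observed output, which mirrors the limitation noted above.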
This scenario is clearly applicable when the attacker has
a network tap close to the ScreenOS device, and can observe
many IKE handshakes. But what if the attacker’s network
tap is close to the VPN client instead? She might observe
only a single VPN connection. If the nonce for a connection
is generated after the DH share, the attacker will not be able
to recover that session’s keys.
A superficial reading of the ScreenOS IKE code seems to
rule out single-connection attacks: The KE payload containing the DH share is indeed encoded before the Nr payload
containing the nonce.
Conveniently for the attacker, however, ScreenOS also
Figure 1. Nonce queue behavior during an IKE handshake. Numbers
denote generation order, and values generated after the handshake
are shaded. During a DH exchange, outputs 1 and 5 are used as the
nonce and key, advancing the queue, and new outputs are generated
to fill the end of the queue.
(a) At system startup.
    Nonces: 1 2 3 4
    1024:   5 6
(b) After a DH exchange.
    Nonces: 2 3 4 7
    1024:   6 8