JULY 2019 | VOL. 62 | NO. 7 | COMMUNICATIONS OF THE ACM 89
to investigate the states visited and transitions between
them at runtime.
Figure 2a shows the QUIC state machine automatically
generated using traces from executing QUIC across all of
our experiment configurations. The diagram reveals behaviors that are common to standard TCP implementations,
such as connection start (Init, SlowStart), congestion
avoidance (CongestionAvoidance), and receiver-limited
connections (ApplicationLimited). QUIC also includes
states that are non-standard, such as a maximum sending
rate (CongestionAvoidanceMaxed), tail loss probes, and
proportional rate reduction during recovery.
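The trace-based inference behind a diagram like Figure 2a can be sketched as follows. This is an illustrative approximation, not Synoptic itself: the log format and state names below are hypothetical stand-ins for what instrumented QUIC binaries would emit.

```python
from collections import Counter

def infer_state_machine(trace_lines):
    """Build a transition multigraph from per-connection state logs.

    Each line is assumed to look like: "<conn_id> <from_state> <to_state>".
    (This format is illustrative; the real instrumentation and Synoptic's
    input format differ.)
    """
    transitions = Counter()
    for line in trace_lines:
        conn_id, src, dst = line.split()
        transitions[(src, dst)] += 1
    return transitions

trace = [
    "c1 Init SlowStart",
    "c1 SlowStart CongestionAvoidance",
    "c2 Init SlowStart",
    "c2 SlowStart CongestionAvoidance",
    "c2 CongestionAvoidance ApplicationLimited",
]
edges = infer_state_machine(trace)
for (src, dst), n in sorted(edges.items()):
    print(f"{src} -> {dst} [{n} observation(s)]")
```

Aggregating transitions across many connections and configurations yields the edge set of the empirical state machine; edges observed in one configuration but not another point directly at behavioral differences.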
Note that capturing the empirical state machine
requires instrumenting QUIC’s source code with log messages that capture transitions between states. In total, this
required adding 23 lines of code in five files. While the initial instrumentation required approximately 10 hours,
applying the instrumentation to subsequent QUIC versions required only about 30 minutes. To further demonstrate how our approach applies to other congestion
control implementations, we instrumented QUIC's experimental BBR implementation and present its state transition diagram in our full paper [14]. This instrumentation
took approximately 5 hours. Thus, our experience shows
that our approach is able to adapt to evolving protocol versions and implementations with low additional effort.
We used inferred state machines for root cause analysis of performance issues. In later sections, we demonstrate how they helped us understand QUIC’s poor
performance on mobile devices and in the presence of
deep packet reordering.
Fairness. An essential property of transport-layer protocols is that they do not consume more than their fair share of
bottleneck bandwidth resources. Absent this property, an
unfair protocol may cause performance degradation for
competing flows. We evaluated whether this is the case for
the following scenarios, and present aggregate results over
10 runs in Table 1. We expect that QUIC and TCP should be
relatively fair to each other because they both use the Cubic
congestion control protocol. However, we find this is not the
case at all.
• QUIC vs. QUIC. We find that two QUIC flows are fair
to each other. We also found similar behavior for two TCP flows.
Chrome's remote debugging protocol [1] to load a page and
then extract HARs [19] that include all resource timings and
the protocol used (which allows us to ensure that the correct
protocol was used for downloading an object).
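Checking the negotiated protocol from an exported HAR can be sketched with the standard library alone. The sketch assumes the HAR 1.2 layout (log.entries[].request.url, time, response.httpVersion); the exact protocol labels Chrome records for QUIC (e.g., "h3-Q043") vary across versions, so the prefix check is illustrative.

```python
import json

def resource_timings(har_json, expected_prefix="h3"):
    """Extract (url, time, protocol) per HAR entry and flag entries
    whose negotiated protocol does not match the expected prefix."""
    har = json.loads(har_json)
    rows = []
    for entry in har["log"]["entries"]:
        proto = entry["response"].get("httpVersion", "")
        rows.append({
            "url": entry["request"]["url"],
            "time_ms": entry["time"],
            "protocol": proto,
            "expected": proto.startswith(expected_prefix),
        })
    return rows

# Minimal hand-built HAR standing in for one exported by Chrome.
sample = json.dumps({"log": {"entries": [
    {"request": {"url": "https://example.com/index.html"},
     "time": 123.4,
     "response": {"httpVersion": "h3-Q043"}},
    {"request": {"url": "https://example.com/style.css"},
     "time": 45.6,
     "response": {"httpVersion": "http/1.1"}},
]}})
for row in resource_timings(sample):
    print(row)
```

Filtering on the protocol field is what lets an experiment discard runs where a fallback (e.g., TCP) silently handled some objects.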
In this section, we conduct extensive measurements and
analysis to understand and explain QUIC performance. We
begin by focusing on the protocol-layer behavior, QUIC’s
state machine, and its fairness to TCP. We then evaluate
QUIC's application-layer performance, using page load time (PLT) as an example application metric.
4.1. Calibration and instrumentation
In order to guarantee that our evaluation framework results
in sound comparisons between QUIC and TCP, and to be able
to explain any performance differences we see, we (a) carefully configured our QUIC servers to match the performance
of Google's production QUIC server and (b) compiled the QUIC
client and server from source and instrumented them to
gain access to the inner workings of the protocol (e.g., congestion control states and window sizes). Prior work did no
such calibration and/or instrumentation, which explains
the poor QUIC performance reported in some scenarios
and the lack of root cause analysis. We refer the reader to our
full paper [14] for a detailed discussion of our calibration and
instrumentation.
4.2. State machine and fairness
In this section, we analyze high-level properties of the QUIC
protocol using our framework.
State machine. QUIC has only a draft formal specification and no state machine diagram or formal model;
however, its source code is publicly available.
Absent such a model, we took an empirical approach and
used traces of QUIC execution to infer the state machine
to better understand the dynamics of QUIC and their
impact on performance.
Specifically, we use Synoptic [7] to automatically
generate the QUIC state machine. While static analysis might
generate a more complete state machine, a complete
model is not necessary for understanding performance
changes. Rather, as we show in Section 4.3, we only need
Figure 2. State transition diagram for QUIC’s Cubic CC.
Scenario          Flow     Avg. throughput in Mbps (std. dev.)
QUIC vs. TCP      QUIC     2.71 (0.46)
                  TCP      1.62 (1.27)
QUIC vs. TCP×2    QUIC     2.8  (1.16)
                  TCP 1    0.7  (0.21)
                  TCP 2    0.96 (0.3)

Table 1. Average throughput (5Mbps link, buffer = 30KB, averaged
over 10 runs) allocated to QUIC and TCP flows when competing with
each other. Despite the fact that both protocols use Cubic congestion
control, QUIC consumes nearly twice the bottleneck bandwidth of the
TCP flows combined, resulting in substantial unfairness.
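The unfairness claim can be checked arithmetically from the table's averages. A small sketch, using the QUIC vs. TCP×2 scenario (all values in Mbps, from Table 1):

```python
def bandwidth_shares(throughputs_mbps):
    """Return each flow's fraction of the total measured throughput."""
    total = sum(throughputs_mbps.values())
    return {flow: tput / total for flow, tput in throughputs_mbps.items()}

# Averages from Table 1, QUIC vs. TCPx2 scenario.
flows = {"QUIC": 2.8, "TCP 1": 0.7, "TCP 2": 0.96}
shares = bandwidth_shares(flows)
tcp_combined = flows["TCP 1"] + flows["TCP 2"]

print(f"QUIC share of measured throughput: {shares['QUIC']:.2f}")  # 0.63
print(f"QUIC / (TCP 1 + TCP 2): {flows['QUIC'] / tcp_combined:.2f}")  # 1.69
```

With a fair allocation, one QUIC flow competing against two TCP flows would receive roughly a third of the bandwidth; instead it takes about 1.7× what both TCP flows get combined.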