hosts, except during the demonstration of bufferbloat.
The following examples demonstrate:
1. The ability to monitor changing RTT accurately by modifying network latency.
2. The impact of packet loss.
3. The impact of oversized buffers (commonly referred to as bufferbloat).
Nagle’s algorithm. Before describing these experiments in detail, we should take a look at Nagle’s algorithm,6 which is enabled by default in many TCP stacks. Its purpose is to reduce the number of small, header-heavy datagrams transferred by the network. It operates by delaying the transmission of new data if the amount of data available to send is less than the MSS (maximum segment size), which is the longest segment permissible given the maximum transmission unit on the path, and if there is previously sent data still awaiting acknowledgment.
Nagle’s algorithm can cause unnecessary delays for time-critical applications running over TCP. Thus, because the assumption is that such applications will run over TCP in the experiments presented here, Nagle’s algorithm is disabled. This is achieved in the client and server by setting the TCP_NODELAY socket option on all sockets in use.
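Disabling Nagle's algorithm is a one-line socket option. A minimal sketch in Python (the client and server code from the experiments is not shown here; this only illustrates the option itself):

```python
import socket

# Create a TCP socket and disable Nagle's algorithm, so small writes
# are transmitted immediately rather than being held back while
# previously sent data is still awaiting acknowledgment.
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)

# The option can be read back; a nonzero value means Nagle is off.
print(sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY))
```

The same option would be set on every socket the client and server use, on both ends of each connection.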
Experiment 1: Changing Network Conditions. When computing RTTs, it is critical that the measurements accurately reflect current conditions. The purpose of this experiment is simply to demonstrate the responsiveness of our metrics to conditions that change in a predictable manner. In this experiment the base RTT (100ms) is initially set, and then an additional latency (50ms) is alternately added to and removed from that base RTT by changing the delay on both interfaces at the forwarding host by 25ms. No loss ratio is specified on the path, and no additional traffic is sent between the two hosts.
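One way to impose such a delay on a Linux forwarding host is with the netem queueing discipline; the commands below are an illustrative configuration sketch (the interface names are examples, and the testbed may use different tooling):

```shell
# Base configuration: 50ms of egress delay on each of the forwarding
# host's two interfaces, giving a 100ms base RTT end to end.
tc qdisc add dev eth0 root netem delay 50ms
tc qdisc add dev eth1 root netem delay 50ms

# Add the extra 50ms of RTT by raising each per-interface delay 25ms...
tc qdisc change dev eth0 root netem delay 75ms
tc qdisc change dev eth1 root netem delay 75ms

# ...and later remove it again by restoring the base delay.
tc qdisc change dev eth0 root netem delay 50ms
tc qdisc change dev eth1 root netem delay 50ms
```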
Note that since TCP’s RTT calculation is wholly passive, it does not observe variation in RTT if no data is being exchanged. In the presence of traffic, however, it is beneficial that the RTT measurement update quickly. The results of this experiment are shown in Figure 2. The measurements taken at all layers indicate a bimodal distribution, which is precisely what we should expect without other network conditions affecting traffic. The three forms of measurements taken are all effectively equivalent, with the mean RTT measured during the experiments varying by no more than 1%.
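On Linux, the TCP-layer RTT estimate can be sampled directly from a connected socket via the TCP_INFO socket option. The sketch below assumes the struct tcp_info field layout from linux/tcp.h, in which tcpi_rtt (the smoothed RTT, in microseconds) is the 16th 32-bit field following the initial byte-sized fields; the layout is kernel-specific, so treat the offset as an assumption:

```python
import socket
import struct

def tcp_rtt_ms(sock):
    """Return the kernel's smoothed RTT estimate for a connected TCP
    socket, in milliseconds. Linux-specific: assumes the field layout
    of struct tcp_info in linux/tcp.h."""
    info = sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_INFO, 104)
    # Eight bytes of u8 fields plus padding, then tcpi_rtt is the
    # sixteenth u32 (offset 8 + 15 * 4 = 68).
    (tcpi_rtt,) = struct.unpack_from("I", info, 8 + 15 * 4)
    return tcpi_rtt / 1000.0
```

Polling a value like this alongside application-layer timestamps is one way to compare TCP-layer and application-layer RTT measurements on the same connection.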
Experiment 2: Packet Loss. Packet loss on a network affects reliability, responsiveness, and throughput. It can be caused by many factors, including noisy links corrupting data, faulty forwarding hardware, or transient glitches during routing reconfiguration. Assuming the network infrastructure is not faulty and routing is stable, loss is often caused by network congestion when converging data flows cause a bottleneck, forcing buffers to overflow in forwarding hardware and, therefore, packets to be dropped. Loss can happen on either the forward or the reverse path of a TCP connection, the only indication to the TCP stack being the absence of a received ACK.
TCP offers applications an ordered bytestream. Thus, when loss occurs and a segment has to be retransmitted, segments that have already arrived but that appear later in the bytestream must await delivery of the missing segment so the bytestream can be reassembled in order. Known as head-of-line blocking, this can be detrimental to the performance of applications running over TCP, especially if latency is high. Selective acknowledgments, if enabled, allow a host to indicate precisely which subset of segments went missing on the forward path and thus which subset to retransmit. This helps improve the number of segments “in flight” when loss has occurred.
In this experiment, packet loss was enabled on the forwarding host at loss rates of 5%, 10%, 15%, and 20%, the purpose being to demonstrate that TCP segments are still exchanged and RTTs estimated by TCP are more tolerant to the loss than the RTTs measured by the application. The results of this experiment are shown in Figure 3. The points represent median values, with 5th and 95th percentiles shown.
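Uniform loss of this kind can also be emulated with netem on a Linux forwarding host. As before, this is an illustrative configuration sketch; the interface names are examples, and the loss rate shown is the lowest of the four tested:

```shell
# Drop 5% of packets egressing each interface of the forwarding host,
# so both the forward and reverse paths experience loss. Repeating the
# experiment at 10%, 15%, and 20% changes only the rate.
tc qdisc add dev eth0 root netem loss 5%
tc qdisc add dev eth1 root netem loss 5%
```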
In these tests, a 5% packet loss was capable of introducing a half-second delay for the application, even though the median value is close to the real RTT of 100ms; the mean measured application-layer RTT with 5% loss is 196.4ms, 92.4ms higher than the measured mean for TCP RTT. The measured means rise quickly: 400.3ms for 10% loss, 1.2s for 15% loss, and 17.7s for 20% loss. The median values shown in Figure 3 for application-layer RTT follow a similar pattern, and in this example manifest in median application-layer RTTs measured at around 12 seconds with 20% packet loss. The TCP RTT, however, is always close to the true 100ms distance; although delayed packet exchanges can inflate this measure, the largest mean deviation observed in these tests between TCP RTT and ICMP RTT was a
[Figure 3. RTTs measured in the presence of varying packet loss rates. Series: TCP RTT and application RTT; y-axis: RTT (ms, log scale); x-axis: loss (%).]