a result of processing speed; queuing delays in nodes (hosts and network routers and switches); transmission delay determined by the bit rate of the link; and propagation delays caused by physical distance. When one or more of those delays becomes large, interactivity (timely application-to-application message delivery) will suffer. Of these four latency components, queuing delay (inside TCP send buffers and network node queues) is the dominant cause of latency for high-bandwidth TCP applications. This is known as the bufferbloat problem.7
Processing delay is generally negligible because of fast CPUs and careful design of transport algorithms. Transmission delay is bounded by

delay_transmit = mss / link_rate

assuming for the moment that ADUs (application data units) fit within transport segments up to the mss (maximum segment size). With common values of link_rate (megabits or gigabits per second) and mss (for example, 1,500 bytes), delay_transmit is small (for example, sub-millisecond). This leaves propagation delay and queuing delay as the dominant contributors to latency.
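As a quick check on this bound, the arithmetic is easy to reproduce. The short C program below (illustrative only; the link rates are example values, not from the text) computes delay_transmit for a 1,500-byte segment at a few common link rates.

#include <stdio.h>

/* Transmission delay for one segment: seconds = bits / (bits per second). */
static double delay_transmit(double mss_bytes, double link_rate_bps) {
    return (mss_bytes * 8.0) / link_rate_bps;
}

int main(void) {
    double mss = 1500.0;                    /* bytes, a common segment size */
    double rates[] = { 10e6, 100e6, 1e9 }; /* 10 Mbps, 100 Mbps, 1 Gbps */
    for (int i = 0; i < 3; i++) {
        printf("link_rate = %10.0f bps -> delay_transmit = %.3f ms\n",
               rates[i], delay_transmit(mss, rates[i]) * 1000.0);
    }
    return 0; /* prints 1.200 ms, 0.120 ms, and 0.012 ms */
}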
One-way propagation delay has
lower bounds set by the laws of physics. Typical Internet path RTT (round-trip time) values are in the tens of milliseconds for intracontinental distances, or around 100 to 200 milliseconds for distances that cross oceans or traverse satellites.
[figure 1. components of end-to-end latency. The figure depicts hosts, a network node or router, and the links between them, annotated with the four delay components: processing, queuing, transmission, and propagation.]
In addition, TCP provides
reliability via retransmissions that can
add extra queuing delay (multiples of
the propagation delay) to the total. In
the common case, however, TCP’s fast
retransmit mechanism should limit
the retransmission-induced queuing
delay to an RTT or two.
More importantly, TCP’s socket buffer is often large enough to cause queuing delays measured in seconds. Under many realistic conditions, the queuing delay caused by the sender-side TCP socket buffer is the dominant portion of the total delay, because TCP implementations employ large kernel socket buffers. For example, with a typical TCP send buffer size of 64KB and a 300Kbps video stream, a full send buffer contributes about 1,700ms of delay. To avoid unnecessary queuing delays, the kernel can be changed to dynamically tune the socket buffer size, bringing the end-to-end delay within two RTTs most of the time, while leaving TCP’s congestion control unchanged.8
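A rough user-space analogue of that kernel tuning is to size the send buffer to hold only a target delay’s worth of data at the stream’s rate. The C fragment below is a minimal sketch under that assumption (cap_send_buffer is a hypothetical helper; the cited work does this inside the kernel, not via setsockopt, and real kernels may round or clamp the requested SO_SNDBUF value).

#include <sys/socket.h>

/* Cap sender-side queuing delay by shrinking the TCP send buffer so it
 * holds only ~target_delay seconds of data at the stream's rate.
 * A user-space approximation of the in-kernel dynamic tuning idea. */
static int cap_send_buffer(int sock, double stream_rate_bps,
                           double target_delay_s) {
    int bytes = (int)(stream_rate_bps * target_delay_s / 8.0);
    return setsockopt(sock, SOL_SOCKET, SO_SNDBUF, &bytes, sizeof(bytes));
}

/* Example: a 300Kbps stream with a 200ms (~2 RTT) delay target yields a
 * buffer of about 7,500 bytes instead of a 64KB default. */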
Paceline builds upon this idea but is designed to avoid the need for kernel modifications. A user-level approach avoids the deployment obstacles of introducing new TCP implementations, deals gracefully with transparent proxies that can defeat an in-TCP approach, and provides a failover mechanism that reduces worst-case latency when TCP stalls after back-to-back losses and retransmission timeouts.
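To make the failover idea concrete, here is a minimal sketch of one plausible stall detector (our illustration, not Paceline’s actual mechanism; now_ms, tcp_stalled, and the threshold are hypothetical names and parameters). The sender records when a write on a nonblocking socket last made progress; if no progress occurs within several RTTs, it abandons the stalled connection and retries on a fresh one.

#include <stdbool.h>
#include <time.h>

/* Current time in milliseconds from a monotonic clock. */
static double now_ms(void) {
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return ts.tv_sec * 1000.0 + ts.tv_nsec / 1e6;
}

/* last_progress_ms is updated whenever a write makes headway. If the
 * send buffer has accepted no data for longer than the threshold
 * (e.g., several RTTs), assume back-to-back losses or a retransmission
 * timeout have stalled TCP, and trigger failover to a new connection. */
static bool tcp_stalled(double last_progress_ms, double stall_threshold_ms) {
    return now_ms() - last_progress_ms > stall_threshold_ms;
}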
[figure 2. streams of application data units. The figure decomposes web-based game streams into streams (1: a web page download stream; 2: a chat stream), which contain messages (1.1 image, 1.2 image, 1.3 script; 2.1 video, 2.2 audio), which are in turn divided into chunks (for example, 1.1.1, 1.1.2; 2.2.1).]
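One way to read the hierarchy in figure 2 is as nested data types: streams contain messages (the ADUs), and messages are divided into chunks. The C structs below are purely illustrative; the type and field names are our own, not Paceline’s API.

#include <stddef.h>

/* Illustrative ADU hierarchy from figure 2: stream > message > chunk. */
struct chunk {
    size_t length;          /* bytes in this chunk */
    const char *data;
};

struct message {            /* one ADU, e.g., an image or a video frame */
    int priority;           /* not all data is born equal */
    size_t num_chunks;
    struct chunk *chunks;
};

struct stream {             /* e.g., a web page download or a chat stream */
    size_t num_messages;
    struct message *messages;
};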
Data Service Model: Not All Data Is Born Equal
In diverse network environments, application demands often exceed the available bandwidth, leading to large sender-side queues.
Queues can introduce head-of-line
blocking (a delay that occurs when a
line of packets is held up by the first
packet) and hinder perceived quality
if all the data items are treated equally
and processed in FIFO (first-in, first-out) order. Minimizing the amount of data committed to TCP socket buffers reduces TCP sender-side queuing delays and pushes the sender-side buffering up the stack, to the layer above TCP (Paceline in our design). Here, we describe the transport service model in Paceline, with the necessary quality-adaptation mechanisms to manage