practice
DOI: 10.1145/1735223.1735241
article development led by
queue.acm.org
The key to synchronizing clocks over
networks is taming delay variability.
By Julien Ridoux and Darryl Veitch
Principles of Robust Timing over the Internet
Everyone, and most everything, needs a clock, and
computers are no exception. However, clocks tend to
ultimately drift off, so it is necessary to bring them to
heel periodically through synchronizing to some other
reference clock of higher accuracy. An inexpensive and
convenient way to do this is over a computer network.
Since the early days of the Internet, a system
collectively known as NTP (Network Time Protocol)
has been used to allow client computers, such as PCs,
to connect to other computers (NTP servers) that have
high-quality clocks installed in them. Through an
exchange of packet timestamps transported in NTP-formatted packets over the network, the PC can use
the server clock to correct its own clock.
As the NTP clock software, in particular
the ntpd daemon, comes packaged
with all major computer operating
systems, including Mac OS, Windows,
and Linux, it is a remarkably successful
technology with a user base on the order of the global computer population.
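
The timestamp exchange described above can be made concrete with a small sketch. It is not taken from this article: it uses the standard NTP on-wire calculation, in which each exchange yields four timestamps and the client estimates its offset from the server on the assumption that the outbound and return path delays are equal. The names `Exchange`, `round_trip_delay`, and `clock_offset` are hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Exchange:
    """Four timestamps (in seconds) from one client-server exchange."""
    t1: float  # client clock: request leaves the client
    t2: float  # server clock: request arrives at the server
    t3: float  # server clock: reply leaves the server
    t4: float  # client clock: reply arrives at the client


def round_trip_delay(e: Exchange) -> float:
    """Network round-trip time, excluding the server's processing time."""
    return (e.t4 - e.t1) - (e.t3 - e.t2)


def clock_offset(e: Exchange) -> float:
    """Estimated (server - client) offset.

    Assumption: the path delay is the same in both directions. Any
    asymmetry shows up directly as an error in this estimate.
    """
    return ((e.t2 - e.t1) + (e.t3 - e.t4)) / 2.0


# Illustrative numbers only: the client runs 0.5 s behind the server,
# each one-way delay is 10 ms, and the server takes 1 ms to reply.
example = Exchange(t1=100.000, t2=100.510, t3=100.511, t4=100.021)
print(f"offset estimate:  {clock_offset(example):.3f} s")
print(f"round-trip delay: {round_trip_delay(example):.3f} s")
```

With symmetric delays the estimate recovers the true offset. When the two directions differ, half of the difference leaks into the estimate, which is one reason delay variability matters so much for synchronization over a network.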
Although the NTP system has operated well for general-purpose use for
many years, both its accuracy and robustness are below what is achievable
given the underlying hardware, and are
inadequate for future challenges. One
area where this is true is the telecommunications industry, which is busy
replacing mobile base-station synchronous backhaul systems (which used to
provide sub-microsecond hardware-based synchronization as a by-product) with inexpensive asynchronous
Ethernet lines. Another is high-speed
trading in the finance industry, where
reducing latencies between exchanges
and trading centers translates directly
into the ability to exploit them for
profit. Here accurate transaction
timestamps are crucial.
More generally, timing is of fundamental importance because, since the
speed of light is finite, the latencies
between network nodes are subject to
hard constraints that will not be defeated by tomorrow’s faster processors
or bandwidth increases. What cannot
be circumvented must be tightly managed, and this is impossible without
precise synchronization.
The Clockwork
When the discussion turns to clocks,
confusion is often not far behind. To
avoid becoming lost in the clockwork,
let’s define some terms. By t we mean
true time measured in seconds in a
Newtonian universe, with the origin at
some arbitrary time point. We say that
a clock C reads C(t) at true time t. Figure 1 shows what some example clocks
read as (true) time goes on. The black
clock Cp(t) is perfect: Cp(t) = t, whereas
the blue clock Cs(t) = C0 + (1 + x)t is out
by C0 when t = 0. In fact it keeps getting
worse as it runs at a constant but overly