not be trusted. Servers can and do have
their bad periods, and blind faith in
them can lead to deep trouble. Fortunately another authority is available:
the counter model. If the RTT filtering
is telling you that congestion is low,
yet the server timestamps are saying
you are suddenly way out, trust the
hardware and use the model to sanity-check the server. Basically, a drift well
over 1 ppm is not credible, and the algorithm should smell a rat.
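
As a rough illustration of this cross-check, the sketch below classifies a new offset estimate using the RTT excess and the counter model. It is not taken from the RADclock code: the function name, the labels it returns, and the 0.5 ms congestion tolerance are assumptions chosen for the example, and only the 1 ppm figure comes from the rule of thumb above.

```python
def classify_sample(prev_offset_s, offset_s, elapsed_s, rtt_s, min_rtt_s,
                    max_drift_ppm=1.0, congestion_tol_s=0.0005):
    """Label a new offset estimate as 'ok', 'congested', or 'suspect_server'.

    All thresholds are illustrative assumptions, not values from any real
    implementation. Offsets and times are in seconds.
    """
    if elapsed_s <= 0:
        raise ValueError("elapsed_s must be positive")

    # RTT filtering: a sample whose RTT sits well above the minimum was
    # probably delayed in a queue, so it says little about the server.
    if (rtt_s - min_rtt_s) > congestion_tol_s:
        return "congested"

    # Largest offset change the counter model can explain on a quiet path:
    # hardware drift over the interval plus the path-noise allowance.
    credible_change_s = max_drift_ppm * 1e-6 * elapsed_s + congestion_tol_s

    # Low congestion but a large jump: the hardware is the better witness,
    # so the server timestamps are the likely culprit.
    if abs(offset_s - prev_offset_s) > credible_change_s:
        return "suspect_server"
    return "ok"


if __name__ == "__main__":
    # Quiet path, 150 ms jump in 16 s: not credible for a 1 ppm counter.
    print(classify_sample(0.0, 0.150, 16.0, 0.0102, 0.0100))
```

A sample labeled as suspect would simply be ignored, leaving the clock to run on the counter model until the server looks credible again.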
When in doubt, just drift. What if congestion has become so high that none
of the available timestamps is of acceptable quality? What if I don’t trust
the server, or if I have lost contact with it?
There is only one thing to do: sit
back and relax. Nothing bad will happen unless the algorithm chooses to
make it happen. A reaction of inaction
is trivial to implement within the feed-forward paradigm and results in simply
allowing the counter to drift gracefully.
Remember, the counter is highly stable,
accumulating an error of only around 1 µs per second, at worst.
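
A minimal sketch of "inaction" in a feed-forward design follows. The class and its fields are hypothetical and much simpler than a real synchronization algorithm: the clock is just a linear function of a raw counter, and a rejected sample leaves that function untouched. The helper applies the 1 µs per second worst-case figure quoted above to an outage of a given length.

```python
class FeedForwardClock:
    """Toy feed-forward clock: time = origin + counter * period.

    Hypothetical illustration only. The counter itself is never adjusted;
    only the parameters used to interpret it are updated.
    """

    def __init__(self, period_s, origin_s):
        self.period_s = period_s      # estimated seconds per counter tick
        self.origin_s = origin_s      # estimated time at counter value 0
        self.updates_skipped = 0

    def read(self, counter):
        return self.origin_s + counter * self.period_s

    def update(self, counter, server_time_s, quality_ok):
        # When in doubt, do nothing: the clock keeps running on the
        # counter model alone and simply drifts.
        if not quality_ok:
            self.updates_skipped += 1
            return False
        # Otherwise re-anchor the origin to agree with the server.
        self.origin_s = server_time_s - counter * self.period_s
        return True


def worst_case_drift_s(outage_s, drift_ppm=1.0):
    """Upper bound on error accumulated while drifting, in seconds."""
    return outage_s * drift_ppm * 1e-6


if __name__ == "__main__":
    clock = FeedForwardClock(period_s=1e-9, origin_s=1000.0)
    clock.update(counter=2_000_000_000, server_time_s=1500.0, quality_ok=False)
    print(clock.read(2_000_000_000), worst_case_drift_s(3600.0))
```

Under these assumptions, an hour without any usable timestamps costs at most about 3.6 ms, which is why skipping an update is usually safer than acting on a doubtful one.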
More generally, the algorithm
should be designed never to overreact
to anything. Remember, its view of the
world is always approximate and may
be wrong, so why try to be too clever
when inaction works so well? Unfortunately, feedback algorithms such
as ntpd have more reactive strategies
that drive the clock more strongly in
the direction of their opinions. This is
a major source of their nonrobustness
to disruptive events.
The Bigger Picture
Thus far we have considered the synchronization of a single host over the
network to a server, but what about the
system as a whole? In NTP, the main
system aspect is the server hierarchy.
In a nutshell, Stratum-1 servers anchor
the tree as they use additional hardware
(for example, a PC with a GPS receiver
or a purpose-built synchronization box
with GPS) to synchronize locally, rather than over a network. By definition,
Stratum-2 servers synchronize to Stratum-1, Stratum-3 to Stratum-2, and so
on, and hosts synchronize to whatever
they can find (typically a Stratum-2 or a
public Stratum-1).
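
The hierarchy can be pictured as a tree in which each node's stratum is one more than that of the node it synchronizes to. The sketch below is only a toy data structure with hypothetical names, not part of any NTP implementation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    """A server or host in a toy stratum tree (hypothetical model)."""
    name: str
    # None means the node synchronizes to local hardware such as GPS.
    upstream: Optional["Node"] = None

    @property
    def stratum(self) -> int:
        # Stratum-1 anchors the tree; everything else is one level
        # below whatever it synchronizes to.
        return 1 if self.upstream is None else self.upstream.stratum + 1


if __name__ == "__main__":
    gps_box = Node("gps-box")
    campus = Node("campus-server", upstream=gps_box)
    laptop = Node("laptop", upstream=campus)
    print(gps_box.stratum, campus.stratum, laptop.stratum)
```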
At the system level there are a number of important and outstanding challenges.
Stratum-1 servers do not communicate among themselves, but act
(except for load balancing in limited
cases) as independent islands. There
is a limited capability to query a server
individually to obtain basic information such as whether it is connected
to its hardware and believes it is synchronized, and there is no ability to
query the set of servers as a whole. An
interconnected and asymmetry-aware
Stratum-1 infrastructure could provide
a number of valuable services to clients. These include recommendations
about the most appropriate server for a
client, automatic provision of backup
servers taking asymmetry jitter into account, and validated information on
server quality. Currently no one is in
a position to point the finger at flaky
servers, but they are out there.
Building on the RADclock algorithm, the RADclock project [3] aims to
address these issues and others as
part of a push to provide a robust new
system for network timing within two
years. Details for downloading the existing client and server software
(packages for FreeBSD and Linux), documentation, and publications can be found
on the RADclock project page.
The RADclock project is partially supported under the Australian Research Council’s Discovery Projects funding scheme
(project number DP0985673), the Cisco
University Research Program Fund at
Silicon Valley Community Foundation,
and a Google Research Award.
1. Endace Measurement Systems. DAG series PCI and
PCI-X cards; http://www.endace.com/networkmcards.
2. Mills, D.L. Computer Network Time
Synchronization: The Network Time Protocol. CRC
Press, Boca Raton, FL, 2006.
3. Ridoux, J., Veitch, D. RADclock Project Web page;
4. Veitch, D., Ridoux, J., Korada, S.B. Robust
synchronization of absolute and difference clocks over
networks. IEEE/ACM Transactions on Networking 17,
2 (2009), 417–430. DOI: 10.1109/TNET.2008.926505.
Julien Ridoux is a research fellow at the Center for
Ultra-Broadband Information Networks (CUBIN) in the
Department of Electrical & Electronic Engineering at the
University of Melbourne, Australia.
Darryl Veitch is a principal research fellow at the Center
for Ultra-Broadband Information Networks (CUBIN) in
the Department of Electrical & Electronic Engineering at
the University of Melbourne, Australia.