to a symmetric path. This allows the
clock to be synchronized, but only up to
an unknown error E lying somewhere
in the range [–r, r]. This range can be
tens of milliseconds wide in some cases
and can dwarf other errors.
There is another key point to remember: any change in either
the true asymmetry (say, because of a
routing change) or the estimate of it
used by the algorithm makes the clock
jump. For example, based on some
knowledge of the routing between the
host and the server r = 15 ms away, an
intrepid administrator may replace the
default Â = 0 with a best guess of Â = 3
ms, resulting in a jump of 3/2 = 1.5 ms.
Another example is a change in choice
of server, which inevitably brings with
it a change in asymmetry. The point is
that jumps, even if they result in improvements in synchronization, are an
evil unto themselves. Such asymmetry
jitter can confuse software not only in
this host, but also in others, since all
OWDs measured to and from the host
will also undergo a jump.
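The arithmetic behind the 1.5 ms jump can be sketched as follows. This is a minimal illustration, assuming a standard NTP-style four-timestamp exchange; the function name `offset_estimate`, its parameter names, and the example delays (9 ms out, 6 ms back) are hypothetical and chosen only to match r = 15 ms and a true asymmetry of 3 ms.

```python
def offset_estimate(ta, tb, te, tf, asym_est=0.0):
    """Estimate the host clock offset (host minus server), in seconds.

    ta: host timestamp when the request leaves the host
    tb: server timestamp when the request arrives
    te: server timestamp when the reply leaves the server
    tf: host timestamp when the reply arrives
    asym_est: assumed asymmetry (forward delay minus backward delay)
    """
    # With asym_est = 0 this is the usual symmetric-path estimate;
    # a nonzero asym_est shifts the result by asym_est / 2.
    return ((ta - tb) + (tf - te)) / 2.0 + asym_est / 2.0


# Hypothetical scenario: true offset 10 ms, 9 ms out, 6 ms back (r = 15 ms).
true_offset, d_up, d_down, server_time = 0.010, 0.009, 0.006, 0.0001
t0 = 100.0                                  # true departure time
ta = t0 + true_offset                       # host clock reads true time + offset
tb = t0 + d_up
te = tb + server_time
tf = te + d_down + true_offset

before = offset_estimate(ta, tb, te, tf, asym_est=0.0)
after = offset_estimate(ta, tb, te, tf, asym_est=0.003)
print(f"estimate with asymmetry 0 ms: {before * 1e3:.3f} ms")
print(f"estimate with asymmetry 3 ms: {after * 1e3:.3f} ms")
print(f"clock jump: {(after - before) * 1e3:.3f} ms")
```

The timestamps are identical in both calls; only the assumed asymmetry changes, and the estimate moves by half of that change.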
To summarize, network synchronization consists of two very different
aspects. The synchronization algorithm’s role is to see through and eliminate delay variability. It is considered
to be accurate if it does this successfully, even if the asymmetry error, and
therefore the final clock error, is large,
since the algorithm can do nothing about asymmetry.
The asymmetry jitter problem is not
about variability but about an unknown constant. This is much simpler to state; it is nonetheless inherently hard, as it cannot
be circumvented, even in principle. So
here is the challenge: although asymmetry cannot be eliminated, its practical impact
depends strongly on how
it is managed. These two very different
problems cross paths in a key respect:
both benefit from nearby servers.
Robust Algorithm Design
Here is a list of the key elements for reliable synchronization.
Don’t forget physics. The foundation
of the clock is the local hardware. Any
self-respecting algorithm should begin
by incorporating the essence of its behavior using a physically meaningful
model. Of particular importance are
the following simple characterizations
of its stability: the large-scale rate error bound (0.1 ppm) and the timescale
where the oscillator variability is minimal (t = 1,000 seconds). These characteristics are remarkably stable across
PC architectures, but the algorithm
should nonetheless be insensitive to
their precise values.
The other fundamental physics
component is the nature of the delays.
A good general model for queuing delays, and hence OWD and RTT, is that
of a constant plus a positive random
noise. This constant (the minimum
value) is clearly seen in Figure 3 in the
case of RTT.
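The constant-plus-positive-noise model can be made concrete with a small simulation. This is a sketch under stated assumptions: the exponential noise distribution, the 15 ms minimum, the 2 ms mean noise, and the helper names `simulate_rtts` and `running_minimum` are illustrative choices, not taken from the measurements discussed here.

```python
import random


def simulate_rtts(n, min_rtt=0.015, mean_noise=0.002, seed=1):
    """Return n RTT samples: a constant minimum plus positive random noise.

    The noise is drawn from an exponential distribution purely for
    illustration; any non-negative distribution fits the model.
    """
    rng = random.Random(seed)
    return [min_rtt + rng.expovariate(1.0 / mean_noise) for _ in range(n)]


def running_minimum(samples):
    """Return the smallest value seen so far after each sample."""
    result = []
    current = float("inf")
    for s in samples:
        current = min(current, s)
        result.append(current)
    return result


rtts = simulate_rtts(1000)
mins = running_minimum(rtts)
print(f"smallest RTT observed: {mins[-1] * 1e3:.4f} ms")
print(f"largest RTT observed:  {max(rtts) * 1e3:.4f} ms")
```

Because the noise is never negative, the running minimum can only decrease toward the constant, which is why the minimum stands out so clearly in a plot of RTT against time.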
Use feed-forward. A feed-forward
approach offers far higher robustness,
which is essential in a noisy, unpredictable environment such as the Internet.
It also allows a difference clock to be defined, which in turn is the key to robust
filtering for the absolute clock, as well
as being directly valuable for the majority of timing applications including
network measurement (OWDs aside).
The difference clock comes first.
Synchronizing the difference clock
equates to measuring the long-term
average period pav. This “rate synchronization” is more fundamental, more
important, and far easier than absolute
synchronization. A robust solution for
this is a firm foundation on which to
build the much trickier absolute synchronization.
Note that by rate synchronization we
mean a low-noise estimate of average
long-term rate/period, not to be confused
with short-term rate, which reduces essentially to (the derivative of) drift, and
hence to absolute synchronization.
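A rough sketch of rate synchronization follows. It assumes the host can read a raw hardware counter and that each timing exchange yields a pair (counter value at departure, server timestamp at arrival). The two-point estimator, the function names, and all numbers (a nominal 1 GHz counter running 50 ppm slow, a 10,000-second baseline, 2 ms of delay noise) are assumptions for illustration only; a robust algorithm would also choose which packets to trust.

```python
def estimate_period(first, last):
    """Estimate the average counter period (seconds per tick).

    first, last: (counter_at_send, server_receive_time) pairs taken far
    apart in time. Delay noise in the server timestamps is divided by
    the baseline, so a long baseline gives a low-noise estimate.
    """
    (c_i, t_i), (c_j, t_j) = first, last
    return (t_j - t_i) / (c_j - c_i)


def difference_clock_interval(period, counter_start, counter_end):
    """Measure a time interval as period times elapsed counter ticks."""
    return period * (counter_end - counter_start)


# Hypothetical hardware: nominal 1 GHz counter whose true period is 50 ppm long.
true_period = 1e-9 * (1 + 50e-6)


def counter_at(t):
    """Counter reading at true time t (simulation helper)."""
    return round(t / true_period)


# Two exchanges 10,000 s apart with different forward delays (5 ms and 7 ms).
first = (counter_at(0.0), 0.0 + 0.005)
last = (counter_at(10000.0), 10000.0 + 0.007)
p_hat = estimate_period(first, last)

# Use the difference clock to time a 15 ms round trip.
rtt = difference_clock_interval(p_hat, counter_at(20000.0), counter_at(20000.015))
print(f"relative period error: {abs(p_hat / true_period - 1):.2e}")
print(f"measured RTT: {rtt * 1e3:.6f} ms")
```

In this scenario the 2 ms of delay noise is diluted over the 10,000-second baseline, so the period estimate is far closer to the truth than the nominal 1 ns value, and short intervals such as RTTs can be measured without any absolute synchronization.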
Use minimum RTT-based filtering. If
a timing packet is lucky enough to experience the minimum delay, then its
timestamps have not been corrupted
and can be used to set the absolute
clock directly to the right value
(asymmetry aside). The problem is, how can
we determine which packets get lucky?
Unfortunately, this is a chicken-and-egg problem, since to measure OWD,
we have to use the absolute clock Ca(t),
which is the one we want to synchronize in the first place. Luckily, the situation is different with RTTs, which can
be measured by the difference clock
Cd(t). The key consequence is that a
reliable measure of packet quality can
be obtained without the need to first
absolutely synchronize the clock. One
just measures by how much the RTT