In the event of a link or switch failure,
the routing algorithm will take advantage of path diversity in the network to
find another path.
A path through the network is said
to be minimal if no shorter (that is,
fewer hops) path exists; of course,
there may be multiple minimal paths.
A fat-tree topology,15 for example, has
multiple minimal paths between any
two hosts, but a butterfly topology9 has
only a single minimal path between
any two hosts. Sometimes selecting a
non-minimal path is advantageous—
for example, to avoid congestion or
to route around a fault. The length of
a non-minimal path can range from min + 1 up to the length of a Hamiltonian path visiting each switch exactly once. In practice, the routing algorithm typically considers only non-minimal paths that are one hop longer than a minimal path, since considering all non-minimal paths would be prohibitively expensive.
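To make the minimal versus non-minimal distinction concrete, here is a small Python sketch (a toy adjacency-list graph of switches and hypothetical helper functions, not any production routing code): it finds the minimal hop count with breadth-first search and then enumerates candidate paths up to one hop longer than minimal, mirroring the min + 1 restriction just described.

from collections import deque

def minimal_hops(adj, src, dst):
    # Breadth-first search: fewest hops from src to dst in a switch graph.
    dist = {src: 0}
    q = deque([src])
    while q:
        u = q.popleft()
        if u == dst:
            return dist[u]
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                q.append(v)
    return None  # no path: a fault has partitioned the network

def paths_up_to(adj, src, dst, max_len):
    # Enumerate simple paths from src to dst of at most max_len hops.
    paths, stack = [], [(src, [src])]
    while stack:
        u, path = stack.pop()
        if u == dst:
            paths.append(path)
            continue
        if len(path) - 1 == max_len:
            continue
        for v in adj[u]:
            if v not in path:
                stack.append((v, path + [v]))
    return paths

# Toy four-switch graph with two equal-cost routes between s0 and s3.
adj = {"s0": ["s1", "s2"], "s1": ["s0", "s3"],
       "s2": ["s0", "s3"], "s3": ["s1", "s2"]}
m = minimal_hops(adj, "s0", "s3")             # 2 hops
routes = paths_up_to(adj, "s0", "s3", m + 1)  # minimal and min + 1 paths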
Network Performance
Here, we discuss the etiquette for sharing network resources: specifically, the physical links and buffer spaces are resources that require flow control to be shared efficiently.
Flow control is carried out at different levels of the network stack: the data-link, network, and transport layers, and possibly within the application itself for explicit coordination of resources. Flow control that occurs at lower levels of the communication stack is transparent to applications.
Flow control. Network-level flow control dictates how the input buffers at each switch or NIC are managed: store-and-forward, virtual cut-through,14 or wormhole,19 for example.
To understand the performance implications of flow control better, you must
first understand the total delay, T, a
packet incurs:
T = H(tr + Ltp) + ts
H is the number of hops the packet
takes through the network; tr is the
fall-through latency of the switch,
measured from the time the first flit
(flow-control unit) arrives to when the
first flit exits; and tp is the propagation
delay through average cable length
L. For short links—say, fewer than 10
meters—electrical signaling is cost-effective. Longer links, however, require fiber optics. Signal propagation in electrical signaling (5 nanoseconds per meter) is faster than it is in fiber (6 nanoseconds per meter).
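As a rough illustration of the Ltp term (the 10-meter and 50-meter cable lengths below are assumed examples, using the per-meter figures quoted above):

# Per-meter propagation delays quoted above (taken as representative values).
T_P_ELECTRICAL_NS = 5.0  # ns per meter, copper
T_P_OPTICAL_NS    = 6.0  # ns per meter, fiber

def propagation_delay_ns(length_m, tp_ns_per_m):
    # L * tp: one-way propagation delay over a cable of the given length.
    return length_m * tp_ns_per_m

short_copper = propagation_delay_ns(10, T_P_ELECTRICAL_NS)  # 50 ns
long_fiber   = propagation_delay_ns(50, T_P_OPTICAL_NS)     # 300 ns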
Propagation delay through electrical cables occurs at subluminal speeds because of a frequency-dependent component at the surface of
the conductor, or “skin effect,” in the
cable. This limits the signal velocity
to about three-quarters the speed of
light in a vacuum. Signal propagation
in optical fibers is even slower because
of dielectric waveguides used to alter the refractive index profile so that
higher-velocity components of the signal (such as shorter wavelengths) will
travel longer distances and arrive at
the same time as lower-velocity components, limiting the signal velocity to
about two-thirds the speed of light in
a vacuum. Optical signaling must also
account for the time necessary to perform electrical-to-optical signal conversion, and vice versa.
The average cable length, L, is largely determined by the topology and the
physical placement of system racks
within the data center. The packet’s serialization latency, ts, is the time necessary to squeeze the packet onto a narrow serial channel and is determined
by the bit rate of the channel. For example, a 1,500-byte Ethernet packet
(frame) requires more than 12µs (ignoring any interframe gap time) to be
squeezed onto a 1Gb/s link. With store-and-forward flow control, as its name
suggests, a packet is buffered at each
hop before the switch does anything
with it:
Tsf = H(tr + Ltp + ts)
As a result, the serialization delay,
ts, is incurred at each hop, instead of
just at the destination endpoint as is
the case with virtual cut-through and
wormhole flow control. This can potentially add on the order of 100µs to
the round-trip network delay in a data-center network.
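The following Python sketch plugs assumed but plausible values into the two expressions above; it shows how store-and-forward pays the serialization delay ts at every hop, while cut-through pays it only once at the destination.

def serialization_us(packet_bytes, link_gbps):
    # ts: time to squeeze the packet onto the link, in microseconds.
    return packet_bytes * 8 / (link_gbps * 1e3)

def cut_through_us(hops, t_r_us, cable_m, t_p_us_per_m, t_s_us):
    # T = H(tr + L*tp) + ts  (virtual cut-through or wormhole)
    return hops * (t_r_us + cable_m * t_p_us_per_m) + t_s_us

def store_and_forward_us(hops, t_r_us, cable_m, t_p_us_per_m, t_s_us):
    # Tsf = H(tr + L*tp + ts): serialization is paid at every hop.
    return hops * (t_r_us + cable_m * t_p_us_per_m + t_s_us)

# Assumed example values: 5 hops, 0.3us fall-through latency, 10 m cables
# at 5 ns/m, and a 1,500-byte frame on a 1 Gb/s link (ts = 12 us).
ts   = serialization_us(1500, 1.0)                     # 12.0 us
t_ct = cut_through_us(5, 0.3, 10, 0.005, ts)           # 13.75 us
t_sf = store_and_forward_us(5, 0.3, 10, 0.005, ts)     # 61.75 us

With these assumed values, the sketch gives roughly 13.75µs for cut-through versus 61.75µs for store-and-forward over the same five hops; the gap grows linearly with hop count and packet size.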
A stable network monotonically delivers messages as shown by a characteristic throughput-load curve in
Figure 4. In the absence of end-to-end
flow control, however, the network
can become unstable, as illustrated by
the dotted line in the figure, when the
offered load exceeds the saturation
point, α. The saturation point is the offered load beyond which the network is
said to be congested. In response to this
congestion, packets may be discarded
to avoid overflowing an input buffer.
This lossy flow control is commonplace
in Ethernet networks.
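As a toy illustration of this lossy behavior (an assumed fixed-capacity input buffer, not a model of any particular switch), packets arriving at a full buffer are simply discarded:

from collections import deque

class LossyInputBuffer:
    # Toy model of lossy flow control: arrivals beyond the buffer's
    # capacity are dropped, as at a congested Ethernet switch port.
    def __init__(self, capacity):
        self.queue = deque()
        self.capacity = capacity
        self.dropped = 0

    def arrive(self, packet):
        if len(self.queue) < self.capacity:
            self.queue.append(packet)  # accepted into the input buffer
        else:
            self.dropped += 1          # buffer full: packet is discarded

    def drain(self):
        # Forward one buffered packet, if any, toward the switch crossbar.
        return self.queue.popleft() if self.queue else None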