Figure 2: Packets in flight between a sender and a receiver.
Figure 3: TCP/IP attempts to discover the available network capacity (window size over time, relative to the bottleneck bandwidth).
bandwidth-delay capacities shows how
a wide range of latencies can be accommodated. For distributed applications,
this might be accomplished by dynamically relocating elements of a system
(for example, via process migration or
remote evaluation).
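As a rough illustration of adapting to varying latencies, the sketch below smooths round-trip samples and sizes a send window to the bandwidth-delay product. The names `RttEstimator` and `bandwidth_delay_product_bytes` are hypothetical, and the smoothing gain of 1/8 is an assumption borrowed from TCP's usual round-trip estimator. It is not a description of any particular protocol implementation.

```python
def bandwidth_delay_product_bytes(bandwidth_bps, rtt_s):
    """Bytes that must be in flight to keep a path of this capacity busy."""
    return bandwidth_bps * rtt_s / 8


class RttEstimator:
    """Exponentially weighted moving average of round-trip samples."""

    def __init__(self, alpha=0.125):  # assumption: gain of 1/8
        self.alpha = alpha
        self.srtt = None

    def update(self, sample_s):
        if self.srtt is None:
            # The first sample seeds the estimate.
            self.srtt = sample_s
        else:
            self.srtt = (1 - self.alpha) * self.srtt + self.alpha * sample_s
        return self.srtt


# Hypothetical measurements: a 10 ms sample followed by a 90 ms sample.
estimator = RttEstimator()
estimator.update(0.010)
srtt = estimator.update(0.090)

# Window needed to fill an assumed 100 Mbit/s path at the smoothed latency.
window = bandwidth_delay_product_bytes(100e6, srtt)
```

Because the window is recomputed from measurements, the same code would serve a nearby peer and a distant one without configuration changes.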
None of these suggestions will allow you to overcome physics, although
prefetching in the best of circumstances might provide this illusion. With
careful design, however, responsive
distributed applications can be architected and implemented to operate
over long distances.
help software developers adapt to the
laws of physics.
Bandwidth helps latency, but not
propagation delay. If a distributed application can move fewer, larger messages, the total delay falls because
fewer round-trip delays are incurred.
The benefit of added bandwidth is quickly
lost over large distances and with small data
objects. Noise can also be a significant issue
on increasingly common wireless links, where shorter packets suffer a lower per-packet risk of bit errors.
The lesson for the application software
designer is to think carefully about a
design’s assumptions about latency.
Assume large latencies, make it work
under those circumstances, and take
advantage of lower latencies when they
are available. For example, use a Web-embedded caching scheme to ensure
the application is responsive when latencies are long, but bypass the cache when it is
not necessary.
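The point about moving fewer, larger messages can be checked with simple arithmetic. The sketch below is a deliberately crude model: it assumes each message costs one full round trip, that messages are not pipelined, and that the workload figures (1 MB, 50 ms round trip, 100 Mbit/s) are hypothetical.

```python
def transfer_time(total_bytes, n_messages, rtt_s, bandwidth_bps):
    """Estimated seconds to move total_bytes in n_messages exchanges.

    Assumes one round trip per message and no pipelining.
    """
    serialization_s = (total_bytes * 8) / bandwidth_bps
    return n_messages * rtt_s + serialization_s


# Same 1 MB payload, split into 1,000 small messages or 10 large ones.
many_small = transfer_time(1_000_000, n_messages=1000, rtt_s=0.050, bandwidth_bps=100e6)
few_large = transfer_time(1_000_000, n_messages=10, rtt_s=0.050, bandwidth_bps=100e6)

# Ten times the bandwidth barely changes the chatty case.
many_small_fast_link = transfer_time(1_000_000, n_messages=1000, rtt_s=0.050, bandwidth_bps=1e9)
```

Under these assumptions the chatty exchange takes about 50 seconds and the batched one well under a second, while a tenfold bandwidth increase saves the chatty version only a few hundredths of a second.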
Spend available resources (such as
throughput and storage capacity) to save
precious ones, such as response time.
This may be the most important of
these rules. An example is the use of
caches, including preemptive caching
of data. In principle, caches can be replicated locally to applications, at
some cost in storage and throughput
(to maintain the cache).
In practice, this is almost always a
good bet when replicas can be made,
because storage capacities
and network throughputs appear to
be growing at a steady exponential
rate. Prefilling the cache with data likely to be used means that some capacity
will be wasted (on what is fetched but not
needed), but the effects of some delays will be mitigated when predictions
of what is needed are good.
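A minimal sketch of preemptive caching follows. The class `PrefetchingCache` and its `fetch` and `predict` callables are hypothetical names introduced for illustration. The prefetch here runs synchronously for simplicity, whereas a real system would more likely issue it in the background so the first request does not wait for it.

```python
class PrefetchingCache:
    """Local cache in front of a slow fetch function (a sketch only)."""

    def __init__(self, fetch, predict):
        self._fetch = fetch      # hypothetical: key -> value, costs a round trip
        self._predict = predict  # hypothetical: key -> keys likely needed next
        self._store = {}
        self.remote_fetches = 0

    def _load(self, key):
        # Fetch remotely only when the key is not already held locally.
        if key not in self._store:
            self._store[key] = self._fetch(key)
            self.remote_fetches += 1

    def get(self, key):
        hit = key in self._store
        self._load(key)
        # Spend throughput and storage now to save response time later.
        for guess in self._predict(key):
            self._load(guess)
        return self._store[key], hit


# Assumed access pattern: a reader of page k usually wants page k + 1 next.
cache = PrefetchingCache(fetch=lambda k: f"page-{k}", predict=lambda k: [k + 1])
first = cache.get(1)   # miss; page 2 is prefetched as a side effect
second = cache.get(2)  # hit; page 3 is prefetched as a side effect
```

If the prediction is wrong, the prefetched entry is wasted capacity. If it is right, the second request is served without a round trip.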
Think relentlessly about the architecture of the distributed application. One
key observation is that a distributed
system can be distributed based on
function. To return to the design of a
system with a live data store (such as
a stock market), we might place the
program trading of stocks near the
relevant exchanges, while placing the
user interaction functionality, account
management, compliance logging, and so on
remotely, in real estate farther from the exchanges. Part of such a functional decomposition exercise is identifying where
latency makes a difference and where
the delay must be addressed directly
rather than via caching techniques.
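To see why placement by function matters, the sketch below estimates the round-trip propagation delay for a component placed near an exchange and for one placed far from it. The signal speed of 2.0e8 m/s is an assumption (roughly two-thirds of the speed of light, typical for optical fiber), and the distances are hypothetical.

```python
FIBER_SPEED_M_PER_S = 2.0e8  # assumption: about two-thirds of c in fiber


def round_trip_s(distance_km):
    """Lower bound on round-trip time from propagation delay alone."""
    return 2 * (distance_km * 1000) / FIBER_SPEED_M_PER_S


near = round_trip_s(1)     # trading logic 1 km from the exchange
far = round_trip_s(4000)   # user-facing functions 4,000 km away
ratio = far / near
```

With these figures the nearby component sees about 10 microseconds per round trip and the distant one about 40 milliseconds, which is tolerable for account management but not for program trading.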
Where possible, adapt to varying
latencies. The example of protocols
maximizing throughput by adapting to
Summary
Propagation delay is an important
physical limit. This measure is often
given short shrift in system design as
application architectures evolve, but
may have more performance impact
on real distributed applications than
bandwidth, the most commonly used
figure of merit for networks. Modern
distributed applications require adherence to some rules of thumb to maintain their responsiveness over a wide
range of propagation delays.
Jonathan M. Smith is the Olga and Alberico Pompa
Professor of Engineering and Applied Science and a
professor of computer and information science at the
University of Pennsylvania. He served as a program
manager at DARPA from 2004 to 2006 and was
awarded the Office of the Secretary of Defense Medal for
Exceptional Public Service in 2006.
© 2009 ACM 0001-0782/09/0700 $10.00