including machine status, connectivity to other machines in the network, and monitoring capabilities. When connectivity of a local lead server degrades, for example, a new server is automatically elected to assume the role of leader.
Principle 4: Fail cleanly and restart. Based on the previous principles, the network has already been architected to handle server failures quickly and seamlessly, so we are able to take a more aggressive approach to failing problematic servers and restarting them from a last known good state. This sharply reduces the risk of operating in a potentially corrupted state. If a given machine continues to require restarting, we simply put it into a “long sleep” mode to minimize impact to the overall network.
Principle 5: Phase software releases. After passing the quality assurance (QA) process, software is released to the live network in phases. It is first deployed to a single machine. Then, after performing the appropriate checks, it is deployed to a single region, then possibly to additional subsets of the network, and finally to the entire network. The nature of the release dictates how many phases and how long each one lasts. The previous principles, particularly use of redundancy, distributed control, and aggressive restarts, make it possible to deploy software releases frequently and safely using this phased approach.
Principle 6: Notice and proactively quarantine faults. The ability to isolate faults, particularly in a recovery-orient-ed computing system, is perhaps one of the most challenging problems and an area of important ongoing research. Here is one example. Consider a hypothetical situation where requests for a certain piece of content with a rare set of configuration parameters trigger a latent bug. Automatically failing the servers affected is not enough, as requests for this content will then be directed to other machines, spreading the problem. To solve this problem, our caching algorithms constrain each set of content to certain servers so as to limit the spread of fatal requests. In general, no single customer’s content footprint should dominate any other customer’s footprint among available servers. These constraints are dynamically determined based on current lev-
els of demand for the content, while keeping the network safe.
Besides the inherent fault-tolerance benefits, a system designed around these principles offers numerous other benefits.
Faster software rollouts. Because the network absorbs machine and regional failures without impact, Akamai is able to safely but aggressively roll out new software using the phased rollout approach. As a benchmark, we have historically implemented approximately 22 software releases and 1,000 customer configuration releases per month to our worldwide network, without disrupting our always-on services.
Minimal operations overhead. A large, highly distributed, Internet-based network can be very difficult to maintain, given its sheer size, number of network partners, heterogeneous nature, and diversity of geographies, time zones, and languages. Because the Akamai network design is based on the assumption that components will fail, however, our operations team does not need to be concerned about most failures. In addition, the team can aggressively suspend machines or data centers if it sees any slightly worrisome behavior. There is no need to rush to get components back online right away, as the network absorbs the component failures without impact to overall service.
This means that at any given time, it takes only eight to 12 operations staff members, on average, to manage our network of approximately 40,000 devices (consisting of more than 35,000 servers plus switches and other networking hardware). Even at peak times, we successfully manage this global, highly distributed network with fewer than 20 staff members.
Lower costs, easier to scale. In addition to the minimal operational staff needed to manage such a large network, this design philosophy has had several implications that have led to reduced costs and improved scalability. For example, we use commodity hardware instead of more expensive, more reliable servers. We deploy in third-party data centers instead of having our own. We use the public Internet instead of having dedicated links. We
deploy in greater numbers of smaller regions—many of which host our servers for free—rather than in fewer, larger, more “reliable” data centers where congestion can be greatest.
Even though we’ve seen dramatic advances in the ubiquity and usefulness of the Internet over the past decade, the real growth in bandwidth-intensive Web content, rich media, and Web-and IP-based applications is just beginning. The challenges presented by this growth are many: as businesses move more of their critical functions online, and as consumer entertainment (games, movies, sports) shifts to the Internet from other broadcast media, the stresses placed on the Internet’s middle mile will become increasingly apparent and detrimental. As such, we believe the issues raised in this article and the benefits of a highly distributed approach to content delivery will only grow in importance as we collectively work to enable the Internet to scale to the requirements of the next generation of users.
References
1. afergan, M., Wein, J., laMeyer, a. experience with some principles for building an internet-scale reliable system. in Proceedings of the 2nd Conference on Real, Large Distributed Systems 2. (these principles are laid out in more detail in this 2005 research paper.)
2. akamai report: the state of the internet, 2nd quarter, 2008; http://www.akamai.com/stateoftheinternet/. (these and other recent internet reliability events are discussed in akamai’s quarterly report.)
3. anderson, n. comcast at ces: 100 Mbps connections coming this year. ars technica (Jan. 8, 2008); http:// arstechnica.com/news.ars/post/20080108-comcast- 100mbps-connections-coming-this-year.html.
4. horrigan, J.b. home broadband adoption 2008. Pew
internet and american life Project; http://www.
pewinternet.org/pdfs/PiP_broadband_2008.pdf.
5. internet World statistics. broadband internet statistics: top World countries with highest internet broadband subscribers in 2007; http://www. internetworldstats.com/dsl.htm.
6. Mehta, s. verizon’s big bet on fiber optics. Fortune (feb. 22, 2007); http://money.cnn.com/magazines/ fortune/fortune_archive/2007/03/05/8401289/.
7. spangler t. at&t: U-verse tv spending to increase. Multichannel News (May 8, 2007); http://www. multichannel.com/article/ca6440129.html.
8. telegeography. cable cuts disrupt internet in Middle east and india. CommsUpdate (Jan. 31, 2008); http:// www.telegeography.com/cu/article.php?article_ id=21528.
Tom Leighton co-founded akamai technologies in august 1998. serving as chief scientist and as a director to the board, he is akamai’s technology visionary, as well as a key member of the executive committee setting the company’s direction. he is an authority on algorithms for network applications. leighton is a fellow of the american academy of arts and sciences, the national academy of science, and the national academy of engineering. a previous version of this article appeared in the october 2008 issue of ACM Queue magazine.
© 2009 acM 0001-0782/09/0200 $5.00
feBRuaRY 2009 | vol. 52 | No. 2 | CommunICatIons of the aCm
51
References:
http://www.akamai.com/stateoftheinternet/
http://arstechnica.com/news.ars/post/20080108-comcast-100mbps-connections-coming-this-year.html
http://arstechnica.com/news.ars/post/20080108-comcast-100mbps-connections-coming-this-year.html
http://arstechnica.com/news.ars/post/20080108-comcast-100mbps-connections-coming-this-year.html
http://www.pewinternet.org/pdfs/PiP_broadband_2008.pdf
http://www.pewinternet.org/pdfs/PiP_broadband_2008.pdf
http://www.internetworldstats.com/dsl.htm
http://www.internetworldstats.com/dsl.htm
http://money.cnn.com/magazines/fortune/fortune_archive/2007/03/05/8401289/
http://money.cnn.com/magazines/fortune/fortune_archive/2007/03/05/8401289/
http://www.multichannel.com/article/ca6440129.html
http://www.multichannel.com/article/ca6440129.html
http://www.telegeography.com/cu/article.php?article_id=21528
http://www.telegeography.com/cu/article.php?article_id=21528
http://www.telegeography.com/cu/article.php?article_id=21528
Archives