• The indexing process has become much easier to operate because
most of the problems caused by machine failures, slow machines,
and networking hiccups are dealt with automatically by the MapReduce library without operator intervention. Furthermore, it is easy to
improve the performance of the indexing process by adding new machines to the indexing cluster.
Josh Levenberg has been instrumental in revising and extending the user-level MapReduce API with a number of new features. We would like to especially thank others who have worked on the system and all the users of
MapReduce in Google’s engineering organization for providing helpful
feedback, suggestions, and bug reports.
The MapReduce programming model has been successfully used at
Google for many different purposes. We attribute this success to several
reasons. First, the model is easy to use, even for programmers without experience with parallel and distributed systems, since it hides the details
of parallelization, fault tolerance, locality optimization, and load balancing. Second, a large variety of problems are easily expressible as MapReduce computations. For example, MapReduce is used for the generation
of data for Google’s production Web search service, for sorting, data mining, machine learning, and many other systems. Third, we have developed
an implementation of MapReduce that scales to large clusters of machines comprising thousands of machines. The implementation makes
efficient use of these machine resources and therefore is suitable for use
on many of the large computational problems encountered at Google.
By restricting the programming model, we have made it easy to parallelize and distribute computations and to make such computations
fault tolerant. Second, network bandwidth is a scarce resource. A
number of optimizations in our system are therefore targeted at reducing the amount of data sent across the network: the locality optimization allows us to read data from local disks, and writing a single copy
of the intermediate data to local disk saves network bandwidth. Third,
redundant execution can be used to reduce the impact of slow
machines, and to handle machine failures and data loss.
1. Hadoop: Open source implementation of MapReduce. http://
2. The Phoenix system for MapReduce programming. http://csl.
Arpaci-Dusseau, A. C., Arpaci-Dusseau, R. H., Culler, D. E., Heller-stein, J. M., and Patterson, D. A. 1997. High-performance sorting on
networks of workstations. In Proceedings of the 1997 ACM SIGMOD
International Conference on Management of Data. Tucson, AZ.
4 Barroso, L. A., Dean, J., and Urs Hölzle, U. 2003. Web search for a
planet: The Google cluster architecture. IEEE Micro 23, 2, 22-28.
Bent, J., Thain, D., Arpaci-Dusseau, A. C., Arpaci-Dusseau, R. H.,
and Livny, M. 2004. Explicit control in a batch-aware distributed file
system. In Proceedings of the 1st USENIX Symposium on Networked
Systems Design and Implementation (NSDI).
6. Blelloch, G. E. 1989. Scans as primitive parallel operations. IEEE
Trans. Comput. C- 38, 11.
Chu, C.-T., Kim, S. K., Lin, Y. A., Yu, Y., Bradski, G., Ng, A., and
Olukotun, K. 2006. Map-Reduce for machine learning on multicore.
In Proceedings of Neural Information Processing Systems Conference
(NIPS). Vancouver, Canada.
8. Dean, J. and Ghemawat, S. 2004. MapReduce: Simplified data processing on large clusters. In Proceedings of Operating Systems Design
and Implementation (OSDI ). San Francisco, CA. 137-150.
Fox, A., Gribble, S. D., Chawathe, Y., Brewer, E. A., and Gauthier, P.
1997. Cluster-based scalable network services. In Proceedings of the
16th ACM Symposium on Operating System Principles. Saint-Malo,
10. Ghemawat, S., Gobioff, H., and Leung, S.-T. 2003. The Google file
system. In 19th Symposium on Operating Systems Principles. Lake
George, NY. 29-43.
11. Gorlatch, S. 1996. Systematic efficient parallelization of scan and
other list homomorphisms. In L. Bouge, P. Fraigniaud, A. Mignotte,
and Y. Robert, Eds. Euro-Par’ 96. Parallel Processing, Lecture Notes in
Computer Science, vol. 1124. Springer-Verlag. 401-408
12. Gray, J. Sort benchmark home page. http://research.microsoft.com/
13. Huston, L., Sukthankar, R., Wickremesinghe, R., Satyanarayanan, M.,
Ganger, G. R., Riedel, E., and Ailamaki, A. 2004. Diamond: A storage
architecture for early discard in interactive search. In Proceedings of
the 2004 USENIX File and Storage Technologies FAST Conference.
14. Ladner, R. E., and Fischer, M. J. 1980. Parallel prefix computation.
JACM 27, 4. 831-838.
15. Rabin, M. O. 1989. Efficient dispersal of information for security,
load balancing and fault tolerance. JACM 36, 2. 335-348.
16. Ranger, C., Raghuraman, R., Penmetsa, A., Bradski, G., and
Kozyrakis, C. 2007. Evaluating mapreduce for multi-core and multiprocessor systems. In Proceedings of 13th International Symposium on
High-Performance Computer Architecture (HPCA). Phoenix, AZ.
17. Riedel, E., Faloutsos, C., Gibson, G. A., and Nagle, D. Active disks
for large-scale data processing. IEEE Computer. 68-74.
7 Related Work
Many systems have provided restricted programming models and used
the restrictions to parallelize the computation automatically. For example,
an associative function can be computed over all prefixes of an N element
array in log N time on N processors using parallel prefix computations [ 6, 3.
11, 14]. MapReduce can be considered a simplification and distillation of
some of these models based on our experience with large real-world computations. More significantly, we provide a fault-tolerant implementation
that scales to thousands of processors. In contrast, most of the parallel
processing systems have only been implemented on smaller scales and
leave the details of handling machine failures to the programmer. 5.
Our locality optimization draws its inspiration from techniques
such as active disks [ 13, 17], where computation is pushed into processing elements that are close to local disks, to reduce the amount of
data sent across I/O subsystems or the network.
The sorting facility that is a part of the MapReduce library is similar in operation to NOW-Sort [ 3]. Source machines (map workers) 7.
partition the data to be sorted and send it to one of R reduce workers.
Each reduce worker sorts its data locally (in memory if possible). Of
course NOW-Sort does not have the user-definable map and reduce
functions that make our library widely applicable.
BAD-FS [ 5] and TACC [ 9] are two other systems that rely on reexecution as a mechanism for implementing fault tolerance.