complicated tasks (such as those listed earlier) than the selection and aggregation
that are SQL’s forte.
Structured Data and Schemas
Pavlo et al. did raise a good point that
schemas are helpful in allowing multiple applications to share the same data.
For example, consider the following
schema from the comparison paper:
CREATE TABLE Rankings (
  pageURL VARCHAR(100) PRIMARY KEY,
  pageRank INT,
  avgDuration INT );
The corresponding Hadoop benchmarks in the comparison paper used
an inefficient and fragile textual format with different attributes separated
by vertical bar characters:
137|http://www.somehost.com/index.html|602
In contrast to ad hoc, inefficient
formats, virtually all MapReduce operations at Google read and write data
in the Protocol Buffer format.8 A high-level language describes the input and
output types, and compiler-generated code is used to hide the details of
encoding/decoding from application code. The corresponding protocol buffer
description for the Rankings data
would be:
message Rankings {
  required string pageurl = 1;
  required int32 pagerank = 2;
  required int32 avgduration = 3;
}
The following Map function fragment processes a Rankings record:
Rankings r = new Rankings();
r.parseFrom(value);
if (r.getPagerank() > 10) { ... }
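For context only, such a fragment might sit inside a map function for the comparison paper's selection task roughly as follows. This is an illustrative sketch: the Emitter interface and the map() signature are hypothetical stand-ins, not the actual Google MapReduce or Hadoop API, and Rankings is the class generated from the message above.

// Hypothetical sketch: Emitter and map() are illustrative stand-ins.
interface Emitter {
  void emit(String key, int value);
}

class SelectionMapper {
  // Emits (pageURL, pageRank) for records whose pageRank exceeds 10.
  void map(byte[] recordBytes, Emitter out) {
    Rankings r = new Rankings();
    r.parseFrom(recordBytes);   // decode the protocol buffer record
    if (r.getPagerank() > 10) {
      out.emit(r.getPageurl(), r.getPagerank());
    }
  }
}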
The protocol buffer framework
allows types to be upgraded (in constrained ways) without requiring existing applications to be changed (or even
recompiled or rebuilt). This level of
schema support has proved sufficient
for allowing thousands of Google engineers to share the same evolving data
types.
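To make the upgrade path concrete, here is one hypothetical evolution of the Rankings message (our example, not one from the paper): a new optional field is added under a fresh tag number, and existing binaries continue to read and write Rankings records unchanged, skipping the field they do not know about.

message Rankings {
  required string pageurl = 1;
  required int32 pagerank = 2;
  required int32 avgduration = 3;
  // Hypothetical new field: old readers skip this unknown tag, and the
  // field is simply unset in records written by older applications.
  optional string language = 4;
}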
Furthermore, the implementation
of protocol buffers uses an optimized
binary representation that is more
compact and much faster to encode
and decode than the textual formats
used by the Hadoop benchmarks in the
comparison paper. For example, the
automatically generated code to parse
a Rankings protocol buffer record
runs in 20 nanoseconds per record as
compared to the 1,731 nanoseconds
required per record to parse the textual input format used in the Hadoop
benchmark mentioned earlier. These
measurements were obtained on a JVM
running on a 2.4GHz Intel Core 2 Duo.
The Java code fragments used for the
benchmark runs were:
// Fragment 1: protocol buffer parsing
for (int i = 0; i < numIterations; i++) {
  rankings.parseFrom(value);
  pagerank = rankings.getPagerank();
}

// Fragment 2: text format parsing (extracted from Benchmark1.java
// from the source code posted by Pavlo et al.)
for (int i = 0; i < numIterations; i++) {
  String data[] = value.toString().split("\\|");
  pagerank = Integer.valueOf(data[0]);
}
Given the roughly 80-fold difference in this record-parsing benchmark, we suspect the absolute numbers for the Hadoop benchmarks in
the comparison paper are inflated and
cannot be used to reach conclusions
about fundamental differences in the
performance of MapReduce and parallel DBMS.
Fault Tolerance
The MapReduce implementation uses
a pull model for moving data between
mappers and reducers, as opposed to
a push model where mappers write directly to reducers. Pavlo et al. correctly
pointed out that the pull model can result in the creation of many small files
and many disk seeks to move data between mappers and reducers. Imple-