Linux CentOS 5.4 with the Cloudera
CDH 4.7.0 distribution of Hadoop 1.0.3.
Included in that distribution is the Hadoop-examples.jar
file that contains the code for both
the TeraGen and TeraSort MapReduce jobs. Whirr can read the desired
configuration from a properties file,
as well as receiving properties passed
from the command line. This allowed
permanent storage of the parameters
that did not change (for example, the
operating system version and Amazon
Three sets of performance metrics
were gathered:
• The elapsed time for the TeraSort
job (excluding the TeraGen job).
• Hadoop-generated job data files.
• Linux performance metrics.
Of these, the most important metric was the elapsed time, which was
recorded using the POSIX timestamp
in milliseconds (since EC2 hardware
supports it) via the shell command illustrated in Figure 4.
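As a rough illustration of the same idea in Python rather than the shell command of Figure 4, the sketch below brackets a command with POSIX epoch timestamps in milliseconds and reports the difference. The helper name timed_run and the placeholder command are assumptions made for this example, not part of the original setup.

```python
import subprocess
import sys
import time


def timed_run(cmd):
    """Run cmd and return (exit_code, elapsed_ms) from POSIX epoch timestamps."""
    start_ms = time.time_ns() // 1_000_000  # epoch time in milliseconds
    result = subprocess.run(cmd, check=False)
    end_ms = time.time_ns() // 1_000_000
    return result.returncode, end_ms - start_ms


if __name__ == "__main__":
    # Placeholder command; a real run would launch the TeraSort job here.
    code, elapsed = timed_run([sys.executable, "-c", "pass"])
    print(f"exit={code} elapsed_ms={elapsed}")
```

Only the sort job is wrapped by the two timestamps, so any data-generation step would be run, untimed, beforehand.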
Runtime performance metrics (for
example, memory usage, disk IO rates,
and processor utilization) were captured from each EC2 node using the
resident Linux performance tools uptime, vmstat, and iostat. The performance data was parsed and appended
to a file every two seconds.
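The parsing step might look something like the following sketch, which pulls the three load averages out of one line of uptime output and formats a timestamped record suitable for appending to a log. The function names, the record layout, and the sample line are assumptions for illustration; the actual scripts used in the experiments are not shown in the article.

```python
import re
import time

# Matches the trailing "load average: a, b, c" field printed by uptime.
LOAD_RE = re.compile(r"load average: ([\d.]+), ([\d.]+), ([\d.]+)")


def parse_uptime(line):
    """Return the 1-, 5-, and 15-minute load averages from uptime output."""
    match = LOAD_RE.search(line)
    if match is None:
        raise ValueError(f"no load averages found in: {line!r}")
    return tuple(float(x) for x in match.groups())


def format_record(line, now=None):
    """Build one log record: epoch seconds followed by the three load averages."""
    ts = int(time.time()) if now is None else now
    one, five, fifteen = parse_uptime(line)
    return f"{ts} {one:.2f} {five:.2f} {fifteen:.2f}"


# Hypothetical sample line, not taken from the experiments.
sample = " 10:15:01 up 5 days,  2:03,  1 user,  load average: 0.15, 0.10, 0.05"
print(format_record(sample, now=0))
```

A sampler would call this in a loop with a two-second sleep and append each record to the per-node file.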
A sign of perpetual motion. Figure 5
shows the TeraSort speedup data (dots)
together with the fitted USL scalability
curve (blue). The linear bound (dashed
line) is included for reference. That the
speedup data lies on or above the linear
bound provides immediate visual evidence that scalability is indeed superlinear. Rather than a linear fit,
the USL regression curve exhibits a convex
trend near the origin that is consistent
with the generic superlinear profile
in Figure 1.
The entirely unexpected outcome
is that the USL contention coefficient
develops a negative value: σ = −0.0288.
This result also contradicts the assertion⁸
that both σ and κ must be positive
for physical consistency, the likely
source of the criticism that the USL
failed with superlinear speedup data.
As explained earlier, a positive
value of σ is associated with contention
for shared resources. For example,
the same processor that executes
user-level tasks may also need to
accommodate operating-system tasks
such as IO requests. The processor
capacity is consumed by work other
than the application itself. Therefore,
the application throughput is less
than the expected linear bang for the
buck.
Capacity consumption (σ > 0) accounts
for the sublinear scalability
component in Figure 2b. Conversely, σ
< 0 can be identified with some kind of
capacity boost. This interpretation will
be explained shortly.
Additionally, the (positive) coherency
coefficient κ = 0.000447 means
there must be a peak value in the
speedup, which the USL predicts as
Smax = 73.48, occurring at p = 48 nodes.
More significantly, it also means the
USL curve must cross the linear bound
and enter the payback region shown
in Figure 6.
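The peak can be checked numerically with a short sketch. It assumes equation 2 has the standard USL form, S(p) = p / (1 + σ(p − 1) + κ p(p − 1)), for which setting the derivative to zero gives the peak location p* = sqrt((1 − σ)/κ). The function names are illustrative only. With the rounded coefficients quoted above, the result is p* ≈ 48 and a maximum speedup of roughly 73.3, close to the reported 73.48; the small gap presumably reflects rounding of the fitted coefficients.

```python
import math


def usl_speedup(p, sigma, kappa):
    """Speedup under the (assumed) standard USL form of equation 2."""
    return p / (1.0 + sigma * (p - 1) + kappa * p * (p - 1))


def usl_peak(sigma, kappa):
    """Return (p_star, s_max): the node count and value of the speedup peak."""
    p_star = math.sqrt((1.0 - sigma) / kappa)
    return p_star, usl_speedup(p_star, sigma, kappa)


# Rounded coefficients quoted in the text for the Figure 5 fit.
sigma, kappa = -0.0288, 0.000447
p_star, s_max = usl_peak(sigma, kappa)
print(f"p* = {p_star:.2f}, Smax = {s_max:.2f}")
```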
The USL model predicts this crossover
from the superlinear region to
the payback region must occur for
the following reason. Although the
magnitude of σ is small, it is also
multiplied by (p − 1) in equation 2.
Therefore, as the number of nodes increases,
the difference, 1 − |σ|(p − 1), in the
denominator of equation 2 becomes
progressively smaller such that Sp is
eventually dominated by the coherency
term, κ p(p − 1).
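The location of the crossover follows from the same algebra. Assuming equation 2 is the standard USL form, the curve meets the linear bound S(p) = p when σ(p − 1) + κ p(p − 1) = 0, that is, when (p − 1)(σ + κ p) = 0. Apart from the trivial root p = 1, this gives p = −σ/κ. The sketch below, with illustrative function names, evaluates that expression for the Figure 5 coefficients and confirms that the speedup is above the linear bound just before the crossover and below it just after.

```python
def usl_speedup(p, sigma, kappa):
    """Speedup under the (assumed) standard USL form of equation 2."""
    return p / (1.0 + sigma * (p - 1) + kappa * p * (p - 1))


def crossover_nodes(sigma, kappa):
    """Node count at which the USL curve returns to the linear bound S(p) = p."""
    if sigma >= 0 or kappa <= 0:
        raise ValueError("a crossover requires sigma < 0 and kappa > 0")
    return -sigma / kappa


# Rounded coefficients quoted in the text for the Figure 5 fit.
sigma, kappa = -0.0288, 0.000447
p_cross = crossover_nodes(sigma, kappa)
print(f"crossover at p = {p_cross:.1f} nodes")
for p in (64, 65):
    print(p, usl_speedup(p, sigma, kappa) > p)  # above the linear bound?
```

With these rounded values the crossover falls between 64 and 65 nodes, beyond the speedup peak near p = 48.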
Figure 7 includes additional speedup measurements (squares). The fitted
USL coefficients are now significantly
smaller than those in Figure 5. The maximum speedup, Smax, therefore is about
30% higher than predicted with the data
Figure 6. Superlinearity and its associated payback region (see Figure 1).
Figure 7. USL analysis of p ≤ 150 BigMem nodes (solid blue curve) with Figure 5 (dashed
blue curve) inset for comparison.
Table 2. Runtime error analysis.
T1 = 13057 ± 606 seconds (r.e. 5%)
T2 = 6347 ± 541 seconds (r.e. 9%)
T3 = 4444 ± 396 seconds (r.e. 9%)
T5 = 2065 ± 147 seconds (r.e. 7%)
T10 = 893 ± 27 seconds (r.e. 3%)
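The relative errors (r.e.) in Table 2 can be reproduced by dividing each uncertainty by its mean runtime. The sketch below does so and, as an additional assumption not stated in the table, also computes the conventional speedup ratio T1/Tp from the mean runtimes. The dictionary layout and function names are illustrative.

```python
# Mean runtime and uncertainty in seconds, keyed by node count p (from Table 2).
runtimes = {
    1: (13057, 606),
    2: (6347, 541),
    3: (4444, 396),
    5: (2065, 147),
    10: (893, 27),
}


def relative_error_pct(mean, err):
    """Uncertainty expressed as a percentage of the mean."""
    return 100.0 * err / mean


def speedup(t1, tp):
    """Conventional speedup ratio T1 / Tp (assumed definition)."""
    return t1 / tp


t1 = runtimes[1][0]
for p, (mean, err) in runtimes.items():
    re_pct = round(relative_error_pct(mean, err))
    print(f"p={p:2d}  r.e.={re_pct}%  speedup={speedup(t1, mean):.2f}")
```

Because the runtimes carry errors of several percent, speedup values that sit only slightly above the linear bound should be read with that uncertainty in mind.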