to whom all this might be most beneficial and opening possible directions
for future research.
Rationales
The current era of supercomputing is
referred to as the petascale era. The
next big HPC challenge is to break
the exascale barrier. However, due to technological limitations,16,23 there is growing agreement that reaching this goal will require a substantial shift toward hardware/software co-design.3,7,10,18 The driving idea behind custom dataflow supercomputers (such as the Maxeler solution) falls into this category: implement the computational dataflow in a custom hardware accelerator. To achieve maximum performance, the kernel of the application is compiled into a dataflow engine. The resulting array structure can be hundreds to thousands of pipeline stages deep. Ideally, in the static dataflow form, data can enter and exit each stage of the pipeline in every cycle. It is not possible to itemize precisely what portion of the improved performance is due to the dataflow concept and what portion is due to customization, because the dataflow concept is the vehicle that provides customization in hardware.
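The following Python sketch is a minimal illustration of this static dataflow idea, not Maxeler's actual tool flow: each stage of a software-modeled pipeline holds one value, all stages advance in lockstep each "cycle," and once the pipeline is full one result exits per cycle. The stage functions and data are hypothetical.

```python
# Minimal model of a static dataflow pipeline (illustrative only, not vendor code).
# Each stage applies one operation; all stages advance in lockstep each "cycle",
# so after an initial fill latency of len(stages) cycles, one result exits per cycle.

def run_static_pipeline(stages, inputs):
    """stages: list of one-argument functions, one per pipeline stage.
    inputs: iterable of values streamed into the first stage."""
    depth = len(stages)
    regs = [None] * depth                    # pipeline registers between stages
    results = []
    stream = list(inputs) + [None] * depth   # pad so the pipeline can drain

    for x in stream:
        if regs[-1] is not None:             # last stage emits one result per cycle
            results.append(regs[-1])
        for i in range(depth - 1, 0, -1):    # shift data forward through the stages
            regs[i] = stages[i](regs[i - 1]) if regs[i - 1] is not None else None
        regs[0] = stages[0](x) if x is not None else None
    return results

# Example kernel y = (2x + 1)^2 expressed as three pipeline stages.
stages = [lambda v: v * 2, lambda v: v + 1, lambda v: v ** 2]
print(run_static_pipeline(stages, range(5)))  # [1, 9, 25, 49, 81]
```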
These dataflow systems typically use a relatively slow clock, yet complete the overall dataflow more efficiently. The slow clock is not a problem for big data computations, since the speed of computation depends on pin throughput and on the size and bandwidth of the local memory inside the computational chip. Even when the dataflow is implemented using FPGA chips, so that the general-purpose interconnect of the FPGA slows the clock further, performance is not affected: pin throughput and local memory size/bandwidth remain the bottleneck, and the sheer magnitude of the dataflow parallelism overcomes the initial speed disadvantage. Therefore, if evaluation is oriented toward performance measures correlated with clock speed, these systems rate poorly; if it is oriented toward measures sensitive to the amount of data processed, they may rate very well. This is the first issue
of importance.
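A back-of-envelope comparison illustrates the point. All figures below are assumed for illustration (they are not measurements of any particular system): throughput scales with the number of results completed per cycle, not with clock frequency alone.

```python
# Throughput comparison with purely illustrative, assumed figures.

cpu_clock_hz      = 3.0e9   # fast-clock control-flow core (assumed)
cpu_results_cycle = 4       # results completed per cycle on the core (assumed)

dfe_clock_hz      = 2.0e8   # slow-clock dataflow engine, e.g., FPGA fabric (assumed)
dfe_results_cycle = 2048    # results exiting per cycle across a deep, wide pipeline (assumed)

cpu_throughput = cpu_clock_hz * cpu_results_cycle   # results per second
dfe_throughput = dfe_clock_hz * dfe_results_cycle

print(f"CPU: {cpu_throughput:.2e} results/s")
print(f"DFE: {dfe_throughput:.2e} results/s")
print(f"DFE/CPU: {dfe_throughput / cpu_throughput:.1f}x")

# With these assumptions the dataflow engine processes roughly 34x more data per
# second despite a 15x slower clock; a clock-centric metric would rank it far lower.
```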
The second important issue is that, due to their lower clock speed, systems based on this kind of dataflow approach consume less power, less space, and less money than systems driven by a fast clock. Weston24 shows that measured speedups of 31x and 37x were achieved while reducing the power consumption of a 1U compute node. Combining power and performance measures is a challenge that is already starting to be addressed by the Green 500 list. However, evaluating radically different models of computation such as dataflow remains to be addressed, especially in the context of total cost of ownership.
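One simple way to combine the two measures, in the spirit of the Green 500 list's performance-per-watt ranking, is to compare energy to solution. The runtimes and power figures in the sketch below are assumed for illustration only; the 31x speedup factor is the one cited above, but the other numbers are not taken from that measurement.

```python
# Energy-to-solution comparison; power and runtime figures are assumed for
# illustration and are not data from the cited 1U-node measurement.

baseline_time_s  = 370.0    # assumed runtime on a conventional fast-clock node
baseline_power_w = 400.0    # assumed power draw of that node

dataflow_speedup = 31.0     # speedup factor cited in the text
dataflow_power_w = 250.0    # assumed power draw of the dataflow node

dataflow_time_s   = baseline_time_s / dataflow_speedup
baseline_energy_j = baseline_time_s * baseline_power_w
dataflow_energy_j = dataflow_time_s * dataflow_power_w

print(f"Baseline energy : {baseline_energy_j / 1e3:.1f} kJ")
print(f"Dataflow energy : {dataflow_energy_j / 1e3:.1f} kJ")
print(f"Energy reduction: {baseline_energy_j / dataflow_energy_j:.1f}x")

# A metric such as results per joule (or FLOPS per watt) captures the combined
# advantage, whereas raw speedup or raw power alone does not.
```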
In addition to the aforementioned issues, the third issue of importance is that systems based on this kind of dataflow approach perform poorly on relatively simple benchmarks, which are typically not rich in the amount and variety of data structures. However, they perform fairly well on relatively sophisticated benchmarks that are rich in the amount and variety of data structures.
Justification
Performance of an HPC system depends on the adaptation of a computational algorithm to the problem, discretization of the problem, mapping
onto data structures and representable
numbers, the dataset size, and the suitability of the underlying architecture
compared to all other choices in the
spectrum of design options. In light of
all these choices, how does one evaluate a computer system’s suitability for
a particular task such as climate modeling or genetic sequencing?