Ease of programming, or programmability, is a necessary condition for the success of any many-core platform, and teachability is a necessary condition for programmability and, in turn, for productivity. The teachability of the XMT approach has been demonstrated extensively; for example, since 2007 more than 100 students in grades K–12 have learned to program XMT, including in two magnet programs: Montgomery Blair High School, Silver Spring, MD, and Thomas Jefferson High School for Science and Technology, Alexandria, VA.22 Others are Baltimore Polytechnic High School, where 70% of the students are African American, and a summer workshop for middle-school students from underrepresented groups in Montgomery County, MD, public schools.
In the fall of 2010, I conducted another experiment, this one jointly with Professor David Padua of the University of Illinois, Urbana-Champaign, via video teleconferencing, using OpenMP and XMTC; the XMTC programming assignments ran on the XMT 64-processor FPGA machine. Our hope was to produce a meaningful comparison of program-development time from the 30 participating Illinois students. The topics and problems covered in the PRAM/XMT part of the course were significantly more advanced than those covered with OpenMP alone. Having sought to demonstrate the importance of teachability from middle school on up, I strongly recommend that it become a standard benchmark for evaluating many-core hardware platforms.
Blake et al.4 analyzed current desktop/laptop applications whose goal was better performance, reporting that the applications tend to comprise many threads, though few of them are used concurrently; consequently, the applications fail to translate the increasing thread-level parallelism in hardware into performance gains. This problem is not surprising, given that most programmers cannot handle multi-core microprocessors. In contrast, guided by the simple ICE abstraction and by the rich PRAM knowledge base to find parallelism, XMT programmers are able to represent that parallelism using a type of threading the XMT hardware is engineered to exploit for performance.
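To make the ICE style of expressing parallelism concrete, here is a minimal sketch of a classic PRAM exercise, array compaction, written in plain C. The comments indicate how the same logic would read in XMTC, where the loop becomes a spawn block of independent virtual threads and output slots are claimed via the prefix-sum primitive; the function name and details are illustrative, not taken from the article.

```c
/* Hypothetical sketch: compact the nonzero elements of `in` into `out`.
 * ICE view: each iteration is logically independent; the only
 * coordination is claiming an output slot. In XMTC (as a sketch),
 * the loop would be written spawn(0, n-1) { ... } with thread ID $,
 * and the slot claim would use the prefix-sum primitive ps(1, base),
 * which atomically returns the old value of base and increments it. */
int compact_nonzero(const int *in, int n, int *out) {
    int base = 0;                  /* shared counter; ps() target in XMTC */
    for (int i = 0; i < n; i++) {  /* spawn(0, n-1) block in XMTC        */
        if (in[i] != 0) {
            int slot = base;       /* slot = ps(1, base) in XMTC         */
            base += 1;
            out[slot] = in[i];
        }
    }
    return base;                   /* number of nonzero elements copied  */
}
```

The point of the sketch is that the programmer only asserts independence of the iterations; mapping the resulting virtual threads onto hardware is the platform's job.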
Acknowledgment
This work is supported by the National Science Foundation under grant
0325393.
References
1. Adve, S. et al. Parallel Computing Research at Illinois: The UPCRC Agenda. White Paper. University of Illinois, Urbana-Champaign, IL, 2008; http://www.upcrc.illinois.edu/UPCRC_Whitepaper.pdf
2. Asanovic, K. et al. The Landscape of Parallel
Computing Research: A View from Berkeley. Technical
Report UCB/EECS-2006-183. University of California,
Berkeley, 2006; http://www.eecs.berkeley.edu/Pubs/
TechRpts/2006/EECS-2006-183.pdf
3. Balkan, A., Horak, M., Qu, G., and Vishkin, U. Layout-accurate design and implementation of a high-throughput interconnection network for single-chip
parallel processing. In Proceedings of the 15th Annual
IEEE Symposium on High Performance Interconnects
(Stanford, CA, Aug. 22–24). IEEE Press, Los Alamitos,
CA, 2007.
4. Blake, G., Dreslinski, R., Flautner, K., and Mudge,
T. Evolution of thread-level parallelism in desktop
applications. In Proceedings of the 37th Annual
International Symposium on Computer Architecture
(Saint-Malo, France, June 19–23). ACM Press, New
York, 2010, 302–313.
5. Borkar, S. et al. Platform 2015: Intel Processor and
Platform Evolution for the Next Decade. White Paper.
Intel, Santa Clara, CA, 2005; http://epic.hpi.uni-
potsdam.de/pub/Home/TrendsAndConceptsII2010/
HW_Trends_borkar_2015.pdf
6. Caragea, G., Tzannes, A., Keceli, F., Barua, R., and
Vishkin, U. Resource-aware compiler prefetching for
many-cores. In Proceedings of the Ninth International
Symposium on Parallel and Distributed Computing
(Istanbul, Turkey, July 7–9). IEEE Press, Los
Alamitos, CA, 2010, 133–140.
7. Caragea, G., Keceli, F., Tzannes, A., and Vishkin, U.
General-purpose vs. GPU: Comparison of many-cores on irregular workloads. In Proceedings of the
Second Usenix Workshop on Hot Topics in Parallelism
(University of California, Berkeley, June 14–15).
Usenix, Berkeley, CA, 2010.
8. Caragea, G., Saybasili, B., Wen, X., and Vishkin, U.
Performance potential of an easy-to-program PRAM-on-chip prototype versus state-of-the-art processor.
In Proceedings of the 21st ACM SPAA Symposium on
Parallelism in Algorithms and Architectures (Calgary,
Canada, Aug. 11–13). ACM Press, New York, 2009,
163–165.
9. Cormen, T., Leiserson, C., Rivest, R., and Stein, C.
Introduction to Algorithms, Third Edition. MIT Press,
Cambridge, MA, 2009.
10. Culler, D. and Singh, J. Parallel Computer Architecture:
A Hardware/Software Approach. Morgan-Kaufmann,
San Francisco, CA, 1999.
Uzi Vishkin (vishkin@umd.edu) is a professor in the University of Maryland Institute for Advanced Computer Studies (http://www.umiacs.umd.edu/~vishkin) and the Electrical and Computer Engineering Department, College Park, MD.