References
1. Beck, K., Beedle, M., Van Bennekum, A., Cockburn, A.,
Cunningham, W., Fowler, M. . . . and Kern, J. Manifesto
for Agile Software Development, 2001; https://
agilemanifesto.org/
2. Bhandarkar, D. and Clark, D. W. Performance from
architecture: Comparing a RISC and a CISC with
similar hardware organization. In Proceedings of the
Fourth International Conference on Architectural
Support for Programming Languages and Operating
Systems (Santa Clara, CA, Apr. 8–11). ACM Press,
New York, 1991, 310–319.
3. Chaitin, G. et al. Register allocation via coloring.
Computer Languages 6, 1 (Jan. 1981), 47–57.
4. Dally, W. et al. Hardware-enabled artificial intelligence.
In Proceedings of the Symposia on VLSI Technology
and Circuits (Honolulu, HI, June 18–22). IEEE Press,
2018, 3–6.
5. Dennard, R. et al. Design of ion-implanted MOSFETs
with very small physical dimensions. IEEE Journal of
Solid State Circuits 9, 5 (Oct. 1974), 256–268.
6. Emer, J. and Clark, D. A characterization of processor
performance in the VAX-11/780. In Proceedings
of the 11th International Symposium on Computer
Architecture (Ann Arbor, MI, June). ACM Press, New
York, 1984, 301–310.
7. Fisher, J. The VLIW machine: A multiprocessor for
compiling scientific code. Computer 17, 7 (July 1984),
45–53.
8. Fitzpatrick, D. T., Foderaro, J.K., Katevenis, M.G.,
Landman, H. A., Patterson, D. A., Peek, J. B., Peshkess,
Z., Séquin, C.H., Sherburne, R. W., and Van Dyke, K.S. A
RISCy approach to VLSI. ACM SIGARCH Computer
Architecture News 10, 1 (Jan. 1982), 28–32.
9. Flynn, M. Some computer organizations and their
effectiveness. IEEE Transactions on Computers 21, 9
(Sept. 1972), 948–960.
10. Fowers, J. et al. A configurable cloud-scale DNN
processor for real-time AI. In Proceedings of the
45th ACM/IEEE Annual International Symposium on
Computer Architecture (Los Angeles, CA, June 2–6).
IEEE, 2018, 1–14.
11. Hennessy, J. and Patterson, D. A New Golden Age for
Computer Architecture. Turing Lecture delivered at
the 45th ACM/IEEE Annual International Symposium
on Computer Architecture (Los Angeles, CA, June 4,
2018); http://iscaconf.org/isca2018/turing_lecture.html;
https://www.youtube.com/watch?v=3LVeEjsn8Ts
12. Hennessy, J., Jouppi, N., Przybylski, S., Rowen,
C., Gross, T., Baskett, F., and Gill, J. MIPS: A
microprocessor architecture. ACM SIGMICRO
Newsletter 13, 4 (Oct. 5, 1982), 17–22.
13. Hennessy, J. and Patterson, D. Computer Architecture:
A Quantitative Approach. Morgan Kaufmann, San
Francisco, CA, 1989.
14. Hill, M. A primer on the Meltdown and Spectre
hardware security design flaws and their important
implications. Computer Architecture Today blog (Feb.
15, 2018); https://www.sigarch.org/a-primer-on-the-meltdown-spectre-hardware-security-design-flaws-and-their-important-implications/
15. Hopkins, M. A critical look at IA-64: Massive
resources, massive ILP, but can it deliver?
Microprocessor Report 14, 2 (Feb. 7, 2000), 1–5.
16. Horowitz, M. Computing’s energy problem (and what
we can do about it). In Proceedings of the IEEE
International Solid-State Circuits Conference Digest of
Technical Papers (San Francisco, CA, Feb. 9–13). IEEE
Press, 2014, 10–14.
17. Jouppi, N., Young, C., Patil, N., and Patterson, D. A
domain-specific architecture for deep neural networks.
Commun. ACM 61, 9 (Sept. 2018), 50–58.
18. Jouppi, N. P., Young, C., Patil, N., Patterson, D., Agrawal,
G., Bajwa, R., Bates, S., Bhatia, S., Boden, N., Borchers,
A., and Boyle, R. In-datacenter performance analysis
of a tensor processing unit. In Proceedings of the
44th ACM/IEEE Annual International Symposium on
Computer Architecture (Toronto, ON, Canada, June
24–28). IEEE Computer Society, 2017, 1–12.
19. Kloss, C. Nervana Engine Delivers Deep Learning at
Ludicrous Speed. Intel blog, May 18, 2016;
https://ai.intel.com/nervana-engine-delivers-deep-learning-at-ludicrous-speed/
20. Knuth, D. The Art of Computer Programming:
Fundamental Algorithms, First Edition. Addison
Wesley, Reading, MA, 1968.
21. Knuth, D. and Binstock, A. Interview with Donald
Knuth. InformIT, Hoboken, NJ, 2010; http://www.informit.com/articles/article.aspx
22. Kung, H. and Leiserson, C. Systolic arrays (for VLSI).
Chapter in Sparse Matrix Proceedings Vol. 1. Society
for Industrial and Applied Mathematics, Philadelphia,
PA, 1979, 256–282.
23. Lee, Y., Waterman, A., Cook, H., Zimmer, B., Keller,
B., Puggelli, A. . . . and Chiu, P. An agile approach to
building RISC-V microprocessors. IEEE Micro 36, 2
(Feb. 2016), 8–20.
24. Leiserson, C. et al. There’s plenty of room at the top.
To appear.
25. Metz, C. Big bets on A.I. open a new frontier for chip
start-ups, too. The New York Times (Jan. 14, 2018).
26. Moore, G. Cramming more components onto
integrated circuits. Electronics 38, 8 (Apr. 19, 1965),
56–59.
27. Moore, G. No exponential is forever: But ‘forever’ can
be delayed! [semiconductor industry]. In Proceedings
of the IEEE International Solid-State Circuits
Conference Digest of Technical Papers (San Francisco,
CA, Feb. 13). IEEE, 2003, 20–23.
28. Moore, G. Progress in digital integrated electronics. In
Proceedings of the International Electronic Devices
Meeting (Washington, D.C., Dec.). IEEE, New York,
1975, 11–13.
29. Nvidia. Nvidia Deep Learning Accelerator (NVDLA),
2017; http://nvdla.org/
30. Patterson, D. How Close is RISC-V to RISC-I?
ASPIRE blog, June 19, 2017; https://aspire.eecs.
berkeley.edu/2017/06/how-close-is-risc-v-to-risc-i/
31. Patterson, D. RISCy history. Computer Architecture
Today blog, May 30, 2018; https://www.sigarch.org/
riscy-history/
32. Patterson, D. and Waterman, A. The RISC-V Reader:
An Open Architecture Atlas. Strawberry Canyon LLC,
San Francisco, CA, 2017.
33. Rowen, C., Przybylski, S., Jouppi, N., Gross, T.,
Shott, J., and Hennessy, J. A pipelined 32b NMOS
microprocessor. In Proceedings of the IEEE
International Solid-State Circuits Conference Digest
of Technical Papers (San Francisco, CA, Feb. 22–24)
IEEE, 1984, 180–181.
34. Schwarz, M., Schwarzl, M., Lipp, M., and Gruss, D.
Netspectre: Read arbitrary memory over network. arXiv
preprint, 2018; https://arxiv.org/pdf/1807.10535.pdf
35. Sherburne, R., Katevenis, M., Patterson, D., and Séquin,
C. A 32b NMOS microprocessor with a large register
file. In Proceedings of the IEEE International Solid-State Circuits Conference (San Francisco, CA, Feb.
22–24). IEEE Press, 1984, 168–169.
36. Thacker, C., McCreight, E., and Lampson, B. Alto:
A Personal Computer. CSL-79-11, Xerox Palo Alto
Research Center, Palo Alto, CA, Aug. 7, 1979; http://
people.scs.carleton.ca/~soma/distos/fall2008/alto.pdf
37. Turner, P., Parseghian, P., and Linton, M. Protecting
against the new ‘L1TF’ speculative vulnerabilities.
Google blog, Aug. 14, 2018; https://cloud.google.com/
blog/products/gcp/protecting-against-the-new-l1tf-
speculative-vulnerabilities
38. Van Bulck, J. et al. Foreshadow: Extracting the keys
to the Intel SGX kingdom with transient out-of-order
execution. In Proceedings of the 27th USENIX Security
Symposium (Baltimore, MD, Aug. 15–17). USENIX
Association, Berkeley, CA, 2018.
39. Wilkes, M. and Stringer, J. Micro-programming and the
design of the control circuits in an electronic digital
computer. Mathematical Proceedings of the Cambridge
Philosophical Society 49, 2 (Apr. 1953), 230–238.
40. XLA Team. XLA – TensorFlow. Mar. 6, 2017; https://
developers.googleblog.com/2017/03/xla-tensorflow-
compiled.html
John L. Hennessy (hennessy@stanford.edu) is
Past-President of Stanford University, Stanford, CA, USA,
and is Chairman of Alphabet Inc., Mountain View, CA, USA.
David A. Patterson (pattrsn@berkeley.edu) is the Pardee
Professor of Computer Science, Emeritus at the University
of California, Berkeley, CA, USA, and a Distinguished
Engineer at Google, Mountain View, CA, USA.
© 2019 ACM 0001-0782/19/2 $15.00
back to measure, run real programs,
and show to their friends and family is
a great joy of hardware design.
Many researchers assume they must
stop short because fabricating chips is
unaffordable. When designs are small,
they are surprisingly inexpensive. Architects can
order 100 1-mm² chips for only $14,000. In 28 nm,
1 mm² holds millions of transistors, enough area for
both a RISC-V processor and an NVDLA accelerator.
The outermost level is expensive
if the designer aims to build a large chip,
but an architect can demonstrate many
novel ideas with small chips.
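A minimal back-of-the-envelope sketch of the cost and area arithmetic behind these figures, assuming a 28-nm logic density of roughly three million transistors per mm² (an illustrative assumption, not a number from the article):

    # Illustrative arithmetic only; the $14,000-for-100-chips figure is from
    # the text above, while the 28-nm transistor density is an assumed,
    # order-of-magnitude value.
    total_cost_usd = 14_000        # quoted price for a batch of small chips
    num_chips = 100                # 1-mm^2 dies in that batch
    cost_per_chip = total_cost_usd / num_chips       # -> $140 per die

    assumed_density_per_mm2 = 3_000_000   # transistors per mm^2 at 28 nm (assumption)
    die_area_mm2 = 1.0
    transistor_budget = assumed_density_per_mm2 * die_area_mm2

    print(f"Cost per chip: ${cost_per_chip:.0f}")
    print(f"Approximate transistor budget: {transistor_budget:,.0f}")

At these prices a small research chip costs on the order of a hundred dollars per die, which is why modest designs are within reach of academic groups.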
Conclusion
“The darkest hour is just before the
dawn.” —Thomas Fuller, 1650
To benefit from the lessons of history, architects must appreciate that
software innovations can also inspire
architects, that raising the abstraction
level of the hardware/software interface
yields opportunities for innovation, and
that the marketplace ultimately settles
computer architecture debates. The
iAPX-432 and Itanium illustrate how
architecture investment can exceed returns, while the S/360, 8086, and ARM
deliver high annual returns lasting decades with no end in sight.
The end of Dennard scaling and
Moore’s Law and the deceleration of performance gains for standard microprocessors are not problems that must be
solved but facts that, recognized, offer
breathtaking opportunities. High-level,
domain-specific languages and architectures, freeing architects from the
chains of proprietary instruction sets,
along with demand from the public for
improved security, will usher in a new
golden age for computer architects.
Aided by open source ecosystems, agilely developed chips will convincingly
demonstrate advances and thereby
accelerate commercial adoption. The
ISA philosophy of the general-purpose
processors in these chips will likely be
RISC, which has stood the test of time.
Expect the same rapid improvement as
in the last golden age, but this time in
terms of cost, energy, and security, as
well as in performance.
The next decade will see a Cambrian explosion of novel computer architectures, meaning exciting times for
computer architects in academia and
in industry.
To watch Hennessy and
Patterson’s full Turing Lecture, see
https://www.acm.org/hennessy-patterson-turing-lecture