I also recommend Accelerate: The
Science of Lean Software and DevOps:
Building and Scaling High-Performing
Technology Organizations, by Nicole
Forsgren, Jez Humble, and Gene Kim. 6
It provides a CEO view of the science
that makes DevOps work.
Lynn Ballard, Nicole Forsgren, Tim
Bell, Lance A. Brown, Jennifer Davis,
Trent R. Hein, Mark Henderson, Steve
VanDevender, Harald Wagener.
The Age of Corporate
Open Source Enlightenment
Managing Technical Debt
Why Cloud Computing Will Never Be Free
1. Andreessen, M. Why software is eating the world.
The Wall Street J. (Aug. 20, 2011); https://on.wsj.
2. Buranyi, S. Rise of the racist robots—how AI is
learning all our worst impulses. The Guardian (Aug. 8,
3. Dewdney, A.K. Computer recreations: of worms,
viruses and core war. Scientific American 260, 3
4. Dickson, C. L. Why your manager loves technical
debt and what to do about it. In Proceedings of
the Usenix LISA Conference, 2015; https://www.
5. Fong-Jones, L. Twitter, 2018; https://twitter.com/
6. Forsgren, N., Humble, J., and Kim, G. Accelerate: The
Science of Lean Software and DevOps: Building and
Scaling High Performing Technology Organizations.
IT Revolution Press, 2018.
7. Fowler, M. Technical debt. Martinfowler.com, 2003;
8. Fram Oil Filter commercial. 1972; https://www.
9. Gartner. Gartner predicts, 2016; https://gtnr.it/2YLtXaF
10. Kim, G., Behr, K., and Spafford, G. The Phoenix Project:
A Novel About IT, DevOps, and Helping Your Business
Win. IT Revolution Press, 2013; https://www.goodreads.
11. Snover, J. Digital transformation: thriving through the
transition. DevOps Enterprise Summit, 2018; https://
12. Wikipedia. High availability; https://en.wikipedia.org/
13. Wikipedia. Non-functional requirement; https://
14. Zwieback, D. Beyond Blame: Learning from Failure and
Success. O’Reilly Media, 2015; https://www.goodreads.
Thomas A. Limoncelli is the SRE manager at Stack
Overflow Inc. in New York City. His books include The
Practice of System and Network Administration, The
Practice of Cloud System Administration, and Time
Management for System Administrators. He blogs at
EverythingSysadmin.com and tweets at @YesThatTom.
Copyright held by author/owner.
Publication rights licensed to ACM. $15.00
process that sucks to one that is awesome. In pathological cases the process is nonexistent—each silo improvising and guessing its way through
the process each time the crank turns.
The result of the First Way is improved
velocity and reduced defects. Things
• The second way is “amplify feedback
loops.” The focus is on improving communication among the people and
components within the system. Communication is a feedback loop and
should be bidirectional, responsive,
transparent, and blameless. A system
cannot work well without the ability
of the people involved to learn, share,
and grow. The Second Way is about
driving improvements that move you
from communication that is lacking
to communication that is comprehensive. In pathological cases communication is punished. The result of the
Second Way is understanding, empathy, and responsiveness to customers
both internal and external. Knowledge
is where it is needed.
• The third way is a “culture of continual experimentation and learning.”
This is where you focus on creating
a culture where you try new things,
evaluate the results, and decide
whether to keep or revert the change.
The Third Way is about going from a
culture where change is resisted to
one where change is constant. Risk
is accepted. Rituals reward teams
for taking risks and learning from
failure. In pathological cases the organization is calcified: change isn’t
possible, suggestions for change are
rejected or possibly punished. The
result of the Third Way is evolutionary change over time, punctuated by
major leaps and innovation.
Wait, there is more …
Indeed, there are volumes more that an
executive should know about software.
Sadly, cultural pressure and David Letterman say I should stop at 10.
Here are some bonus items:
Bonus item 1.
Uptime is never perfect.
Asking for 100% uptime makes you
look ignorant. Each order of magnitude
of improvement costs ludicrously
more than the level prior: 99.0% uptime
is fine for plenty of systems; 99.999% is
more expensive than you can afford.
Punishing people for downtime sends
the wrong message. Instead, ask “What
did we learn?” If your organization
learned something, the downtime was
a gift. Recommended reading: Beyond
Blame: Learning from Failure and Success,
by Dave Zwieback, 14 and Wikipedia’s
page on High Availability. 12
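To make the gap between those uptime targets concrete, here is a minimal sketch of the arithmetic. It converts an availability percentage into the downtime it permits per year. The function name and the 365-day year are my assumptions for illustration, and the sketch says nothing about cost, only about how little slack each extra "nine" leaves.

```python
# Downtime budget per year for a given uptime target.
# Assumes a 365-day year (leap years ignored).
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes


def downtime_minutes_per_year(availability_percent: float) -> float:
    """Return the minutes of downtime per year that an uptime target permits."""
    return MINUTES_PER_YEAR * (1 - availability_percent / 100)


for target in (99.0, 99.9, 99.99, 99.999):
    minutes = downtime_minutes_per_year(target)
    print(f"{target}% uptime permits about {minutes:,.1f} minutes of downtime per year")
```

At 99.0% the budget is roughly 5,256 minutes, or about 3.65 days per year. At 99.999% it is a little over 5 minutes per year, which leaves almost no room for a human to notice a problem and respond to it.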
Bonus item 2. Spammers
and abusers ruin everything.
Fighting spam and abuse is an arms
race. If you can build an online app in a
week, you will spend a year figuring out
how to prevent spammers from ruining
it. Google Sheets has anti-abuse detection because criminals make spreadsheets full of links to scams and then
send the links to people who think any
link that mentions Google is safe. The
amount of anti-abuse work required to
run online communities such as Twitter, Facebook, or other social networks
would make you cry.
Bonus item 3.
Malleability is expensive.
Some changes to software require a
new release, while other changes can
happen while the system is running.
The latter is expensive. It would be easy
for Facebook profiles to store only your
name, location, and a few other facts.
The ability to store any field is an expensive engineering task. Be careful when
asking for flexibility. It affects testing,
security, usability, and a lot more.
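The profile example can be sketched in a few lines. This is a hypothetical illustration, not how Facebook or any real system stores profiles: the class names, the limits, and the validation rules are all assumptions chosen to show where the extra work appears once the set of fields is open-ended.

```python
from dataclasses import dataclass, field


@dataclass
class FixedProfile:
    """A closed set of fields: each one is known in advance and easy to test."""
    name: str
    location: str


# Hypothetical limits; a real system would need many more rules than these.
MAX_FIELDS = 20
MAX_KEY_LENGTH = 32
MAX_VALUE_LENGTH = 200


@dataclass
class FlexibleProfile:
    """A profile that also accepts arbitrary user-defined fields."""
    name: str
    extra: dict[str, str] = field(default_factory=dict)

    def set_field(self, key: str, value: str) -> None:
        # Every check below exists only because the field set is open-ended.
        if not key.isidentifier():
            raise ValueError(f"invalid field name: {key!r}")
        if len(key) > MAX_KEY_LENGTH or len(value) > MAX_VALUE_LENGTH:
            raise ValueError("field name or value too long")
        if key not in self.extra and len(self.extra) >= MAX_FIELDS:
            raise ValueError("too many custom fields")
        self.extra[key] = value


profile = FlexibleProfile(name="Pat")
profile.set_field("favorite_editor", "vi")
```

Even this toy version needs rules for naming, size, and quantity, and each rule needs its own tests and error messages. The fixed version needs none of them.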
Software is eating the world. To do
their jobs well, executives and managers outside of technology will
benefit from understanding some
fundamentals of software and the
Further resources. If you are an executive who wants software acumen,
there are many resources. The first is
your VP of engineering or CTO. Ask the
person in one of these jobs what you
I also highly recommend reading
The Phoenix Project: A Novel about IT,
DevOps, and Helping Your Business Win,
by Gene Kim, Kevin Behr, and George
Spafford. 10 It provides an inside view
of IT and a practical understanding
of how to use DevOps techniques to