be very difficult to get what you really
need, which is the ability to drill down
into the details of why a given response
time is as large as it is.
Unfortunately, people tend to measure what is easy to measure, which
is not necessarily what they should be
measuring. It’s a bug. Measures that
aren’t what you need, but that are easy
enough to obtain and seem related to
what you need, are called surrogate measures. Examples include subroutine call
counts and samples of subroutine call
execution durations.
I’m ashamed that I do not have
greater command over my native language than to say it this way, but here is
a catchy, modern way to express what I
think about surrogate measures:
Surrogate measures suck.
Here, unfortunately, “suck” doesn’t
mean “never work.” It would actually
be better if surrogate measures never
worked. Then nobody would use them.
The problem is that surrogate measures
work sometimes. This inspires people’s
confidence that the measures they are
using should work all the time, and
then they don’t. Surrogate measures
have two big problems.
• They can tell you that your system is OK when it is not. If a positive
result means “there is a problem,” that is what statisticians call a
type II error, the false negative.
• They can tell you that something is a problem when it is not. That is a
type I error, the false positive.
I have seen each type of error waste years of people’s time.
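To make the gap between a surrogate measure and the real thing concrete, here is a minimal sketch in Python. The trace, the subroutine names (parse, fetch, lock_wait), and every number in it are invented for illustration; nothing here comes from a real system. It compares subroutine call counts (a surrogate) against each subroutine's contribution to the task's response time (what you actually need).

```python
# Hypothetical trace of one slow task: one (subroutine, seconds) pair per call.
# All names and durations are invented for illustration only.
trace = (
    [("parse", 0.001)] * 1000     # many calls, each very fast
    + [("fetch", 0.002)] * 400    # fewer calls, still fast
    + [("lock_wait", 1.5)] * 2    # only two calls, each very slow
)


def call_counts(trace):
    """Surrogate measure: how many times each subroutine was called."""
    counts = {}
    for name, _ in trace:
        counts[name] = counts.get(name, 0) + 1
    return counts


def response_time_profile(trace):
    """Direct measure: total seconds each subroutine added to the task."""
    totals = {}
    for name, duration in trace:
        totals[name] = totals.get(name, 0.0) + duration
    return totals


counts = call_counts(trace)
profile = response_time_profile(trace)

busiest_by_count = max(counts, key=counts.get)
biggest_by_time = max(profile, key=profile.get)

print("most calls:        ", busiest_by_count, counts[busiest_by_count])
print("most response time:", biggest_by_time, round(profile[biggest_by_time], 3))
```

In this made-up trace, the call counts point at parse, while the response time profile shows that lock_wait accounts for most of the elapsed time. Someone tuning by call count would be working on the wrong subroutine.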
When the time comes to assess the
specifics of a real system, your success
is at the mercy of how good the measurements are that your system allows
you to obtain. I have been fortunate to
work in the Oracle market segment,
where the software vendor at the center
of our universe participates actively in
making it possible to measure systems
the right way. Getting application software developers to use the tools that Oracle offers is another story, but at least
the capabilities are there in the product.
Performance is a feature
Performance is a software application
feature, just like recognizing that it’s
convenient for a string of the form “Case
1234” to automatically hyperlink over to
case 1234 in your bug-tracking system.
(FogBugz, which is software that I enjoy
using, does this.) Performance, like any
other feature, does not just happen; it
has to be designed and built. To do
performance well, you have to think about
it, study it, write extra code for it, test it,
and support it.
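As one illustration of what "extra code" for performance might look like, the following Python sketch wraps the steps of a task in a small timer that accumulates elapsed time per named step. The class name TaskTimer, its methods, and the step names are hypothetical and chosen only for this example; it is not the instrumentation of any particular product.

```python
import time
from collections import defaultdict
from contextlib import contextmanager


class TaskTimer:
    """Accumulates wall-clock time and call counts per named step of a task."""

    def __init__(self):
        self.totals = defaultdict(float)  # step name -> total seconds
        self.calls = defaultdict(int)     # step name -> number of executions

    @contextmanager
    def step(self, name):
        start = time.perf_counter()
        try:
            yield
        finally:
            # Record the time even if the step raises an exception.
            self.totals[name] += time.perf_counter() - start
            self.calls[name] += 1

    def profile(self):
        """Return (name, seconds, calls) rows, largest contributor first."""
        rows = [(n, self.totals[n], self.calls[n]) for n in self.totals]
        return sorted(rows, key=lambda row: row[1], reverse=True)


# Usage: time two hypothetical steps of one task.
timer = TaskTimer()
with timer.step("compute"):
    sum(i * i for i in range(10_000))
with timer.step("sleep"):
    time.sleep(0.05)

rows = timer.profile()
for name, seconds, calls in rows:
    print(f"{name:10s} {seconds:8.4f}s {calls:4d} call(s)")
```

The point of the design is that the timing code ships with the application, so the breakdown of a task's response time is available when someone needs it rather than reconstructed afterward from surrogates.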
Acknowledgments
Thank you, Baron Schwartz, for the
email conversation in which you
thought I was helping you, but in actual
fact, you were helping me come to grips
with the need for introducing coherency delay more prominently into my
thinking. Thank you, Jeff Holt, Ron Crisco, Ken Ferlita, and Harold Palacio, for
the daily work that keeps the company
going and for the lunchtime conversations that keep my imagination going.
Thank you, Tom Kyte, for your continued
inspiration and support. Thank you,
Mark Farnham, for your helpful suggestions. And thank you, Neil Gunther, for
your patience and generosity in our ongoing discussions about knees.
Related articles
on queue.acm.org
You’re Doing It Wrong
Poul-Henning Kamp
http://queue.acm.org/detail.cfm?id=1814327
Performance Anti-Patterns
Bart Smaalders
http://queue.acm.org/detail.cfm?id=1117403
Hidden in Plain Sight
Bryan Cantrill
http://queue.acm.org/detail.cfm?id=1117401
Cary Millsap is the founder and president of Method R
Corporation (http://method-r.com), a company devoted to
software performance. He is the author (with Jeff Holt) of
Optimizing Oracle Performance (O’Reilly) and a co-author
of Oracle Insights: Tales of the Oak Table (Apress). He is
the former vice president of Oracle Corporation’s System
Performance Group and a co-founder of his former
company Hotsos. He is also an Oracle ACE Director and
a founding partner of the Oak Table Network, an informal
association of well-known “Oracle scientists.” Millsap
blogs at http://carymillsap.blogspot.com and tweets at
http://twitter.com/caryMillsap.