the factor that you can use to model the imperfection [4]. It is the duration that a task spends communicating and coordinating access to a shared resource.
Like response time, service time, and
queuing delay, coherency delay is measured in time per task execution, as in
seconds per click.
I will not describe a mathematical
model for predicting coherency delay,
but the good news is that if you profile
your software task executions, you’ll
see it when it occurs. In Oracle, timed
events such as the following are examples of coherency delay:
˲ enqueue
˲ buffer busy waits
˲ latch free
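One way to see such events in practice is to query Oracle's v$system_event view, which accumulates system-wide wait totals. The following is only a minimal sketch, not a complete task-level profiling method; column names, units, and the event names themselves vary by Oracle release:

    -- List accumulated waits for the coherency-delay events named above.
    -- time_waited is reported in centiseconds in most releases; newer
    -- releases break these events into finer-grained names.
    SELECT event, total_waits, time_waited
      FROM v$system_event
     WHERE event IN ('enqueue', 'buffer busy waits', 'latch free')
     ORDER BY time_waited DESC;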
You cannot model such coherency delays with M/M/m. That is because M/M/m assumes that all m of your service channels are parallel, homogeneous, and independent. That means the model assumes that after you wait politely in a FIFO queue for long enough that all the requests that enqueued ahead of you have exited the queue for service, it will be your turn to be serviced. Coherency delays don't work like that, however.
Imagine an HTML data-entry form
in which one button labeled “Update”
executes a SQL update statement, and
another button labeled “Save” executes
a SQL commit statement. An application built like this would almost guarantee abysmal performance. That is
because the design makes it possible—
quite likely, actually—for a user to click
Update, look at his calendar, realize
“uh-oh, I’m late for lunch,” and then go
to lunch for two hours before clicking
Save later that afternoon.
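In SQL terms, the two buttons boil down to something like the following sketch (the table and column names are hypothetical, invented only to illustrate the anti-pattern):

    -- "Update" button: the row lock is acquired here and held by the
    -- open transaction.
    UPDATE orders
       SET status = 'SHIPPED'
     WHERE order_id = 42;

    -- ...a two-hour lunch passes; the transaction stays open and the
    -- row stays locked...

    -- "Save" button, clicked that afternoon: only now is the lock released.
    COMMIT;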
The impact on other tasks on this system that wanted to update the same row would be devastating. Each task would necessarily wait for a lock on the row (or, on some systems, worse: a lock on the row's page) until the locking user decided to go ahead and click Save—or until a database administrator killed the user's session, which of course would have unsavory side effects for the person who thought he had updated a row.
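Continuing the same hypothetical sketch, this is all a second session can do in the meantime:

    -- A second session that tries to update the same row simply blocks
    -- here, waiting on the row lock, until the first session commits,
    -- rolls back, or is killed.
    UPDATE orders
       SET status = 'CANCELLED'
     WHERE order_id = 42;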
In this case, the amount of time a
task would wait on the lock to be released has nothing to do with how busy
the system is. It would be dependent
upon random factors that exist outside
of the system's various resource utilizations. That is why you cannot model this kind of thing in M/M/m, and it is why you can never assume that a performance test executed in a unit-testing type of environment is sufficient for making a go/no-go decision about inserting new code into a production system.
Performance Testing
All this talk of queuing delays and coherency delays leads to a very difficult
question: How can you possibly test a
new application enough to be confident
that you are not going to wreck your
production implementation with performance problems?
You can model. And you can test [1].
Nothing you do will be perfect, however.
It is extremely difficult to create models
and tests in which you’ll foresee all your
production problems in advance of actually encountering those problems in
production.
Some people allow the apparent futility of this observation to justify not
testing at all. Do not get trapped in that
mentality. The following points are
certain:
˲ You will catch a lot more problems if you try to catch them prior to production than if you do not even try.
˲ You will never catch all your problems in preproduction testing. That is why you need a reliable and efficient method for solving the problems that leak through your preproduction testing processes.
Somewhere in the middle between
“no testing” and “complete production emulation” is the right amount
of testing. The right amount of testing
for aircraft manufacturers is probably
more than the right amount of testing
for companies that sell baseball caps.
But don't skip performance testing altogether. At the very least, your performance-test plan will make you a more
competent diagnostician (and clearer
thinker) when the time comes to fix the
performance problems that will inevitably occur during production operation.
Measuring. People feel throughput and response time. Throughput is usually easy to measure; response time is much more difficult. (Remember, throughput and response time are not reciprocals.)
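A quick worked example (with invented numbers) illustrates the point. Suppose a system completes 1,000 task executions during a 10-second measurement window, while each individual execution experiences 2 seconds of response time:

    throughput        = 1,000 tasks / 10 s  = 100 tasks per second
    1 / response time = 1 / (2 s per task)  = 0.5 tasks per second

Because many tasks are in flight concurrently, the two figures are not reciprocals of each other.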
It may not be difficult to time an end-user action with a stopwatch, but it might