fractionally greater than that of retrieving just the identifier once all the database overhead of record location is
calculated.
CACHE REFERENCES AND EVICTION
Developers working in Java and other object-oriented languages
are very familiar with the way that garbage collectors
work in the virtual machine. Objects that are no longer
referenced by live objects—those associated with an active
execution context—become garbage, and the memory
they occupy is reclaimed for reuse.
One of the primary duties of a shared cache is to hold
on to state that is no longer referenced by live objects,
thereby preventing it from being garbage-collected. In
other cases, the cache should be configured to let go of
objects that are no longer needed. Ideally, a cache would
know exactly when an object will no longer be needed
or if it will be accessed in the near term and should be
kept around. Unfortunately, a cache cannot be expected
to predict the future, and it falls to the user to configure
how the cache references objects based on what the user
knows about the access patterns of the application. Adaptive strategies do exist where caches attempt to be
“intelligent” and adapt the caching strategy based on previously
observed access patterns, but these strategies are beyond
the scope of this article.
The way that a cache references its cached state is
typically highly configurable. The parameters are based
on the conventional memory-management concepts
of soft and weak referencing. (We are discussing traditional ORM, not realtime systems that must impose strict
control over the number of instances and garbage-collection periods that occur.) Recall that weak references
are those that point to objects that the garbage collector
may reclaim if no other regular or hard references are
pointing to them. Soft references are those that point to
objects that can be reclaimed if the virtual machine really
needs more heapspace (and there are no hard references
to the objects). Combining the two reference types in the
same cache and migrating references from one type to
the other can offer a dynamic balance that adjusts to the
needs of both the application and the virtual machine,
but gives preferential treatment to the application.
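The two reference types can be illustrated with a minimal Java sketch. The class name ReferenceCache and its methods are hypothetical and are not taken from any particular ORM product; the sketch only assumes the standard java.lang.ref classes. Each entry is held through either a weak or a soft reference, and an entry can be migrated from one type to the other while its value is still alive.

```java
import java.lang.ref.Reference;
import java.lang.ref.SoftReference;
import java.lang.ref.WeakReference;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical cache whose entries are held through soft or weak references. */
final class ReferenceCache<K, V> {
    private final Map<K, Reference<V>> entries = new ConcurrentHashMap<>();

    /** Weakly referenced: reclaimable as soon as no hard reference remains. */
    void putWeak(K key, V value) {
        entries.put(key, new WeakReference<>(value));
    }

    /** Softly referenced: reclaimable only when the VM needs more heap space. */
    void putSoft(K key, V value) {
        entries.put(key, new SoftReference<>(value));
    }

    /** Returns the cached value, or null if absent or already reclaimed. */
    V get(K key) {
        Reference<V> ref = entries.get(key);
        if (ref == null) {
            return null;
        }
        V value = ref.get();
        if (value == null) {
            entries.remove(key, ref); // the collector cleared this reference
        }
        return value;
    }

    /** Migrates an entry to a soft reference; false if the value is gone. */
    boolean promote(K key) {
        V value = get(key);
        if (value == null) {
            return false;
        }
        entries.put(key, new SoftReference<>(value));
        return true;
    }

    /** Migrates an entry to a weak reference; false if the value is gone. */
    boolean demote(K key) {
        V value = get(key);
        if (value == null) {
            return false;
        }
        entries.put(key, new WeakReference<>(value));
        return true;
    }

    boolean isSoft(K key) {
        return entries.get(key) instanceof SoftReference;
    }
}
```

Callers must always be prepared for get to return null, since the garbage collector decides when a weakly or softly referenced value disappears.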
Cache eviction policies also vary, with options that
include time-to-live settings that cause objects to be
evicted after a specific period of time, schedules that trigger eviction at a specific day or time of day, and freshness guarantees that keep track of when objects were last
accessed and evict them if the time between accesses was
too great.
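A time-to-live setting and a freshness check based on last access could be sketched roughly as follows. ExpiringCache, ttlMillis, and maxIdleMillis are illustrative names, not the configuration parameters of any specific product, and the clock is injected as a LongSupplier only to make the behavior easy to exercise. Expiry is checked lazily on access, which is one of several possible designs.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.LongSupplier;

/** Hypothetical cache combining time-to-live and maximum-idle-time eviction. */
final class ExpiringCache<K, V> {
    private static final class Entry<V> {
        final V value;
        final long createdAt;
        volatile long lastAccess;

        Entry(V value, long now) {
            this.value = value;
            this.createdAt = now;
            this.lastAccess = now;
        }
    }

    private final Map<K, Entry<V>> entries = new ConcurrentHashMap<>();
    private final long ttlMillis;     // maximum age since insertion
    private final long maxIdleMillis; // maximum time allowed between accesses
    private final LongSupplier clock; // supplies the current time in milliseconds

    ExpiringCache(long ttlMillis, long maxIdleMillis, LongSupplier clock) {
        this.ttlMillis = ttlMillis;
        this.maxIdleMillis = maxIdleMillis;
        this.clock = clock;
    }

    void put(K key, V value) {
        entries.put(key, new Entry<>(value, clock.getAsLong()));
    }

    /** Returns the value, or null if it is absent or has been evicted. */
    V get(K key) {
        Entry<V> entry = entries.get(key);
        if (entry == null) {
            return null;
        }
        long now = clock.getAsLong();
        boolean tooOld = now - entry.createdAt >= ttlMillis;
        boolean tooIdle = now - entry.lastAccess >= maxIdleMillis;
        if (tooOld || tooIdle) {
            entries.remove(key, entry); // evict lazily, at access time
            return null;
        }
        entry.lastAccess = now;
        return entry.value;
    }
}
```

In production code the clock would typically be System::currentTimeMillis, and a background sweep might be added so that entries that are never read again do not linger in the map.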
A sample cache reference configuration with a scheduled eviction policy is shown in figure 2. In this example,
a portion of the L2 cache is reserved for softly referencing
objects, leaving the rest for weak referencing. The most
commonly accessed weakly referenced objects will be
tenured and softly referenced.
The requirements of the application determine the size
of the soft component. An appropriate balance will keep
the objects that are used frequently but not always hard
referenced in the soft part of the cache, without allocating an excess amount of space for unreferenced objects.
The trade-off is that the cache will never cause the VM to
run out of memory, but if the application spends too much
time close to the heap limit, then softly referenced entries may be repeatedly discarded and reloaded.
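One way to picture a fixed-size soft component backed by a weak component is the sketch below. It is an assumption-laden illustration, not the mechanism of any particular product: TieredCache, softCapacity, and promoteAfterHits are invented names, promotion is triggered by a simple hit count, and the least recently used soft entry is demoted when the soft part is full.

```java
import java.lang.ref.SoftReference;
import java.lang.ref.WeakReference;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical two-part cache: a bounded soft part and an unbounded weak part. */
final class TieredCache<K, V> {
    private final int softCapacity;     // size of the soft component (assumed setting)
    private final int promoteAfterHits; // hits needed before an entry is tenured
    private final Map<K, WeakReference<V>> weak = new HashMap<>();
    private final Map<K, Integer> hits = new HashMap<>();
    private final LinkedHashMap<K, SoftReference<V>> soft;

    TieredCache(int softCapacity, int promoteAfterHits) {
        this.softCapacity = softCapacity;
        this.promoteAfterHits = promoteAfterHits;
        // Access-ordered map: the eldest entry is the least recently used one.
        this.soft = new LinkedHashMap<K, SoftReference<V>>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, SoftReference<V>> eldest) {
                if (size() <= TieredCache.this.softCapacity) {
                    return false;
                }
                V value = eldest.getValue().get();
                if (value != null) {
                    // Demote rather than drop: keep it weakly referenced.
                    weak.put(eldest.getKey(), new WeakReference<>(value));
                }
                return true;
            }
        };
    }

    /** New entries start in the weak part. */
    synchronized void put(K key, V value) {
        soft.remove(key);
        weak.put(key, new WeakReference<>(value));
        hits.put(key, 0);
    }

    synchronized V get(K key) {
        SoftReference<V> softRef = soft.get(key);
        if (softRef != null) {
            V value = softRef.get();
            if (value == null) {
                soft.remove(key); // cleared under memory pressure
            }
            return value;
        }
        WeakReference<V> weakRef = weak.get(key);
        V value = weakRef == null ? null : weakRef.get();
        if (value == null) {
            weak.remove(key);
            hits.remove(key);
            return null;
        }
        int count = hits.merge(key, 1, Integer::sum);
        if (count >= promoteAfterHits) {
            // Tenure a commonly accessed entry into the soft part.
            weak.remove(key);
            hits.remove(key);
            soft.put(key, new SoftReference<>(value));
        }
        return value;
    }

    synchronized boolean isSoft(K key) {
        return soft.containsKey(key);
    }
}
```

Choosing softCapacity too large reserves heap for objects that nothing references; choosing it too small means frequently used objects fall back to weak referencing and are reclaimed early.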
By way of eviction policy, in the example in figure 2,
all instances of a particular domain class are scheduled to
be evicted each day at 3 a.m. This would allow the results
of an overnight batch-update process to be visible the following day, regardless of cache contents and usage.
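A scheduled eviction of this kind might be wired up along the following lines. This is only a sketch under stated assumptions: ScheduledEviction and its per-class regions are hypothetical, and a real ORM product would normally expose the schedule through configuration rather than code. The sketch uses the standard ScheduledExecutorService and java.time classes.

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.time.LocalTime;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

/** Hypothetical cache with per-class regions and a daily scheduled eviction. */
final class ScheduledEviction {
    // Cached instances keyed by domain class, then by identifier.
    private final Map<Class<?>, Map<Object, Object>> regions = new ConcurrentHashMap<>();

    void put(Class<?> type, Object id, Object instance) {
        regions.computeIfAbsent(type, t -> new ConcurrentHashMap<>()).put(id, instance);
    }

    Object get(Class<?> type, Object id) {
        Map<Object, Object> region = regions.get(type);
        return region == null ? null : region.get(id);
    }

    /** Evicts every cached instance of the given domain class. */
    void evictAll(Class<?> type) {
        regions.remove(type);
    }

    /** Time from 'now' until the next occurrence of 'timeOfDay'. */
    static Duration delayUntilNext(LocalDateTime now, LocalTime timeOfDay) {
        LocalDateTime next = now.toLocalDate().atTime(timeOfDay);
        if (!next.isAfter(now)) {
            next = next.plusDays(1); // today's slot has passed; use tomorrow's
        }
        return Duration.between(now, next);
    }

    /**
     * Schedules evictAll(type) once a day at the given local time.
     * The fixed 24-hour period ignores daylight-saving shifts (a simplification).
     * The caller is responsible for shutting the returned scheduler down.
     */
    ScheduledExecutorService scheduleDailyEviction(Class<?> type, LocalTime timeOfDay) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        long initialDelay = delayUntilNext(LocalDateTime.now(), timeOfDay).toMillis();
        scheduler.scheduleAtFixedRate(
                () -> evictAll(type),
                initialDelay,
                TimeUnit.DAYS.toMillis(1),
                TimeUnit.MILLISECONDS);
        return scheduler;
    }
}
```

For the 3 a.m. policy described above, the call would be scheduleDailyEviction(SomeDomainClass.class, LocalTime.of(3, 0)), where SomeDomainClass stands in for whichever domain class the batch process updates.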
CLUSTERED CACHES
Scaling a successful ORM-based application can be significantly more difficult than its initial development, because
frequently the application has not been architected a
priori to accommodate future scaling. It is usually a myth
that a functioning ORM application, running on a single
server, can be scaled up unchanged by simply procuring an entire cluster of servers and running on that. In a
typical ORM application the cache may be an important
reason for good application performance. When there is
the possibility of other processes updating the underlying
database, then the individual process caches must be considered, and the overall health of the combined clustered
caches must be taken into account.
The problem is that the likelihood of stale-data syn-
[Figure 2: Cache Configuration with Eviction Policy. The cache is divided into a soft-references portion and a weak-references portion; cached instances of a given type are evicted each morning at 3:00 a.m.]