Object-Relational Mapping
Exposing the ORM Cache
cache as an object cache, in fact implementations are
fairly diverse and vary in the way they store, access, and
update the data contained in them. These distinctions,
and the various configurations that are associated with
each of them, may have different effects on performance.
OBJECT CACHE
In an object-oriented environment, the choice of what
to cache tends toward the most intuitive format—that of
the domain object itself. This is further supported by the
realization that domain objects are what will be returned
to the user eventually, anyway, and that caching in an
intermediate form may introduce additional overhead
each time the object must be constructed.
Transactional caches have a tendency to be exclusively
object caches. Storing the objects in their native domain
form is the most efficient way for the in-transaction
operations to function, allowing for simple relationship
traversal.
The cost of caching domain objects is that the objects
must be built and preloaded with their state at the point
of reading from the database. When objects are cached,
there is typically no other kind of caching, so all retrieved
data must be stored as part of the object aggregate. The
trade-off works both ways, of course: because the objects
are prebuilt, the benefits of refreshing or returning
read-only data become more pronounced, since the cost
of rebuilding is avoided.
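As a minimal sketch of the object-cache idea described above, the following Python example keeps fully built domain objects keyed by identifier. The `Customer` class, the `ObjectCache` class, and the `load_customer` helper are hypothetical names chosen for illustration; they do not come from any particular ORM product.

```python
from dataclasses import dataclass, field


@dataclass
class Customer:
    """Hypothetical domain object with a relationship to its orders."""
    id: int
    name: str
    orders: list = field(default_factory=list)


class ObjectCache:
    """Hypothetical cache that keeps fully built domain objects by primary key."""

    def __init__(self):
        self._objects = {}

    def put(self, entity):
        self._objects[entity.id] = entity

    def get(self, entity_id):
        # A hit returns the prebuilt object itself; nothing is reconstructed.
        return self._objects.get(entity_id)


def load_customer(row, order_rows):
    # The build cost is paid once, at the point of reading from the database.
    customer = Customer(id=row["id"], name=row["name"])
    customer.orders = [dict(r) for r in order_rows]
    return customer


cache = ObjectCache()
cache.put(load_customer({"id": 1, "name": "Ada"}, [{"order_id": 10, "total": 25.0}]))

first = cache.get(1)
second = cache.get(1)
```

Both lookups return the same prebuilt instance, so relationship traversal (`first.orders`) needs no further construction work.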
DATA CACHE
If object caching is at one end of the caching spectrum,
then data caching is at the other. Caching at the data
level means simply that the raw compositional state of
each object is stored separately in the cache without an
encapsulating object. Simple data fragments are easily
manipulated and stored, with little or no accrued costs
owing to object management and relationships.
An advantage of caching data in its raw or primitive form
is that it is closer to the kind of data that is transferred
to and from the basic database connectivity layer. This
provides a simpler interface for exchange and makes the
cache more pluggable.
The performance cost of caching data is that every
successful request requires at least one object
construction, and usually more. The newly constructed
objects are then hydrated from the cached data and
returned to the ORM manager.
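For contrast with an object cache, a data cache might look roughly like the sketch below. It stores only raw field values and builds a new object on every hit. The `DataCache` and `Customer` names and the `factory` parameter are assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class Customer:
    """Hypothetical domain object used only for illustration."""
    id: int
    name: str


class DataCache:
    """Hypothetical cache of raw field values, with no encapsulating object."""

    def __init__(self):
        self._rows = {}

    def put(self, entity_id, row):
        # Copy the row so the cache holds plain data, independent of callers.
        self._rows[entity_id] = dict(row)

    def get(self, entity_id, factory):
        row = self._rows.get(entity_id)
        if row is None:
            return None
        # Every hit constructs a new object and hydrates it from cached data.
        return factory(**row)


cache = DataCache()
cache.put(1, {"id": 1, "name": "Ada"})

a = cache.get(1, Customer)
b = cache.get(1, Customer)
```

The two results are equal in value but are distinct instances, which is the per-request construction cost the text refers to.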
QUERIES AND CACHING
The primary motive for ORM caching is to increase performance by serving data locally rather than
making a database round trip to retrieve it. The initial
operation is always going to be the execution of a find or
query call to obtain the entity or set of resulting objects;
thus, caching and the queries that request objects are
closely connected.
An ORM product is presumed to be on fairly familiar
terms with the database it communicates with. The ORM
system is not, itself, a database, however, and is not normally expected to perform queries in memory, although
some do indeed support a subset of that functionality
(sometimes referred to as in-memory querying). If the
query criteria are based upon one or more primary-key
values, or the keys against which the cached entities are
stored, then the query can be satisfied by the in-memory
cache. This is the optimal query-processing scenario since
it avoids having to make a database round trip.
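The key-based case can be sketched as follows. `CountingDatabase` is a hypothetical stand-in for a real database that simply counts round trips, and `EntityManager` is an assumed name for the ORM front end; neither reflects a specific product's API.

```python
class CountingDatabase:
    """Hypothetical stand-in for a database that counts round trips."""

    def __init__(self, rows):
        self._rows = rows
        self.round_trips = 0

    def fetch_by_id(self, entity_id):
        self.round_trips += 1
        return self._rows.get(entity_id)


class EntityManager:
    """Hypothetical ORM front end that consults its cache before the database."""

    def __init__(self, database):
        self._database = database
        self._cache = {}  # primary key -> entity

    def find(self, entity_id):
        entity = self._cache.get(entity_id)
        if entity is not None:
            return entity  # satisfied in memory, no round trip
        entity = self._database.fetch_by_id(entity_id)
        if entity is not None:
            self._cache[entity_id] = entity
        return entity


db = CountingDatabase({1: {"id": 1, "name": "Ada"}})
manager = EntityManager(db)

manager.find(1)  # miss: one round trip, then cached
manager.find(1)  # hit: served from the cache
trips = db.round_trips
```

Only the first `find` call reaches the database; the second is answered entirely from memory.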
If the search criteria rely upon non-key fields, then
normally the query must be executed against the database
to obtain the set of result identifiers. That set can then be
used to obtain the set of entities from the cache.
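One way to picture the non-key case is the sketch below, which uses Python's built-in `sqlite3` module with an in-memory database. The `customer` table, its columns, and the `find_by_city` function are made up for the example. The database evaluates the non-key criterion and returns identifiers only, and the cache then supplies whichever entities it already holds.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO customer VALUES (?, ?, ?)",
    [(1, "Ada", "London"), (2, "Grace", "New York"), (3, "Alan", "London")],
)

# Entity cache keyed by primary key; only customer 1 is resident to begin with.
cache = {1: {"id": 1, "name": "Ada", "city": "London"}}


def find_by_city(city):
    # Step 1: the non-key criterion is evaluated by the database, ids only.
    ids = [row[0] for row in conn.execute(
        "SELECT id FROM customer WHERE city = ? ORDER BY id", (city,))]
    results = []
    for entity_id in ids:
        entity = cache.get(entity_id)
        if entity is None:
            # Cache miss: an additional trip is needed for the entity data.
            row = conn.execute(
                "SELECT id, name, city FROM customer WHERE id = ?", (entity_id,)
            ).fetchone()
            entity = {"id": row[0], "name": row[1], "city": row[2]}
            cache[entity_id] = entity
        results.append(entity)
    return results


londoners = find_by_city("London")
```

Customer 1 is returned straight from the cache, while customer 3 costs an extra trip because it was not cache-resident.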
The trade-offs can be more clearly evaluated at this
point. The data obtained from the database can be the
complete set of entity data, or it can be just the identifiers. On the one hand, if the entity for a given identifier
turns out not to be cache-resident, then an additional trip
to the database must be undertaken to obtain the missing entity data. On the other hand, if the entity data is
pulled from the database and the entity did in fact reside
in cache, then the carrying cost of the retrieved data was
apparently wasted. The cost is not entirely wasted,
however: even if the entity was cached, its contents could
have become stale since it was loaded. In that case the
returned data can be used to refresh the cached copy with
the fresh data from the database.
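The alternative strategy, retrieving complete entity data and using it to refresh the cache, might be sketched like this. As before, the `customer` table and the `find_by_city_full` function are hypothetical, and the in-memory `sqlite3` database stands in for a real one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO customer VALUES (?, ?, ?)",
    [(1, "Ada", "Paris"), (2, "Alan", "Paris")],
)

# Customer 1 is cached, but its city is stale relative to the database.
cache = {1: {"id": 1, "name": "Ada", "city": "London"}}


def find_by_city_full(city):
    """Fetch complete rows in one trip and refresh or populate the cache."""
    rows = conn.execute(
        "SELECT id, name, city FROM customer WHERE city = ? ORDER BY id", (city,)
    ).fetchall()
    results = []
    for entity_id, name, row_city in rows:
        fresh = {"id": entity_id, "name": name, "city": row_city}
        cached = cache.get(entity_id)
        if cached is None:
            cache[entity_id] = fresh  # miss: no second trip required
            cached = fresh
        else:
            cached.update(fresh)  # hit: returned data refreshes the copy
        results.append(cached)
    return results


stale_before = cache[1]["city"]
parisians = find_by_city_full("Paris")
```

A single query both fills the miss for customer 2 and corrects the stale city held for customer 1, at the price of transferring full rows even for entities that were already cached.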
There is an additional mitigating factor to retrieving
the entity state from the database: if the record is not
large, then the cost of getting the entire record is only