Object-Relational Mapping
approach, however, can produce anomalous application behavior, unexpected results, or outright bugs. User forums are littered with evidence of developers suffering the consequences of such failures of understanding.
Caching can be one of the most technologically advanced components of an ORM implementation, thus representing a critical balance point for any application that uses the implementation. Failure to acknowledge it as a potential fulcrum may result in an application teetering or falling on the side of poor performance and incorrect semantics. In this article, therefore, we discuss topics relevant to caching in ORM systems, and we expose some of the details that implementations must be concerned with and that application developers should be aware of.
First and foremost, developers must acknowledge the nature of objects and how they are used in object-oriented languages. In practice, an object very rarely exists in isolation from other objects. An application reference to an object is really an indirect reference to an entire graph of objects rather than to a single solitary object. The consequences of this realization are far-reaching and form the basis for many of the difficulties associated with caching in ORM.
When a read operation is performed, the runtime must take into account that the read may also fault in objects referenced by the requested object. This sequence may continue recursively, causing a multitude of objects to be read from the database, each individually requested as needed and in succession (a phenomenon dubbed ripple loading [2]). Developers can prevent this through one of several backstop measures, such as declaring, either statically or dynamically, whether specific relationships should be traversed and loaded. There are other approaches to avoiding multiple successive trips to the database, but a discussion of these is outside the scope of this article.
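The effect of ripple loading, and of declaring which relationships to traverse, can be illustrated with a minimal sketch. Everything here is hypothetical and invented for illustration: `FakeDatabase` stands in for a real database and simply counts round trips, and `fetch_joined` plays the role of a single joined query. It is not the API of any particular ORM.

```python
class FakeDatabase:
    """Hypothetical stand-in for a database that counts round trips."""

    def __init__(self, rows):
        self.rows = rows          # key -> dict of column values and foreign keys
        self.round_trips = 0

    def fetch(self, key):
        """Read one row: one round trip."""
        self.round_trips += 1
        return dict(self.rows[key])

    def fetch_joined(self, key, path):
        """Read a row plus the rows along a relationship path in one round trip."""
        self.round_trips += 1
        result = [dict(self.rows[key])]
        for relationship in path:
            key = result[-1][relationship]
            result.append(dict(self.rows[key]))
        return result


class Entity:
    def __init__(self, db, row):
        self._db = db
        self._row = row
        self._loaded = {}         # relationship name -> already loaded Entity

    def get(self, column):
        return self._row[column]

    def related(self, relationship):
        # Fault the referenced object in on first access (lazy loading).
        if relationship not in self._loaded:
            row = self._db.fetch(self._row[relationship])
            self._loaded[relationship] = Entity(self._db, row)
        return self._loaded[relationship]


def load(db, key, eager=()):
    """Load an entity; relationships named in `eager` are traversed up front."""
    rows = db.fetch_joined(key, eager) if eager else [db.fetch(key)]
    entities = [Entity(db, row) for row in rows]
    for parent, child, relationship in zip(entities, entities[1:], eager):
        parent._loaded[relationship] = child
    return entities[0]


ROWS = {
    "order:1": {"total": 40, "customer": "customer:7"},
    "customer:7": {"name": "Ada", "address": "address:3"},
    "address:3": {"city": "Ottawa"},
}


def city_for_order(db, eager=()):
    order = load(db, "order:1", eager)
    return order.related("customer").related("address").get("city")


lazy_db = FakeDatabase(ROWS)
lazy_city = city_for_order(lazy_db)                 # each hop faults in one row

eager_db = FakeDatabase(ROWS)
eager_city = city_for_order(eager_db, eager=("customer", "address"))
```

In this sketch the lazy traversal costs one round trip per object touched, while the declared traversal reads the same three rows in a single trip. The trade-off is that eager declarations can also load objects the application never uses.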
An object graph, by definition, implies that there may be multiple paths leading to the same object. In some cases these multiple relationships may be from a single object, but in most cases they are from different objects. In the course of loading the object graph, these relationships must end up pointing to the same identical object, not two distinct memory imprints that happen to have the same state. Failure to maintain object identity will lead to the persistent state of the object being duplicated in multiple instances, each one containing a point-in-time view of the entity state. This will inevitably lead to inconsistent state and incorrect program behavior.
Maintaining the identity of objects in a graph means that the loader must keep track of each object and its identity. The nature of the solution meshes neatly with the job that a cache already has to do, so it is not surprising that the task is often relegated to the cache.
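One common way to track identity while loading is a map keyed by entity type and primary key, consulted before any new instance is built. The following is a minimal sketch under that assumption; the `GraphLoader` class, the entity classes, and the in-memory row tables are all hypothetical and stand in for a real loader and database.

```python
from dataclasses import dataclass


@dataclass
class Department:
    pk: int
    name: str


@dataclass
class Employee:
    pk: int
    name: str
    department: Department


# Hypothetical table contents: two employees reference the same department.
DEPARTMENT_ROWS = {10: "Research"}
EMPLOYEE_ROWS = {1: ("Alice", 10), 2: ("Bob", 10)}


class GraphLoader:
    def __init__(self):
        # (entity type, primary key) -> the one in-memory instance
        self._identity_map = {}

    def _resolve(self, entity_type, pk, build):
        key = (entity_type, pk)
        if key not in self._identity_map:
            self._identity_map[key] = build(pk)
        return self._identity_map[key]

    def department(self, pk):
        return self._resolve(
            Department, pk, lambda k: Department(k, DEPARTMENT_ROWS[k])
        )

    def employee(self, pk):
        def build(k):
            name, department_pk = EMPLOYEE_ROWS[k]
            # The relationship is resolved through the same identity map.
            return Employee(k, name, self.department(department_pk))

        return self._resolve(Employee, pk, build)


loader = GraphLoader()
alice = loader.employee(1)
bob = loader.employee(2)

# Both paths lead to the same Department instance, so a change made
# through one path is visible through the other.
alice.department.name = "R&D"
```

Because both employees share one `Department` instance, there is no second copy holding a stale point-in-time view. This sketch does not handle cyclic relationships; a loader for cyclic graphs would need to register each instance in the map before populating its relationships.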
An application manages different visibility scopes during its execution. For single-user scopes an isolated cache is appropriate, but for global contexts a shared cache, sometimes referred to as an L2 (level 2) cache, provides the level of caching that offers the same state to all requesters. Each of these is unique to its purpose and may function or perform slightly differently from the other. There may even be duplication of state spanning the two caches, particularly in light of isolation requirements.
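The relationship between the two scopes can be sketched as a lookup that tries the isolated cache first, then the shared cache, and only then the database. The names below (`CountingStore`, `SharedCache`, `Session`) are hypothetical and chosen for illustration; writing changes back to the shared cache at commit time is deliberately left out.

```python
class CountingStore:
    """Hypothetical stand-in for the database; counts reads."""

    def __init__(self, rows):
        self.rows = rows
        self.reads = 0

    def read(self, key):
        self.reads += 1
        return dict(self.rows[key])


class SharedCache:
    """Hypothetical L2 cache: one instance shared by every session."""

    def __init__(self):
        self._state = {}

    def get(self, key):
        return self._state.get(key)

    def put(self, key, state):
        self._state[key] = dict(state)


class Session:
    """Hypothetical single-user scope with its own isolated cache."""

    def __init__(self, shared, store):
        self._shared = shared
        self._store = store
        self._local = {}

    def find(self, key):
        if key in self._local:
            return self._local[key]
        state = self._shared.get(key)
        if state is None:
            state = self._store.read(key)
            self._shared.put(key, state)
        # Keep a private copy so uncommitted edits stay invisible to
        # other sessions; this is the duplication of state across caches.
        self._local[key] = dict(state)
        return self._local[key]


store = CountingStore({"item:1": {"price": 5}})
shared = SharedCache()
first, second = Session(shared, store), Session(shared, store)

a = first.find("item:1")    # misses both caches and reads the store
b = second.find("item:1")   # served from the shared cache
a["price"] = 9              # private change, visible only in `first`
```

Here the second session is served without a database read, yet it does not see the first session's uncommitted change, at the cost of holding the same state in more than one place.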
TRANSACTIONAL CACHE
Transactions clearly play a major role in any system, including the cache. In fact, the transactional cache exists specifically for the transaction, and its inhabitants are strictly transactional objects. Being associated with the transaction implies that the cache exports the correct isolation and consistency of its objects (the “correct” isolation is described in more detail in a later section). Assumptions about the type of transaction are particularly relevant because transaction types differ. Some are thread-bound, while others allow multithreading; some are tied to a single database connection, while others may access multiple resources.
The presence of an object in the transactional cache means, by definition, that it is transactional. There is an if-and-only-if relationship between the two, such that when a transactional object is modified, its modified state must be reflected within the transactional cache. Furthermore, the state of the transactional cache represents the total change summary of the transaction from the ORM perspective and must of necessity follow the life cycle