data caches are distributed to the caching machines. The changes may be new versions made by batch updates or incremental updates (a sketch of both follows this list).
4. The front-end apps read the reference-data caches. These are gradually updated, and the users of the front end see new information.
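As a concrete illustration (not from the article, and with hypothetical names throughout), a caching machine might apply those two kinds of changes roughly as follows:

    # Sketch: how a caching machine might apply back-end changes.
    # All class and method names here are illustrative, not from the article.

    class ReferenceCache:
        def __init__(self):
            self.store = {}        # in-memory key-value data
            self.version = 0       # version of the last batch applied

        def apply_batch(self, new_store, new_version):
            """Atomically swap in a complete new version from a batch update."""
            self.store = new_store
            self.version = new_version

        def apply_increment(self, changed_items):
            """Fold a small set of incremental changes into the current version."""
            self.store.update(changed_items)

    cache = ReferenceCache()
    cache.apply_batch({"sku-1": {"price": 9.99}}, new_version=42)
    cache.apply_increment({"sku-1": {"price": 8.99}})  # front end gradually sees this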
The reference-data cache is a key-value store. One easy-to-understand model for these caches partitions and replicates the data across the cache machines. Each cache machine typically holds the data in an in-memory store (since disk access is too slow). The number of partitions increases as the size of the cached data increases. The number of replicas increases initially to ensure fault tolerance and then to support increases in read traffic from the front end.
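A minimal sketch of this model, with illustrative machine names and counts, might map a key to its partition and the partition to its replicas like this:

    import hashlib

    # Sketch of the partitioned-and-replicated cache model described above.
    # Partition and replica counts here are illustrative assumptions.

    NUM_PARTITIONS = 8           # grows with the size of the cached data
    REPLICAS_PER_PARTITION = 3   # grows to absorb front-end read rate

    def partition_of(key: str) -> int:
        """Hash the key to one of the data partitions."""
        digest = hashlib.md5(key.encode()).hexdigest()
        return int(digest, 16) % NUM_PARTITIONS

    def replicas_of(partition: int) -> list[str]:
        """Return the cache machines holding copies of this partition."""
        return [f"cache-{partition}-{r}" for r in range(REPLICAS_PER_PARTITION)]

    # A front-end read may go to any replica of the key's partition.
    machines = replicas_of(partition_of("customer:1234"))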
It is possible to support this pattern in the plumbing with a full concierge service. The plumbing on the back end can handle the partitioning for data scale (and repartitioning for growth or shrinkage). It can handle the firing up of new replicas for read-rate scale. The plumbing can also manage the distribution of the changes made by the back end (either as a batch or incrementally); this distribution understands partitioning, dynamic repartitioning, and the number of replicas dynamically assigned to partitions.
Figure 5 illustrates how the interface from the back end to the front end in a SaaS application is typically a key-value cache that is stuffed by the back end and read-only by the front end. This clear pattern allows for the creation of a concierge service in a PaaS system, which eases the implementation and deployment of these applications.
Note that this is not the only scheme for dynamic management of caches. Consistent hashing (as implemented by Dynamo,4 Cassandra,6 and Riak8) provides an excellent option when dealing with reference data. The consistency semantics of the somewhat stale reference data, which is read-only by the front end and updated by the back end, are a very good match. These systems have excellent self-managing characteristics.
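For illustration, a toy consistent-hash ring (omitting the virtual nodes, replication, and gossip that Dynamo, Cassandra, and Riak actually layer on top) shows why adding or removing a node moves only a small fraction of the keys:

    import bisect
    import hashlib

    # Minimal consistent-hash ring in the spirit of those systems.
    # Node names are hypothetical.

    def _hash(value: str) -> int:
        return int(hashlib.md5(value.encode()).hexdigest(), 16)

    class Ring:
        def __init__(self, nodes):
            # Each node owns the arc of hash space ending at its point.
            self._points = sorted((_hash(n), n) for n in nodes)

        def node_for(self, key: str) -> str:
            """Walk clockwise from the key's hash to the first node point."""
            hashes = [h for h, _ in self._points]
            i = bisect.bisect(hashes, _hash(key)) % len(self._points)
            return self._points[i][1]

    ring = Ring(["cache-a", "cache-b", "cache-c"])
    owner = ring.node_for("customer:1234")  # stays stable as nodes join or leave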
Styles of back-end processing. The back-end portion of the SaaS app may be implemented in a number of different ways, largely dependent on the scale of processing required. These include:
Relational database and normal app. In this case, the computational approach is reasonably traditional. The data is held in a relational database, and the computation is done in a tried-and-true fashion. You may see database triggers, N-tier apps, or other application forms. Typically in a cloud environment, the N-tier or other form of application will run in a VM. This can produce the reference data needed for the front end, as well as what-if business analytics. This approach has the advantage of a relational database but scales to only a few large machines.
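As an illustrative sketch of this style (the schema and data are hypothetical), a periodic batch job might run a conventional query and stuff the results into the reference-data cache:

    import sqlite3

    # Sketch of the relational style: an ordinary SQL query materializes
    # reference data for the front-end cache. Table and keys are hypothetical.

    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (sku TEXT, qty INTEGER)")
    conn.executemany("INSERT INTO orders VALUES (?, ?)",
                     [("sku-1", 3), ("sku-1", 2), ("sku-2", 5)])

    # A batch job could run this periodically and push the result to the cache.
    reference_data = {
        sku: total
        for sku, total in conn.execute(
            "SELECT sku, SUM(qty) FROM orders GROUP BY sku")
    }
    # reference_data -> {"sku-1": 5, "sku-2": 5}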
Big data and MapReduce. This approach is a set-oriented massively parallel processing solution. The underlying data is typically stored in GFS (Google File System) or HDFS (Hadoop Distributed File System), and the computation is performed by large batch jobs using MapReduce, Hadoop, or some similar technology. Increasingly, higher-level languages declaratively express the needed computation. This can be used to produce reference data and/or to perform what-if business analytics. Over time, we will see MapReduce/Hadoop over all of an enterprise's data.
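A local, single-process sketch of the map/shuffle/reduce phases clarifies the shape of such a job; Hadoop would distribute the same structure over HDFS, and the order-line records here are hypothetical:

    from collections import defaultdict

    # Set-oriented batch computation in the MapReduce style, run locally.

    def map_phase(record):
        sku, qty = record
        yield sku, qty                    # emit (key, value) pairs

    def reduce_phase(sku, quantities):
        return sku, sum(quantities)       # aggregate all values for one key

    records = [("sku-1", 3), ("sku-2", 5), ("sku-1", 2)]

    shuffled = defaultdict(list)          # shuffle: group values by key
    for record in records:
        for key, value in map_phase(record):
            shuffled[key].append(value)

    reference_data = dict(reduce_phase(k, vs) for k, vs in shuffled.items())
    # reference_data -> {"sku-1": 5, "sku-2": 5}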
Figure 6. Silos and SOA. [Figure: applications App-1 through App-10 and databases DB-A through DB-D arranged in four silos, connected by an SOA service bus providing messaging and data feeds.]