tion and flow-control services provided
by these products, message-translation
services—which constitute another
form of information integration—are
also needed.
A typical message-translation scenario in e-commerce enables a small
vendor (say, Pico) to offer its products
through a large retail Web site (Goliath). When a customer buys one of Pico’s products from Goliath, Goliath sends an
order message to Pico, which then has
to translate that message into the format required by its order-processing
system. A message-mapping tool can
help Pico meet this challenge. Such a
tool offers a graphical interface to define translation functions, which are
then compiled into a program to perform the message translation. Similar
mapping tools are used to help relate
the schemas of the source databases
to the target schema for ETL and EII
and to generate the programs needed
for data translation.
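As a rough illustration of what such a compiled translation program might do, the sketch below maps an incoming order message onto a vendor's order format. The field names on both sides (orderId, order_no, and so on) and the function translate_order are hypothetical, invented for this example; real mapping tools generate considerably richer code.

```python
# Hypothetical field correspondences: retailer field -> vendor field.
# Neither message format is taken from the article.
FIELD_MAP = {
    "orderId": "order_no",
    "sku": "product_code",
    "qty": "quantity",
    "shipTo": "delivery_address",
}


def translate_order(goliath_msg: dict) -> dict:
    """Translate a retailer order message into the vendor's format."""
    missing = [src for src in FIELD_MAP if src not in goliath_msg]
    if missing:
        raise KeyError(f"order message is missing fields: {missing}")
    # Structural translation: rename each field according to the mapping.
    pico_msg = {dst: goliath_msg[src] for src, dst in FIELD_MAP.items()}
    # Value-level translation: assume the retailer sends quantity as text.
    pico_msg["quantity"] = int(pico_msg["quantity"])
    return pico_msg


order = {"orderId": "G-1001", "sku": "PICO-7", "qty": "3", "shipTo": "12 Elm St"}
translated = translate_order(order)
```

Keeping the field correspondences in a table separate from the code that applies them loosely mirrors the split between the mapping a user defines graphically and the program generated from it.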
Object-to-Relational Mappers.
Application programs today are typically
written in an object-oriented language,
but the data they access is usually
stored in a relational database. While
mapping applications to databases
requires integration of the relational
and application schemas, differences
in schema constructs can make the
mapping rather complicated. For example, there are many ways to map
classes that are related by inheritance
into relational tables. To simplify the
problem, an object-to-relational mapper offers a high-level language in
which to define mappings. 23 The resulting mappings are then compiled
into programs that translate queries
and updates over the object-oriented
interface into queries and updates on
the relational database.
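To make the inheritance example concrete, the following sketch shows two common ways of laying out the same class hierarchy in tables. The classes Person and Employee, the table and column names, and both helper functions are assumptions made for illustration; they do not reflect the mapping language of any particular product.

```python
from dataclasses import dataclass


@dataclass
class Person:
    id: int
    name: str


@dataclass
class Employee(Person):
    salary: float


def to_single_table(obj: Person) -> dict:
    """Strategy 1: one table for the whole hierarchy, with a type
    discriminator column and None (NULL) for attributes a class lacks."""
    return {
        "person": [{
            "id": obj.id,
            "name": obj.name,
            "kind": type(obj).__name__,
            "salary": getattr(obj, "salary", None),
        }]
    }


def to_table_per_class(obj: Person) -> dict:
    """Strategy 2: one table per class; subclass rows share the key."""
    rows = {"person": [{"id": obj.id, "name": obj.name}]}
    if isinstance(obj, Employee):
        rows["employee"] = [{"id": obj.id, "salary": obj.salary}]
    return rows


alice = Employee(id=1, name="Alice", salary=5000.0)
single = to_single_table(alice)
split = to_table_per_class(alice)
```

The first layout avoids joins but leaves empty columns for base-class objects; the second avoids empty columns but needs a join to reassemble an Employee. Trade-offs of this kind are one reason the mapping can become complicated.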
Document Management. Much of the
information in an enterprise is contained in documents, such as text files,
spreadsheets, and slide shows that
contain interrelated information relevant to critical business functions—
product designs, marketing plans,
pricing, and development schedules,
for example. To promote collaboration and avoid duplicated work in a
large organization, this information
needs to be integrated and published.
Integration may simply involve making the documents available on a single Web page (such as a portal) or in
a content-management system, possibly augmented with per-document
annotations (on author and status, for
example). Or integration may mean
combining information from these
documents into a new document, such
as a financial analysis.
Whether or not the documents are
collected in one store, they can be indexed to enable keyword search across
the enterprise. In some applications,
it is useful to extract structured information from documents, such as customer name and address from email
messages received by the customer-support team. The ability to extract
structured information of this kind
may also allow businesses to integrate
unstructured documents with preexisting structured data. In the example
above, the auto manufacturer wanted
to link transactional information
about purchases with emails about
these purchases in order to enable better analysis of problem reports.
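A minimal sketch of this kind of extraction, assuming (unrealistically) that the email labels the fields of interest with "Name:" and "Address:" prefixes. The sample message, the pattern, and the function extract_customer are all hypothetical; production extractors must cope with far less regular text.

```python
import re

# Hypothetical customer-support email with labeled fields.
EMAIL_BODY = """\
Hello support,
My new car's brakes squeal.
Name: Jane Doe
Address: 42 Oak Avenue, Springfield
Thanks
"""

# Match lines of the form "Label: value" for the labels we care about.
FIELD_PATTERN = re.compile(r"^(Name|Address):\s*(.+?)\s*$", re.MULTILINE)


def extract_customer(text: str) -> dict:
    """Return the labeled fields found in the text as a structured record."""
    return {label.lower(): value for label, value in FIELD_PATTERN.findall(text)}


record = extract_customer(EMAIL_BODY)
```

Once the fields are available as a record, they can be matched against preexisting structured data, for instance by looking up the customer's purchases by name and address.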
Portal Management. One way to integrate related information is simply
to present it all, side-by-side, on the
same screen. A portal is an entire Web
site built with this type of integration
in mind. For example, the home page
of a financial services Web site typically presents market prices, business
news, and analyses of recent trends.
The person viewing it does the actual
integration of the information.
Portal design requires a mixture
of content management (to deal with
documents and databases) and user-interaction technology (to present the
information in useful and attractive
ways). Sometimes these technologies
are packaged together into a product
for portal design. 11 But often they are
selected piecemeal, based on the required functionality of the portal and
the taste and experience of the developers who assemble it.
Figure 2. Screenshot of a mapping tool.
Core Technologies
Extensible Markup Language (XML). In
any of the scenarios noted here, an
integrated view of data from multiple
sources must be created. Often any
one of the sources will be incomplete
with respect to that view, with each
source missing some information that
the others provide. In our example, the
emails are unlikely to provide detailed
information about the dealerships,
while the relational data might not
have the problem reports. In XML, a
semi-structured format, each data element is tagged so that only elements
whose values are known need to be
included. This ability to handle variations in information content is driving
EII systems to experiment with XML. 22
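The point about tagging can be sketched with Python's standard xml.etree.ElementTree module. The element names (purchase, vin, dealership, problem) and the two sample records are assumptions chosen to echo the auto manufacturer example, not a schema from the article.

```python
import xml.etree.ElementTree as ET


def to_xml(record: dict) -> ET.Element:
    """Build a <purchase> element, emitting only fields with known values."""
    elem = ET.Element("purchase")
    for field, value in record.items():
        if value is not None:
            ET.SubElement(elem, field).text = str(value)
    return elem


# Each source knows only part of the integrated view.
from_database = {"vin": "VIN123", "dealership": "Main St Motors", "problem": None}
from_email = {"vin": "VIN123", "dealership": None, "problem": "brakes squeal"}

db_xml = ET.tostring(to_xml(from_database), encoding="unicode")
email_xml = ET.tostring(to_xml(from_email), encoding="unicode")
```

Neither document needs a placeholder for the information its source lacks: the database record simply omits the problem element and the email record omits the dealership element.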
This flexibility makes XML an interesting format for integrating information across systems with differing representations of data. In some
integration scenarios, it may not be