privacy for databases, and much of it applies to data services. These topics are
simply too large4 to cover in the scope of
this article.
Emerging Trends
In this article we have taken a broad
look at work in the area of data services. We looked first at the enterprise,
where we saw how data services can
provide a data-oriented encapsulation of data as services in enterprise
IT settings. We examined concepts, issues, and example products related both to
service-enabling single data sources and
to creating services that provide an integrated, service-oriented view of data drawn from multiple enterprise data sources. Given
that clouds are rapidly forming on the
IT horizon, both for Web companies
and for traditional enterprises, we also
looked at the emerging classes of data
services that are being offered for data
management in the cloud. As the latter
mature, we expect to see a convergence
of everything that we have looked at, as
it seems likely that rich data services of
the future will often be fronting data
residing in one or more data sources in
the cloud.
To wrap up, we briefly list a handful
of emerging trends that may
direct future data services research and
development. Some of the trends listed
stem from existing problems, while
others are more predictive in nature.
We chose this list, which is necessarily
incomplete, based on the evolution of
data services we have witnessed while
slowly authoring this report over the
last two years. Again, while data services were initially conceived to solve
problems in the enterprise world, the
cloud is now making data services accessible to a much broader range of
consumers; new issues will surely arise
as a result.
Query formulation tools. Service-enabled data sources sometimes support
(or permit) only a restricted set of queries against their schemas. Users trying
to formulate a query over multiple such
sources can have difficulty determining how to compose such data services.
For a schema-based external model,
recent work has proposed tools, for example CLIDE,41 that help users author only answerable queries.
These tools utilize
the schemas and restrictions of the
service-enabled data sources as a basis for query formulation, rather than
just the externally visible data service
metadata, to guide the users toward
formulating answerable queries. More
work is needed here to handle broader
classes of queries.
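To make the notion of an answerable query concrete, here is a minimal sketch of the kind of check a query formulation tool might run. It is not CLIDE's algorithm. The restriction model (each source requires certain input attributes to be bound) and the names `SourceRestriction`, `unanswerable_sources`, `orders`, and `shipments` are assumptions made purely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SourceRestriction:
    """Hypothetical restriction: a source accepts a query only if every
    attribute in required_inputs is bound by the query."""
    name: str
    required_inputs: frozenset


def unanswerable_sources(bound_attrs, restrictions):
    """Return the names of sources whose required inputs are not all bound."""
    bound = set(bound_attrs)
    return [r.name for r in restrictions if not r.required_inputs <= bound]


# Illustrative restrictions, not taken from any real product.
restrictions = [
    SourceRestriction("orders", frozenset({"cid"})),
    SourceRestriction("shipments", frozenset({"order_id"})),
]

# A query that binds only cid cannot be sent to the shipments source as is.
print(unanswerable_sources({"cid"}, restrictions))
```

A tool could use such feedback to steer the user toward binding the missing attributes before the query is submitted.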
Data service query optimization. In
the case of integrated data services
with a functional external model, one
could imagine defining a set of semantic equivalence rules that would
allow a query processor to replace
a data service call used in a query with
another service call in order to reduce the query execution time, thus
enabling semantic data service optimization. For example, the following
equivalence rule captures the semantic equivalence of the data services
getOrderHistory and getOpenOrders when a [status = ‘open’]
condition is applied to the former:
getOrderHistory(cid) [status = ‘open’] ≡ getOpenOrders(cid)
Work is needed here to help data service
architects to specify such rules and their
associated trade-offs “easily” and to
teach query optimizers to exploit them.
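As a rough illustration of how a query processor might apply such a rule, the sketch below encodes the getOrderHistory/getOpenOrders equivalence as a rewrite. The classes `ServiceQuery` and `EquivalenceRule` are hypothetical, filters are limited to simple equality pairs, and the rule is applied unconditionally; a real optimizer would presumably compare the estimated costs of both forms first.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServiceQuery:
    """A data service call plus a set of (field, value) equality filters."""
    service: str
    args: tuple
    filters: frozenset = frozenset()


@dataclass(frozen=True)
class EquivalenceRule:
    """source_service(args)[required_filter] is equivalent to target_service(args)."""
    source_service: str
    required_filter: tuple
    target_service: str

    def apply(self, query):
        """Rewrite the query if the rule matches; otherwise return it unchanged."""
        if (query.service == self.source_service
                and self.required_filter in query.filters):
            remaining = query.filters - {self.required_filter}
            return ServiceQuery(self.target_service, query.args, remaining)
        return query


rule = EquivalenceRule("getOrderHistory", ("status", "open"), "getOpenOrders")
query = ServiceQuery("getOrderHistory", ("c42",), frozenset({("status", "open")}))
print(rule.apply(query))
```

Any filters other than the one named in the rule are carried over to the rewritten call, so the rewrite preserves the rest of the query.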
Very large functional models. For
data services using a functional external model, if the number of functions
is very large, it is difficult or even impossible for the data owner to explicitly enumerate all functions and for
the query developer to have a global
picture of them. Consider the example
of a data owner, who, for performance
reasons, only wants to allow queries
that use some non-empty subset of a
set of n filter predicates. Enumerating
all the 2^n – 1 combinations as functions
would be tedious and impractical for
large n. Recent work has studied how
models consisting of such large collections of functions, where the function bodies are defined by XPath queries, can be compactly specified using
a grammar-like formalism40 and how
queries over the output schema of
such a service can be answered using
the model.17 More work is needed here
to extend the formalism and the query
answering algorithms to larger classes
of queries and to support functions
that perform updates.
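The blow-up in the filter-predicate example is easy to see in a small sketch. The code below contrasts explicit enumeration of all non-empty predicate subsets with a compact membership check that describes the same set of allowed queries. It is only an illustration of the counting argument, not the grammar-like formalism of the cited work; the function names and the predicate names are assumptions.

```python
from itertools import combinations


def enumerate_functions(predicates):
    """Explicitly list every non-empty subset of the filter predicates,
    one per function the data owner would otherwise have to publish."""
    preds = sorted(predicates)
    return [frozenset(c)
            for k in range(1, len(preds) + 1)
            for c in combinations(preds, k)]


def is_allowed(requested, predicates):
    """Compact specification: a query is allowed if it uses a non-empty
    subset of the permitted predicates."""
    requested = set(requested)
    return bool(requested) and requested <= set(predicates)


preds = {"status", "region", "date"}  # hypothetical predicate names
functions = enumerate_functions(preds)
print(len(functions), 2 ** len(preds) - 1)
```

The enumeration grows exponentially with n, whereas the membership check stays a single rule regardless of how many predicates the owner permits.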
Cloud data service integration. Since
consumers and small businesses are