J
hom
J�
(a) Closure under target
homomorphisms
Source schema
I
J1 J2
J1 Ç J2
(b) Closure under target
intersection
Target schema
I1
I2
J1
J2
(c) Closure under union
certain answers query certainM(q) is definable by a union
of conjunctive queries over the source schema. In other
words, there is a union q¢ of conjunctive queries over the
source schema such that certainM(q)(I) = q¢(I), for every
source instance I.
The first two conditions of closure under target homomorphisms and admitting universal solutions go very well
together. As was observed in Fagin et al.,
7 if a schema mapping is closed under target homomorphisms and admits
universal solutions, then, for every source instance I, the
(typically infinite) space of all solutions of I w.r.t. M can
be completely described by just a single target instance J,
namely, by any universal solution J for I. This is so because
if J is universal for I and M is closed under target homomorphisms, then for every target instance J¢, we have that
J¢ is a solution for I if and only if there is a homomorphism
h: J ® J¢ that is constant on adom(I). We mention in passing
that these two conditions together also imply the existence
of core universal solutions, which are the smallest universal solutions (see Fagin et al.
8). Thus, these two conditions
lie at the foundation of data exchange. The third condition
of allowing for conjunctive query rewriting is important in
the context of data integration, since it implies that the certain answers of unions of conjunctive queries over the target are computable in polynomial time (in the sense of data
complexity).
It is well known that all three conditions of closure
under target homomorphisms, admitting universal
solutions, and allowing for conjunctive query rewriting
are possessed by every schema mapping M definable by
a finite set of s-t tgds. Closure under homomorphisms
follows easily from the definitions; admitting universal solutions was shown in Fagin et al.
7 using the chase
procedure. In the case of GAV dependencies, a union of
conjunctive queries over the target is easily transformed
to a union of conjunctive queries over the source by simply replacing each target relation symbol P by a union of
conjunctive queries over the source that defines P. In the
case of arbitrary s-t tgds, allowing for conjunctive query
rewriting is proved by first “decomposing” the given s-t
tgds into GAV dependencies and to LAV dependencies,
and then applying results from Abiteboul and Duschka1
and Duschka and Genesereth.
6 We collect these results
into one theorem.
Theorem 3. 4. Every schema mapping definable by a
finite set of s-t tgds is closed under target homomorphisms,
admits universal solutions, and allows for conjunctive query
rewriting.
Next, we define three additional properties of schema
mappings.
Definition 3. 5. Let M be a schema mapping.
• Closure Under target intersection: We say that M is
closed under target intersection if for all source instances
I and all target instances J1, J2 if (I, J1) Î M and
(I, J2) Î M, then also (I, J1 Ç J2) Î M. (See Figure 5b.)
• Closure Under Union: We say that M is closed under
union if (0/, 0/) Î M, and for all (I1, J1) Î M, and (I2, J2)
Î M, we have that also (I1È I2, J1ÈJ2) Î M. (See
Figure 5c.)
• n-Modularity: Let n be a positive integer. We say that M is
n-modular if whenever a pair (I, J) does not belong to M,
there is a sub-instance I ' Í I such that |adom (I¢)| £ n and
(I¢, J) does not belong to M.
Intuitively, a schema mapping is closed under union if
solutions can be constructed in a “modular” fashion, i.e., on
a tuple-by-tuple basis. Similarly, n-modularity asserts that if
(I, J) Ï M, then there is a concise explanation for this fact;
this property can also be viewed as a relaxation of closure
under union.
We now give several useful propositions about the properties we just introduced.
Proposition 3. 6. Let M be a schema mapping.
• If M is definable by a finite set of GAV dependencies, then
M is closed under target intersection.
• If M is definable by a finite set of LAV dependencies, then
M is closed under union.
• If M is definable by a finite set of s-t tgds, then M is
n-modular for some n ≥ 1.
Proof. The first two parts follow easily from the definitions. For the third part, assume that M is a schema
mapping definable by a finite set S of s-t tgds. Let n be the
maximum number of variables occurring in the left-hand
side of the s-t tgds in S. We claim that M is n-modular.
Assume that (I, J) Ï M. Then there is some s-t tgd "x(f(x) ®
$yy (x, y) ) from S and a tuple a of values from adom(I) such
that (I, J) |= f(a) Ù Ø$yy(a, y). Now, let I¢ be the sub-instance
of I containing only the values a. Then, it is still the case that
(I¢, J) |= f (a) Ù Ø$yy(a, y), and hence (I¢, J) Ï M. ❑
4. lAnGuAGe cHARAc TeRizATions
This section contains the main technical results of the
paper, which yield structural characterizations of the various schema-mapping languages considered in earlier sections. We begin with schema mappings specified by LAV