Theorem 4. 1. For all schema mappings M, the following are
equivalent:
1. M is definable by a finite set of LAV dependencies.
2. M is closed under target homomorphisms, admits universal solutions, allows for conjunctive query rewriting,
and is closed under union.
Proof. The implication ( 1) Þ ( 2) follows from Theorem
3. 4 and Proposition 3. 6. We now prove the implication
( 2) Þ ( 1). The idea behind the proof is that, since M is closed
under union, universal solutions for source instances I
can be constructed out of universal solutions for parts of
I. This implies that, in defining our schema mapping, we
only need to take into account of finite number of source
instances up to isomorphism, namely, those that contain
precisely one tuple. In what follows, we will make this idea
precise.
Suppose that M satisfies the listed conditions. Let
R1,..., Rn be the relations of the source schema, and let
D be a set consisting of k distinct values, with k = maxi £ n
arity(Ri). Let facts be the set of all possible facts, of the
form Ri (d1, . . . , dl ) with i £ n, l = arity(Ri), and d1, . . . , dl Î D.
For each a Î facts, let Ia be a the source instance containing only the fact a, and let Ja be a universal solution for Ia.
Let PosDiagIa (x) be the positive diagram of Ia, i.e., the conjunction of all facts true in I (which consists of precisely
one fact) and let PosDiagJa (x, y) be the positive diagram
of Ja where x are as many variables as there are elements of
adom(Ia) and y as many variables as there are elements
of adom( Ja) \ adom(Ia). Let fa be the following LAV dependency: "x(PosDiagIa(x) ® yPosDiagJa(x, y)). Finally, let
S = {fa|a Î facts}. We claim that S defines M.
First, we prove soundness: every (I, J) Î M satisfies S.
Suppose (I, J) Î M, and take any fa Î S. Furthermore, suppose that the antecedent of fa is satisfied in (I, J) under
some variable assignment h. In other words, h is a homomorphism from Ia to I. Let q be the certain answer query
of the conjunctive query yPosDiagJa(x, y). Since M allows
for conjunctive query rewriting, q is definable by a union
of conjunctive queries. Moreover, since yPosDiagJa(x, y)
is satisfied in Ja, which is a universal solution of Ia, it is
satisfied in all solutions of Ia . In other words, q is satisfied
in Ia, and hence, in I under the assignment h. Hence, (I, J)
satisfies fa.
Next, we prove completeness: every pair (I, J) satisfying S belongs to M. Suppose (I, J) satisfies S. Write I as
I = I1È.. . È In where each Ii contains only a single fact.
Then each (Ii, J) still satisfies S. Since Ii contains a single
fact, it must be isomorphic to Ia for some a Î facts. Using
the fact that (Ii, J) satisfies fa, we can show that there is a
homomorphism from a universal solution of Ii to J, constant on adom(Ii), hence, by closure under target homomorphisms, (Ii, J) Î M. It follows by closure under union
that (I, J) Î M. ❑
Our next result characterizes schema mappings specified
by GAV dependencies.
Theorem 4. 2. For all schema mappings M, the following are
equivalent:
1. M is definable by a finite set of GAV dependencies.
2. M is closed under target homomorphisms, admits universal solutions, allows for conjunctive query rewriting,
and is closed under target intersection.
Proof. (Hint) The implication ( 1) Þ ( 2) follows from
Theorem 3. 4 and Proposition 3. 6. For the implication ( 2)
Þ ( 1), we first show that every schema mapping M satisfying ( 2) is n-modular for some n > 0. For each target
relation R, let qR = certainM(Ry), where y is a sequence of
distinct fresh variables of appropriate length. Note that,
since M allows for conjunctive query rewriting, qR can
be written as a union of conjunctive queries. Now, let n
be the maximum of the number of variables occurring in
each qR. Using the hypothesis that M admits universal
solutions, is closed under target homomorphisms, and is
closed under target intersection, it can be shown that M
is n-modular.
After this, the implication ( 2) Þ ( 1) is established along
the same lines as the proof of Theorem 4. 1. Instead of
considering all source instances consisting of one tuple,
we consider all source instances I with |adom(I)| £ n.
There are only finitely many such source instances up to
isomorphism. Moreover, it can be shown, using closure
under intersection, that each has a null-free universal solution, and hence only full s-t tgds are needed to describe
them. ❑
We now focus on schema mappings specified by arbitrary
s-t tgds. As seen in Theorem 3. 4, every schema mapping
defined by a finite set of s-t tgds is closed under target homomorphisms, admits universal solutions, and allows for conjunctive query rewriting. The next result asserts that any
schema mapping satisfying these conditions is definable by
an infinite set of s-t tgds.
Proposition 4. 3. If a schema mapping M is closed under target homomorphisms, admits universal solutions, and allows for
conjunctive query rewriting, then M is definable by an infinite
set of s-t tgds.
Proof. (Hint) Assume that M satisfies the listed properties. Consider a source instance I and a target instance
J such that J is a universal solution for I with respect to
M. For each element of adom(I), introduce a distinct variable xi, and for each element of adom( J ) \ adom(I), introduce a distinct variable yj. Define PosDiagI(x) to be the
conjunction of all atomic formulas in x true in I (under
the chosen assignment) and define PosDiagI(x, y) likewise. Finally, let S be the set of all s-t tgds fI, J of the form
"x(PosDiagI(x) ® y PosDiagJ(x,y)), where I is a source
instance and J is a universal solution for I w.r.t. M. Using
an argument analogous to the one used in the proof of
Theorem 4. 1, it can be shown that S defines M. ❑
Can Proposition 4. 3 be strengthened to a characterization of schema mappings specified by a finite set of s-t tgds?
The next result, which was proved in Fagin et al.,
9 shows that
this is not possible.