Java or C#) other than the native SQL.
The paper also addresses techniques
for solving large ILP problems using offline partitioning and approximation
techniques to break down the global ILP
instance into smaller ILP sub-problems.
However, while their offline partitioning
is a good physical design optimization
to have in the repertoire, its applicability also depends on the characteristics of
the production workload on the system.
Adding any new functionality to
a query language as rich as SQL has
complex trade-offs. Issues that influence such a decision are ease of specification of the new functionality in the
query, execution efficiency of the enriched query system, data movement,
and increased software complexity of
the database systems. Moreover, even
when a new functionality is incorporated, there is a question of whether the
core SQL should be enriched like other
examples in (c), as suggested by this paper, or if the functionality should be incorporated strictly via the extensibility
mechanisms. Specifically, in this case,
an alternative to extending SQL will be
to have a separate domain-specific language (potentially using a syntax like
that of package queries), interpreted by
the ILP solver runtime, and integrated
with the database system.
If you are interested in the topic of
constraint specification and optimization over data stored in databases, this
paper is sure to interest you. Also, it is
worth a read for anyone who wants to
consider adding extensions to SQL to
ease application tasks, as the authors
illustrate the key dimensions of what it
takes to add any new functionality to relational querying: language extension,
changes to the query execution engine,
and techniques to cope with scale.
Surajit Chaudhuri is a Distinguished Scientist at Microsoft
Research, Redmond, WA, USA.
Copyright held by author.
RELATIONAL QUERY LANGUAGES have
enabled the programmer to express
queries using a logical model of data
without any knowledge of the under-
lying physical structures. To help ap-
plications realize the benefits of such
declarative querying of data fully, there
has been much work along the follow-
ing three dimensions:
Language integrated query (LINQ) is an
elegant example of integration where
query expressions are introduced as a
first-class citizen in the programming
languages to avoid the above problem,
and a mapping tool (LINQ to SQL) trans-
lates language-integrated queries into
SQL for the database backend. More re-
cently, databases have been exposing a
REST API for the ease of mobile and web
b) Modern database systems pro-
vide extensibility so that applications
programmers are not limited to using
the built-in types and functions in SQL.
All major database systems support us-
er-defined functions that may be used
in selection, aggregation, or table ex-
pressions in a query. These user-defined
functions (potentially with parameters)
are written in native SQL or program-
ming languages for which the database
server provides runtime support. Such
extensibility mechanisms have been
used by database systems to add sup-
port for data types such as geospatial.
c) The SQL standard has added
new operators and constructs to make
declarative querying in relational languages more convenient or expressive,
for example, recursion, window functions, grouping sets, within group.
Despite the advances that have already taken place along these three
dimensions, there continues to be
proposals from time to time to further enrich functionality of relational databases to support important
classes of applications.
The following paper by Brucato et al.
is one such proposal for making relational databases do more. It makes a case
for marrying the well-established paradigms of constrained optimization (
specifically, ILP or integer linear programming) and traditional SQL querying.
The challenge of augmenting query
languages with the power of specifying constraints has been well studied
in the literature, both in the context of
database querying as well as logic programming. Earlier research has studied schemes for adding constraints on
individual rows (beyond simple selection) as well as aggregate constraints
that the set of answer rows to a query
must satisfy collectively. Introduction
of aggregate constraints makes query
evaluation especially challenging. The
paper demonstrates that when you add
an optimization criterion to a query
language with aggregate constraints
to choose among qualifying sets of answer sets, the query evaluation can be
accomplished by a combination of the
relational query execution engine and
an off-the-shelf ILP solver.
The authors explain how such queries may be specified declaratively (
referred to as package queries). These
package queries are evaluated by first
executing the traditional relational part
of the query and then mapping the constraint satisfaction and objective criterion as an instance of the ILP problem.
The extensibility features of the database system, as explained in (b), may
be used to add such an ILP solver to the
database systems just like the support
for user defined functions written in
programming languages (for example,
To Do or Not to Do:
Extending SQL with Integer
By Surajit Chaudhuri
To view the accompanying paper,