Article development led by
queue.acm.org
Big data is about more than size, and LINQ
is more than up to the task.
BY ERIK MEIJER
The World According to LINQ
Programmers building Web- and cloud-based
applications wire together data from many different
sources such as sensors, social networks, user
interfaces, spreadsheets, and stock tickers. Most of
this data does not fit in the closed and clean world
of traditional relational databases. It is too big,
unstructured, denormalized, and streaming in real
time. Presenting a unified programming model across
all these disparate data models and
query languages seems impossible
at first. Focusing on the commonalities instead of the differences, however,
reveals that most data sources accept some form of computation to filter and
transform collections of data.
Mathematicians long ago observed similarities between seemingly different
mathematical structures and formalized this insight via category theory,
specifically the notion of monads as a generalization of collections.
Languages such as Haskell, Scala, Python, and even future versions of
JavaScript have incorporated list and monad comprehensions to deal with
side effects and computations over collections. The .NET languages of
Visual Basic and C# adopted monads in the form of LINQ (Language-Integrated
Query) as a way to bridge the gap between the worlds of objects and data.
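To make the idea of a comprehension over a collection concrete, here is a minimal sketch in Python, one of the languages mentioned above. It filters and transforms a small in-memory list twice: once with a list comprehension and once with a flat-map style helper. The names `unit` and `bind`, and the sample `readings` data, are illustrative assumptions and do not come from the article or from any library.

```python
from typing import Callable, Iterable, List, Tuple, TypeVar

T = TypeVar("T")
U = TypeVar("U")


def unit(x: T) -> List[T]:
    """Wrap a single value in a one-element collection (hypothetical helper)."""
    return [x]


def bind(xs: Iterable[T], f: Callable[[T], Iterable[U]]) -> List[U]:
    """Apply f to every element and flatten the results (hypothetical helper)."""
    return [y for x in xs for y in f(x)]


# Made-up sample data: (sensor name, temperature) pairs.
readings: List[Tuple[str, float]] = [
    ("sensor-a", 21.5),
    ("sensor-b", 38.0),
    ("sensor-c", 40.2),
]

# Filter and transform with a list comprehension.
hot = [name.upper() for (name, temp) in readings if temp > 30.0]

# The same query written with bind: keep an element by returning unit(...),
# drop it by returning an empty collection.
hot_via_bind = bind(
    readings,
    lambda r: unit(r[0].upper()) if r[1] > 30.0 else [],
)

print(hot)
print(hot_via_bind)
```

Both expressions describe the same filter-and-transform query. The comprehension can be read as shorthand for the second form, which needs only a way to build a one-element collection and a way to map and flatten.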
This article describes monads and LINQ as a generalization of the
relational algebra and SQL used with arbitrary collections of arbitrary
types, and explains why this makes LINQ a compelling basis for big data.