writing queries in LINQ, they are what we call deferred. A
more popular way of saying that is that it uses lazy evaluation, so when you write your query, nothing happens.
When you send the query to the database and you get
back the results and iterate over them, then something
happens.
If you have side effects in your query, however, that’s
a problem. Suppose in your WHERE clause you have a side
effect or an exception is thrown. Now that side effect
happens at a very different point and place and time than
when you wrote your query. We all know that using side
effects inside queries is not a good idea, but since we now
have integrated these queries in a programming language,
it becomes quite easy for people to write side-effecting
code.
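
The timing issue described here is not specific to .NET. The following is a minimal sketch in Python rather than C#, with a generator expression standing in for a deferred LINQ query. The names `noisy_is_even` and `log` are hypothetical and exist only for illustration.

```python
log = []  # records each time the predicate actually runs


def noisy_is_even(n):
    """Predicate with a side effect: appends to `log` on every call."""
    log.append(f"checked {n}")
    return n % 2 == 0


numbers = [1, 2, 3, 4]

# Defining the query runs nothing: the generator expression is lazy.
query = (n for n in numbers if noisy_is_even(n))
calls_after_definition = len(log)

# The side effects happen only here, when the results are consumed.
results = list(query)
calls_after_iteration = len(log)
```

The predicate is not called where the query is written; it is called wherever the results happen to be iterated, which may be far away in both time and code.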
Suppose you want to open your file system, look at
all your directories, and write a query over your directory
structure, filtering out all PostScript files or Word files. In
.NET you have the disposable pattern. People would write a
using statement that opens the file system and defines a query inside it,
but since that query is deferred, by the time you’re iterating over the results, you have disposed of the object and
bad things will happen.
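
A rough Python analogue of this trap, assuming a `with` block plays the role of the .NET using statement and a generator plays the role of the deferred query. `DirectoryHandle` is a hypothetical stand-in for a disposable file-system object, not a real library class.

```python
class DirectoryHandle:
    """Hypothetical stand-in for a disposable file-system object."""

    def __init__(self, names):
        self._names = names
        self.closed = False

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        self.closed = True  # plays the role of Dispose()
        return False

    def files(self):
        """Lazily yield file names; fails if the handle is already closed."""
        for name in self._names:
            if self.closed:
                raise RuntimeError("directory handle already disposed")
            yield name


names = ["a.ps", "b.doc", "c.txt"]

# Buggy: the query is only defined inside the block and consumed after it.
with DirectoryHandle(names) as d:
    deferred = (n for n in d.files() if n.endswith(".ps"))
try:
    list(deferred)
    failed = False
except RuntimeError:
    failed = True

# Safer: force evaluation while the handle is still open.
with DirectoryHandle(names) as d:
    eager = [n for n in d.files() if n.endswith(".ps")]
```

Materializing the results inside the block is one way to avoid the problem; the cost is that the whole result set is held in memory at once.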
This combination of deferred execution and side
effects can definitely trip people up. But, again, SQL
people have known forever that you should not do side
effects in queries.
JB LINQ is a brand-new development in programming
languages. The idea that you can express your intent in
terms of higher-level query expressions is really new.
I believe that in bridging the gap between procedural
programming and set-oriented programming it’s going
to take a while for developers to move away from the
procedural aspects.
If we could actually try to minimize the number of
“for eaches” that are part of programs and think hard
about writing expressions that can be pushed as far as
they can to the database, that would be a way to write
better programs. Then, maybe even introducing compile-time checks to prevent you from using
side-effecting expressions or functions in queries would
also be a step toward helping developers avoid tripping
themselves up.
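
The contrast between a client-side loop and an expression pushed to the database can be sketched as follows. This uses Python's built-in `sqlite3` module with an in-memory database instead of LINQ, and the `orders` table and its contents are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, total REAL)")
conn.executemany(
    "INSERT INTO orders (id, total) VALUES (?, ?)",
    [(1, 20.0), (2, 150.0), (3, 75.0), (4, 300.0)],
)

# Procedural style: fetch every row, then filter in a client-side loop.
large_loop = []
for order_id, total in conn.execute("SELECT id, total FROM orders ORDER BY id"):
    if total > 100:
        large_loop.append(order_id)

# Set-oriented style: the filter is part of the query, so the database
# evaluates it and returns only the matching rows.
large_pushed = [
    row[0]
    for row in conn.execute(
        "SELECT id FROM orders WHERE total > ? ORDER BY id", (100,)
    )
]
conn.close()
```

Both versions produce the same answer, but only the second lets the database use its own indexes and planner, and only the matching rows cross the wire.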
Beyond LINQ and the primary language surface, mapping is complex. We need to figure out the best way to
explain the mapping without overwhelming the developer with massive query expressions that represent those
mappings, and we need to come up with a more graphical
way of describing what the mapping is doing. This
may actually go a long way toward helping. Complexity
is one of the challenges we need to overcome.
TC With respect to the Entity Framework, what common
mistakes do you see people making? Erik mentioned the
side effects, and that’s really easy to see. You wag your
finger at people and you say, “Don’t do that!” Does the
Entity Framework have any of those kinds of common
traps or errors that people fall into?
JB When people see that they can start modeling their
concerns at a high level of abstraction, they sometimes
overuse concepts. Having deep inheritance becomes a
common mistake. It’s very hard for people to balance the
value of inheritance with its complexities. Coming up
with best practices on that will be a very good thing for
us to do.
TC As I mentioned, I’ve been using NHibernate in a .NET
environment for about four years, and I’m curious about
your responses to some interesting things I’ve observed
with regard to LINQ and the Entity Framework.
There’s a particular style of using NHibernate that I
think is a radical shift for programmers. The style that we
use is to isolate all changes to persistent objects within an
NHibernate session. We pull objects into that session, do
some operations on them, and then commit them back
out to the database.
The result is that most of the programmers who work
for me never see a lock, which is totally different from
the way that we used to program. Previously in a multithreaded, concurrent system, you would build your
objects and explicitly encode the locking in them. You
would decide when you were going to acquire the locks
and when you were going to release them, and there were
often complicated protocols to make sure that locks were
acquired in the same order. Now with NHibernate, there
aren’t any locks at all. There literally are no locks around
our objects because all of the locking and concurrency
control is essentially deferred down into the database.
I’m interested in whether you see this as a radical shift for
programmers.
EM Maybe what you’re observing is really the power
of optimistic concurrency and transactions, and the
database world has known transactions for a long time.
In some sense, transactions are a much easier way of
dealing with concurrency than locks because transactions
still allow you to think in a serial way. That’s the whole
ACID (atomicity, consistency, isolation, durability) thing.
You can think of your world as applying some big update
and it will either work or not, and if it works it will be
isolated, and so on.
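
One common way to get optimistic concurrency without explicit locks is a version column that every update checks. The sketch below assumes that technique and uses Python's `sqlite3` module; it is not a description of how NHibernate or the Entity Framework implement it internally, and the `account` table and helper names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER, version INTEGER)"
)
conn.execute("INSERT INTO account VALUES (1, 100, 0)")
conn.commit()


def read_account(account_id):
    """Return (balance, version) as seen at read time."""
    return conn.execute(
        "SELECT balance, version FROM account WHERE id = ?", (account_id,)
    ).fetchone()


def try_update(account_id, new_balance, expected_version):
    """Write only if nobody else changed the row since it was read."""
    with conn:  # commits on success, rolls back on an exception
        cur = conn.execute(
            "UPDATE account SET balance = ?, version = version + 1 "
            "WHERE id = ? AND version = ?",
            (new_balance, account_id, expected_version),
        )
    return cur.rowcount == 1


# Two units of work read the same row without taking any lock.
balance_a, version_a = read_account(1)
balance_b, version_b = read_account(1)

first_ok = try_update(1, balance_a + 50, version_a)   # version matches
second_ok = try_update(1, balance_b - 30, version_b)  # version is stale
final_balance, final_version = read_account(1)
```

The second writer is not blocked; its update simply matches no rows, and the application can reread and retry. That is the sense in which the work either applies as a whole or does not apply at all.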