Lexical Closures, Deferred Execution and Kicker Methods

Published 06 November 07 05:20 AM | andersnoras 

Casper wrote an interesting comment to my screen cast about LINQ and Quaere auto completion which deserves an answer in the form of a blog post. As the title suggests you’re in for a ride through many different concepts in .NET, Java, LINQ and Quaere amongst other things. Fasten your seat belts!

For those of you who didn’t bother to read Casper’s comment, I’ll repeat the essence of it here:

One very big limitation of your implementation (yes I know, it is not really LINQ for Java) is that it does not actually defer execution (represent the expression until data is actually needed) which is really where the power of LINQ comes in - avoiding a bunch of temporary transient lumps of data.

Deferred execution

Let’s begin with the easy part, Quaere actually has deferred execution, but its not quite the same as with LINQ. For those of you that haven’t got the faintest idea of what “deferred execution” is let’s look at some examples.

The most common use case for LINQ to Objects is to write a query like:

int[] numbers = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9};
var nPlusOne = from n in numbers
select n + 1;

And then, at a later stage, enumerate the query result using a foreach loop or similar. Like this:

foreach (var n in nPlusOne)
{
Console.WriteLine(n);
}

Just like much of the LINQ stuff, foreach is just syntactic sugar for something similar to this…*

IEnumerator<int> enumerator = nPlusOne.GetEnumerator();
while (enumerator.MoveNext())
{
Console.WriteLine(enumerator.Current);
}

In LINQ the standard query operators aren’t invoked before a result is iterated, essentially when the MoveNext() method is invoked. Query execution is then suspended until the next iteration. Using Chris Wanstrath’s Ambition (which is a Ruby library similar to both LINQ and Quaere) terminology,the enumeration is a “kicker method” in LINQ. Quaere has a similar implementation. Quaere uses the iterator() method on Java’s Iterable interface as its “kicker method”. The design choice behind this was to be able to leverage casting to execute a query in a similar fashion as I used implicit cast operators in my event planning API. A nice side effect of this was that Quaere got deferred execution as well. When you write a query like this…

Integer[] numbers = { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9};
Iterable<Integer> nPlusOne =
from("n").in(numbers).
select("n + 1");
// ...the query isn't executed before the execution
// pointer reaches the next line.
for (Integer n: nPlusOne) {
System.out.println(n);
}

Lexical Closures

However, there is a big difference between the behavior one can expect with deferred execution in LINQ and Quaere. Consider the following LINQ query…

int[] numbers = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9};
int i = 1;
var numbersPlusI = from n in numbers
select n + i;
i++;
foreach (var n in numbersPlusI)
{
Console.WriteLine(n);
}

When you run this sample, the numbers from two to eleven will be printed, the reason is that the local variable i (which is used in the query) is incremented from one to two between the query declaration and execution. If we write the same query in Java…

Integer[] numbers = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9};
int i = 1;
Iterable<Integer> nPlusI =
from("n").in(numbers).
select(plus("n",i));
// Notice that we're using the explicit 'plus' factory
// rather than expression parsing to reference the 'i'
// variable.
i++;
for (Integer n: nPlusI) {
System.out.println(n);
}

…the numbers from one to ten will be printed. Why is this? Remember that .NET has got closures?

public delegate void Action();
// ...
int i = 1;
Action add = delegate
{
Console.WriteLine(5 + i);
};
i = 2;
add();

The add closure in the above example is bound to the variable i and since the closure isn’t invoked until the add() delegate is called, the number seven will be printed. This is roughly the same thing that happens with a LINQ query. The select n + i part of the query is the same as:

numbers.Select(n => n +i)

..which again is the same as…

Enumerables.Select(
numbers,
delegate(Integer n)
{
return n + i;
}
);

For more background on how queries are translated from C# 3.0 syntactic sugar to plain old C#, read my Bare Naked LINQ post.

Unlike .NET, Java doesn’t have closures (yet) so we cannot do the same thing there. In the previous Quaere example, the variable i was added as a Constant to the query expression tree. When we incremented i, only the local variable was changed and nothing happened to the expression tree. As you might recall, Java’s int is a primitive while the Integer class is a class wrapper around an int primitive. Integer only boxes the int so we cannot use it as a reference type. However, if we create our own wrapper class…

public class RefInteger implements Convertible, Comparable {
private int seed;
public RefInteger(int seed) {
this.seed = seed;
}

public Number toNumber() {
return seed;
}
// ...toWhateverElse() eluded for brewety..

public void inc(Number amount) {
seed+=Convert.toInteger(amount);
}
public int compareTo(Object o) {
return Comparer.compare(this,o);
}
}

…and swap our Integer for this class, we’ll have a number that is a reference type and not a primitive.

Integer[] numbers = { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9};
RefInteger i = new RefInteger(1);
Iterable<Integer> nPlusI =
from("n").in(numbers).
select(plus("n",i));
i.inc(1);
for (Integer n: nPlusI) {
System.out.println(n);
}

Now, the numbers from two to eleven are printed. The reason is that a reference to i is added to the expression tree rather than the value of i. When the value of i is changed by calling RefInteger’s inc method the reference stays the same. Quaere is able to add i with n because RefInteger implements the Convertible interface which can be used by Quaere to get a Number primitive.

Conclusion

Both LINQ and Quaere have deferred execution of queries, but because of differences in the implementation they don’t behave the same way. You can achieve LINQ-like behavior in Quaere, but it is a little cumbersome compared with the elegance of .NET’s lexical closures. This might not be as bad as it sounds, as some people been surprised by LINQ’s behavior.

* Geeky fact: Types aren’t required to implement the IEnumerable interface to be usable with foreach. All that is required is that the GetEnumerator() is present, and that this method returns a type with a MoveNext() method and Current property.

Filed under: , , ,

Comment Notification

If you would like to receive an email when updates are made to this post, please register here

Subscribe to this post's comments using RSS

Comments

# Casper said on November 6, 2007 6:43 PM:

Thanks for taking the time to explaining the difference, this is pure gold - rare to find insightful stuff about Java and .NET in the same blog! I sincerely hope you will keep pushing Quaere - perhaps even some Quaere-to-database stuff. (One of the annoying things about i.e. JPA is that it is not type safe and you do not get any kind of code completion. But to explore that, I guess you really need type inference in Java.)

# Better Software Development said on December 2, 2007 5:54 PM:

While reviewing Martin Fowlers upcoming DSL book I started thinking (again) about closures in Java and the current overly verbose syntax for Single Abstract Method Interfaces. And then there was the blog entry of Anders Norås about Deferred Execution

# dave^2=-1 said on February 1, 2008 2:22 AM:

Reading Anders' post on Lexical Closures, Deferred Execution and Kicker Methods with respect to LINQ

Leave a Comment

(required) 
(optional)
(required) 
Enter the code you see below