Rick Strahl's Weblog  

Wind, waves, code and everything in between...
.NET • C# • Markdown • WPF • All Things Web
Contact   •   Articles   •   Products   •   Support   •   Advertise
Sponsored by:
West Wind WebSurge - Rest Client and Http Load Testing for Windows

Multiple Enumeration of LINQ results


:P
On this page:

Quick, how many times will this LINQ query hit the database:

Stopwatch sw = new Stopwatch();
sw.Start();
 
NorthwindDataContext context = new NorthwindDataContext();
context.ObjectTrackingEnabled = false ;
 
var result = from c in context.Customers
             select c; // new { CustomerID = c.CustomerID, Company = c.CompanyName };
 
for (int x = 1; x < 1000; x++ )
    foreach (Customer c in result)
    {                    
        string val = c.CustomerID;
    }
}
Response.Write("<hr>" + sw.Elapsed.Ticks.ToString("n") + " ticks");
 
return;

You sure?

LINQ queries are not fired when they are declared, but when they  enumerated over! Objects are created as they are enumerated. As a result the above actually fires 1000 queries against the database. So although you have a list of objects retrieved, physical and in hand the IEnumerable<T> is firing multiple times every time you get a new Enumerator in the loop.

So in essence LINQ results are forward only read once results. Any subsequent Enumeration results in re execution of the full query. The above example is frivolous but there are some scenarios where you might want to loop over lists more than once. For example if you do databinding on the retrieved data but before binding you decide you want to change values. Imagine this scenario:

NorthwindDataContext context = new NorthwindDataContext();
context.ObjectTrackingEnabled = false;

IEnumerable
<Customer> result = from c in context.Customers
             select c; // new { CustomerID = c.CustomerID, Company = c.CompanyName };
 
 
foreach (Customer c in result)
{
    // *** do some data modification
    c.CompanyName = "** " + c.CompanyName;                
}
 
this.grdData.DataSource = result;
this.grdData.DataBind();

Here there's some pre-processing of the query data before databinding (again - frivolous: you could have done that as part of the query, but you get the idea). What do you expect to see in the GridView display for Company? Probably not what you're getting, which is the raw company name from a second database query *without* the ** prefix. 

This is something that will take some getting used to. It's an easy mistake to make and one that you may not notice right away but that can have severe performance implications. It's even more problematic with results returned from a middle tier because in that case the data may be indirectly created with LINQ and you may not realize that IEnumerable<T> originated through LINQ.

Luckily there's an easy workaround for this: You can use the ToList, ToArray, ToDictionary methods to create a concrete list. So the following code gives the expected result:

NorthwindDataContext context = new NorthwindDataContext();
context.ObjectTrackingEnabled = false;
IEnumerable<Customer> result = from c in context.Customers
             select c; 
 
IList<Customer> CustList = result.ToList();            
foreach (Customer c in CustList)
{
    // *** do some data modification
    c.CompanyName  = "** " + c.CompanyName;                
}
 
this.grdData.DataSource = CustList;
this.grdData.DataBind();

So in effect you need to treat a raw LINQ IEnumerable resultset as a forward only set. Once you iterate over it again it fires the query again. This is true not just for LINQ to SQL but LINQ in general.

Once you know how this works this scenario is not a problem but it is a little unexpected to the unsuspecting consumer of a list...

Posted in LINQ  Visual Studio  

The Voices of Reason


 

Matt Brooks
August 16, 2007

# re: Multiple Enumeration of LINQ results

Rick,

My answer to your first question was 999. I think Scott Guthrie has blogged about 'consuming' LINQ queries in this way, in that the data source gets hit with the query everytime the 'result' is enumerated.

Matt

Josh Stodola
August 16, 2007

# re: Multiple Enumeration of LINQ results

Definitely gonna take some getting used to! I got this post bookmarked for when I start using LINQ. Thanks...

Burns
August 21, 2007

# re: Multiple Enumeration of LINQ results

I assume using something like:

var result = context.Select(o => new { o.ID, o.Name, o.Address })
.Single(o => o.ID == 1);

Hits the database straight away, and not when I do something like:

string name = result.Name;

???

Rick Strahl
August 21, 2007

# re: Multiple Enumeration of LINQ results

Yes. Single returns a fixed instance that is immediately available. The behavior here is the same as ToList() or the other methods that create concrete instances.

Egil
August 22, 2007

# re: Multiple Enumeration of LINQ results

Interesting. I was under the impression (after reading http://www.singingeels.com/Articles/Improving_Performance_With_LINQ.aspx), that the DataContext remembers previous result sets, and if the same query is being send to the same DataContext instance, it will simply return a cached result set, instead of hitting the database again.

dave^2=-1
February 01, 2008

# LINQ enumeration gotcha

Rick Strahl has posted about enumerating over LINQ results . Essentially, each item enumerated in a LINQ

Timothy Khouri
July 13, 2008

# re: Multiple Enumeration of LINQ results

Egil - Yes, you are correct that the DataContext remembers previous results, etc. Notice that the author chose specifically to disable that functionality:

context.ObjectTrackingEnabled = false;


This post is misleading as it makes it seem that in normal circumstances that you would hit some sort of performance issue (hitting the DB twice) if you did something such as databind two grids from the same LINQ to SQL query.

Also, this post is talking about "LINQ", but really it's specific to LINQ to SQL.

West Wind  © Rick Strahl, West Wind Technologies, 2005 - 2024