Let's take a look at their usage in, likely, the simplest case:
var query =
from o in Query<Order>.All
where
(new {"Alex", "Alexey", "Dmitry"})
.Contains(o.Customer.FirstName)
select o;
As you suspect, such a query will be translated to SQL with IN operator.But what about this one:
var query =
from o in Query<Order>.All
where
(from n in Enumerable.Range(1, 100000)
select n.ToString())
.Contains(o.Customer.FirstName)
select o;
What SQL would you expect to see behind the scenes in this case? Will it work at all?Ok, now I'm going to astonish you completely:
var bestOrdersQuery =
from c in Query<Order>.All
let price = o.Details.Sum(o => o.UnitPrice * o.Quantity)
order by -totalPrice
select new {
Order = o,
Price = price
};
best10KOrders = bestOrdersQuery.Take(10000).ToList();
var query =
from bo in Query.Store(best10KOrders) // Note that this is List<anonymoustype>!
from c in Query<Customer>.All
where c.Orders.Contains(bo.Order)
select new {Customer = c, Price = bo.Price} into pair
group pair by pair.Customer into g
select new { Customer = g.Key, PaidInBest10KOrders = g.Sum(i => i.Price) }
This query will work as well.All I wrote implies our query engine is able to:
- Store almost any collection into a temporary table before query execution
- Maintain mapping for its items: you can use any properties of items in such collections inside LINQ queries
- Use IN in SQL instead of temporary table when this is possible. This significantly depends on provider. E.g. SQL Server does not support tuples in IN, but PostgreSQL does, so in case with SQL Server we'll be able to represent only collections of primitive types or keys by this way. There are other limitations we're going to consider, e.g. maximal parameter count and maximal comparison operations count per query.
var bestOrdersQuery =
from c in Query<Order>.All
let price = o.Details.Sum(o => o.UnitPrice * o.Quantity)
order by -totalPrice
select new {
Order = o,
Price = price
};
best10KPairs =
bestOrdersQuery.Take(10000).ToList()
.Select(i => new Pair<Order, double>(i.Order, i.Price).ToList();
var query =
from p in Query.Store(best10KPairs) // Note that we store Pair<Order, double> here!
select p; // Will not work, since actually we don't know how to materialize Pair<Order, double>
In this case we simply don't know how to materialize Pair&;ltOrder, doublevar query = from p in Query.Store(best10KPairs) // Note that we store Pair<Order, double> here! select new Pair<Order, double>(p.First, p.Second);Why I think such complex implementation of this feature is really necessary and attractive?
- Think about upcoming integration with full-text search engines. Results they return can be very large, and ideally we must be able to pass them to further processing on RDBMS in any reasonable case. I'm not sure if you know this or not, but v3.9 was able to process up to 1K results returned by Lucene.Net in case it was used per each query.
- Prefetch is one more nice application of this feature, although IN optimization is more desirable here than a version with temporary table.
- Finally, if there will be executable DML queries some day, this feature might help a lot here as well.
- Likely, initially there will be some minor lacks. E.g. IN optimization might not work in v4.1. I simply not sure if we'll be able to complete this.
- But we'll resolve them in v4.1.1 ;)
No comments:
Post a Comment