Alex Yakunin

November 30, 2009

DataObjects.Net v4.1 release: 1 day delay

Now it's obvious I'm unable to finish my part of the work in time - there are still many TODOs and parts to review in the Manual. So now I'm going to sleep, and will continue my work a bit later today.

On the other hand... I'm glad to see what's appearing there. About 500Kb of HTML in the Manual is actually even a bit more than I expected to see. There are lots of examples!

November 29, 2009

DataObjects.Net v4.1 features: nested transactions + cheat sheet

I just added the following example to the manual:
// Building the domain
var domain = BuildDomain();

Key dmitriKey;
int dmitriId;
string dmitriKeyString;

// Opening Session
using (Session.Open(domain)) {

  // Opening transaction
  using (var transactionScope = Transaction.Open()) {

    // Creating user
    var dmitri = new User {
      Name = "Dmitri"
    };

    // Storing entity key
    dmitriKey = dmitri.Key;
    dmitriKeyString = dmitriKey.Format();
    dmitriId = dmitri.Id;

    Console.WriteLine("Dmitri's Key (human readable): {0}", dmitriKey);
    Console.WriteLine("Dmitri's Key (serializable): {0}", dmitriKeyString);
    Console.WriteLine("Dmitri's Id: {0}", dmitriId);

    // Marking the transaction scope as completed to commit it 
    transactionScope.Complete();
  }

  // Opening another transaction
  using (var transactionScope = Transaction.Open()) {
    // Parsing the serialized key
    var anotherDmitriKey = Key.Parse(Domain.Current, dmitriKeyString);
    // Keys are equal
    Assert.AreEqual(dmitriKey, anotherDmitriKey);

    // Materialization on fetch
    var dmitri = Query<User>.Single(dmitriKey);
    // Alternative way to do the same
    var anotherDmitri = Query<User>.SingleOrDefault(dmitriKey);
    Assert.AreSame(dmitri, anotherDmitri);
    // Fetching by key value(s)
    anotherDmitri = Query<User>.Single(dmitriId);
    Assert.AreSame(dmitri, anotherDmitri);

    // Modifying the entity
    dmitri.Name = "Dmitri Maximov";

    // Opening new nested transaction
    using (var nestedScope = Transaction.Open(TransactionOpenMode.New)) {
      // Removing the entity
      dmitri.Remove();
      // Accessing a removed entity throws
      AssertEx.Throws<InvalidOperationException>(() => {
        var dmitryName = dmitri.Name;
      });
      // No nestedScope.Complete(), so nested transaction will be rolled back
    }

    // Transparent Entity state update
    Assert.AreEqual("Dmitri Maximov", dmitri.Name);

    // Creating a few more objects
    var xtensiveWebPage = new WebPage {
      Title = "Xtensive Web Site", 
      Url = ""
    };
    var alexYakuninBlogPage = new WebPage {
      Title = "Alex Yakunin's Blog", 
      Url = ""
    };
    var subsonicPage = new WebPage {
      Title = "SubSonic project page", 
      Url = ""
    };

    // Adding the items to EntitySet
    dmitri.FavoritePages.Add(xtensiveWebPage);
    dmitri.FavoritePages.Add(alexYakuninBlogPage);
    dmitri.FavoritePages.Add(subsonicPage);

    // Removing the item from EntitySet
    dmitri.FavoritePages.Remove(subsonicPage);

    // Getting count of items in EntitySet
    Console.WriteLine("Dmitri's favorite page count: {0}", dmitri.FavoritePages.Count);
    Assert.AreEqual(2, dmitri.FavoritePages.Count);
    Assert.AreEqual(2, dmitri.FavoritePages.Count()); // The same, but by LINQ query

    // Enumerating EntitySet
    foreach (var page in dmitri.FavoritePages)
      Console.WriteLine("Dmitri's favorite page: {0} ({1})", page.Title, page.Url);

    // Checking for containment
    Assert.IsTrue(dmitri.FavoritePages.Contains(xtensiveWebPage));
    Assert.IsFalse(dmitri.FavoritePages.Contains(subsonicPage));

    // Opening new nested transaction
    using (var nestedScope = Transaction.Open(TransactionOpenMode.New)) {
      // Clearing the EntitySet
      dmitri.FavoritePages.Clear();
      Assert.AreEqual(0, dmitri.FavoritePages.Count);
      Assert.AreEqual(0, dmitri.FavoritePages.Count()); // By query
      // No nestedScope.Complete(), so nested transaction will be rolled back
    }

    // Transparent EntitySet state update
    Assert.AreEqual(2, dmitri.FavoritePages.Count);
    Assert.AreEqual(2, dmitri.FavoritePages.Count()); // The same, but by LINQ query

    // Finally, let's query the EntitySet:
    // Query construction
    var dmitryFavoriteBlogs =
      from page in dmitri.FavoritePages
      where page.Url.ToLower().Contains("blog")
      select page;
    // Query execution
    var dmitryFavoriteBlogList = dmitryFavoriteBlogs.ToList();

    // Printing the results
    Console.WriteLine("Dmitri's favorite blog count: {0}", dmitryFavoriteBlogList.Count);
    foreach (var page in dmitryFavoriteBlogList)
      Console.WriteLine("Dmitri's favorite blog: {0} ({1})", page.Title, page.Url);

    Assert.AreEqual(1, dmitryFavoriteBlogList.Count);

    // Marking the transaction scope as completed to commit it 
    transactionScope.Complete();
  }
}
Its output:
Dmitri's Key (human readable): User, (1)
Dmitri's Key (serializable): Xtensive.Storage.Manual.EntitySets.TestFixture+User,
  Xtensive.Storage.Manual, Version=, Culture=neutral, PublicKeyToken=null: 1
Dmitri's Id: 1
Dmitri's favorite page count: 2
Dmitri's favorite page: Alex Yakunin's Blog (
Dmitri's favorite page: Xtensive Web Site (
Dmitri's favorite blog count: 1
Dmitri's favorite blog: Alex Yakunin's Blog (

So Denis Krjuchkov has recently done a great job ;) Nested transactions are fully operable.

Btw, this sample is a kind of "cheat sheet" for basic operations on Entities and EntitySets.

Using DataObjects.Net with LINQPad

LINQPad is a very nice tool for testing LINQ queries. Unfortunately, there is no public extension API allowing DataObjects.Net to be integrated there natively. But there are workarounds.

Let's start from screenshots:

It is possible to make it show SQL for queries, but I recommend using SQL Profiler for this (it shows batches):

You may notice the queries shown here are taken from nearly the same example as the one I reviewed in my previous post:
  var query = 
    from customer in Query<Customer>.All
    select new {
      Customer = customer,
      Orders = customer.Orders,
      First5Orders = (
        from order in Query<Order>.All
        where order.Customer==customer
        orderby order.Id
        select order).Take(5)
    };
  string.Format("Total entities: {0}", query.Count()).Dump();
  foreach (var item in query) {
    var subqueryResult = item.First5Orders.ToList(); 
    // Actual execution must happen here, 
    // but see the comments below.
    string.Format("{0} in first 5, {1} total",
      subqueryResult.Count, item.Orders.Count).Dump();
  }

The SQL Profiler output above perfectly shows how we prefetch EntitySets. 32 is the default count of items to prefetch per EntitySet. If there are more items, they'll be loaded by an additional query (or you can specify the limit using the prefetch API).
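The limited-prefetch behavior just described can be sketched in a few lines. This is an illustrative Python model, not the actual DataObjects.Net implementation, and `PrefetchedSet` is a made-up name:

```python
PREFETCH_LIMIT = 32  # default items prefetched per EntitySet

class PrefetchedSet:
    # Toy model of a prefetched collection: the first 32 items arrive
    # with the batched query; the tail is loaded only on enumeration.
    def __init__(self, all_items):
        self._source = all_items                   # stands in for the database
        self._cached = all_items[:PREFETCH_LIMIT]  # loaded by the batched query
        self._complete = len(all_items) <= PREFETCH_LIMIT
        self.extra_queries = 0                     # round trips beyond the prefetch

    def __iter__(self):
        if not self._complete:
            # More items than the limit: one additional query loads the tail.
            self.extra_queries += 1
            self._cached = list(self._source)
            self._complete = True
        return iter(self._cached)

small = PrefetchedSet(list(range(10)))
large = PrefetchedSet(list(range(100)))
assert list(small) == list(range(10)) and small.extra_queries == 0
assert list(large) == list(range(100)) and large.extra_queries == 1
```

Enumerating the large set a second time costs nothing: the tail is already cached.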

So, how to “teach” LINQPad to run DataObjects.Net queries?

The fastest way: download the archive, extract DO4Template.linq from it, open it in LINQPad and go to step 11. DO4Template.linq is a regular text file with the following content:
<Query Kind="Program">
  <Reference>&lt;ProgramFiles&gt;\Reference Assemblies\Microsoft\Framework\.NETFramework\v3.5\Profile\Client\WindowsBase.dll</Reference>
</Query>

// You must run the DataObjects.Net Console Sample before using this template!

public Domain BuildDomain() {
  var c = new DomainConfiguration();
  c.ConnectionInfo = new UrlInfo(/* connection URL */);
  c.UpgradeMode = DomainUpgradeMode.Validate;
  return Domain.Build(c);
}

public static void Display<TEntity>(IQueryable<TEntity> query) {
  using (Xtensive.Storage.Rse.Providers.EnumerationScope.Open()) {
    var recordSet = ((Queryable<TEntity>) query).Compiled;
    // ...
  }
}

void Main() {
  var domain = BuildDomain();
  using (Session.Open(domain))
  using (var scope = Transaction.Open()) {
    Query();
    // scope.Complete();
  }
}

void Query() {
  var query = 
    from c in Query<Contact>.All
    orderby c.FirstName, c.LastName
    select new {
      CompanyName = c.Company.Name
    };
  Display(query); // Our own Display method
  query.Dump(); // Standard LINQPad method
}

If you would like to know how this file was created, here are the steps:

1. Open LINQPad and select “File” – “New Query” (Ctrl-N) there:

2. Open “Query” - “Advanced Query Properties” (F4):

3. Click “Add” on the “Additional References” tab there, and then - “Browse”:

4. Navigate to your DataObjects.Net installation folder and open “Bin\Latest” folder there:

5. Select all the assemblies there, and add them:

6. Add reference to WindowsBase.dll (from .NET 3.5) and System.Transactions.dll (from .NET 2.0):

7. Add reference to your model and all its dependencies. In our case it will be Xtensive.Storage.Samples.Model and its dependencies:

8. Go to “Additional Namespace Imports” tab and add the following namespaces there:
The last namespace there is the namespace of your model.

9. Select “Language” - “C# Program” in LINQPad query window:

10. Add the following BuildDomain, Display, Main and Query methods:
public Domain BuildDomain() {
  var c = new DomainConfiguration();
  c.ConnectionInfo = new UrlInfo(/* connection URL */);
  c.UpgradeMode = DomainUpgradeMode.Validate;
  return Domain.Build(c);
}

public static void Display<TEntity>(IQueryable<TEntity> query) {
  using (Xtensive.Storage.Rse.Providers.EnumerationScope.Open()) {
    var recordSet = ((Queryable<TEntity>) query).Compiled;
    // ...
  }
}

void Main() {
  var domain = BuildDomain();
  using (Session.Open(domain))
  using (var scope = Transaction.Open()) {
    Query();
    // scope.Complete();
  }
}

void Query() {
  var query = 
    from c in Query<Contact>.All
    orderby c.FirstName, c.LastName
    select new {
      CompanyName = c.Company.Name
    };
  Display(query); // Our own Display method
  query.Dump(); // Standard LINQPad method
}

11. Run the original application to populate the data. In our case this is Console Sample (don’t forget to select “Microsoft SQL Server” there):

12. Press F5 to run the program we just wrote. You must see the following output:

That’s it. Now you can write your own queries and dump their results there. But remember:
  • The Dump() method in LINQPad dumps all the references recursively. This means it will get stuck on paired associations, and will dump a part of Domain.Model while dumping the Entity.Type property.
  • This explains why we use anonymous type projections in our queries: they restrict Dump()'s greediness.
  • As you see, it's pretty easy to write your own Dump method here.
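For instance, a depth-limited, cycle-safe dump along those lines is only a few lines long. This is a Python sketch of the idea, not the C# helper used above:

```python
def dump(obj, max_depth=2, _depth=0, _seen=None):
    # Depth-limited, cycle-safe dump: unlike a fully recursive Dump(),
    # it cannot get stuck on paired associations.
    if _seen is None:
        _seen = set()
    if id(obj) in _seen:
        return "<cycle>"
    if _depth >= max_depth or not hasattr(obj, "__dict__"):
        return repr(obj)
    _seen.add(id(obj))
    fields = ", ".join(
        f"{name}={dump(value, max_depth, _depth + 1, _seen)}"
        for name, value in vars(obj).items())
    return f"{type(obj).__name__}({fields})"

class Page:
    def __init__(self, title):
        self.title = title
        self.owner = None

class User:
    def __init__(self, name):
        self.name = name
        self.page = None

u, p = User("Dmitri"), Page("Blog")
u.page, p.owner = p, u  # paired association -> a cycle
print(dump(u))  # User(name='Dmitri', page=Page(title='Blog', owner=<cycle>))
```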
Note that such an approach does not bring any huge benefits in comparison to, e.g., running the same queries in a separate project in Visual Studio. You get the ability to visualize any query result, and that's all. But I think this is pretty good for testing queries quickly. All you need to do is create your own .linq file for your project.

November 27, 2009

New feature of DataObjects.Net v4.1: subquery batching

We implemented this feature a few weeks ago. Imagine we execute the following code:
var query = 
  from customer in Query<Customer>.All
  select new {
    Customer = customer,
    First5Orders = (
      from order in Query<Order>.All
      where order.Customer==customer
      orderby order.Id
      select order).Take(5)
  };
var queryResult = query.ToList(); // Actual execution
Console.WriteLine("queryResult.Count: {0}", queryResult.Count);
foreach (var item in queryResult) {
  var subqueryResult = item.First5Orders.ToList(); 
  // Actual execution must happen here, 
  // but see the comments below.
  Console.WriteLine("subqueryResult.Count: {0}", subqueryResult.Count);
}
As you see, this is a typical case where you'd normally get 1+N queries:
  • The first query is the main one
  • All the others are its subqueries. As far as we know, any other ORM will execute each particular one of them on an attempt to enumerate it.
So e.g. if queryResult.Count==90, you'd get 91 queries - a particular example of the "Select N+1" issue.

But DO4 will send just 6 batches!

The first one is:
SELECT ... [a].[Phone], [a].[Fax] 
FROM [dbo].[Customers] [a];
All the subsequent ones look like this:
exec sp_executesql N'SELECT TOP 5 [a].[OrderId], 
[a].[TypeId], [a].[ProcessingTime], [a].[ShipVia.Id], [a].[Employee.Id], [a].[Customer.Id], 
[a].[OrderDate], [a].[RequiredDate], [a].[ShippedDate], [a].[Freight], [a].[ShipName], 
[a].[ShippingAddress.StreetAddress], [a].[ShippingAddress.City], [a].[ShippingAddress.Region], 
[a].[ShippingAddress.PostalCode], [a].[ShippingAddress.Country] FROM [dbo].[Order] [a] 
WHERE ([a].[Customer.Id] = @p1_0) ORDER BY [a].[OrderId] ASC;

-- ...
-- A set of similar queries is skipped to shorten the output
-- ...

SELECT TOP 5 [a].[OrderId], [a].[TypeId], [a].[ProcessingTime], [a].[ShipVia.Id], 
[a].[Employee.Id], [a].[Customer.Id], [a].[OrderDate], [a].[RequiredDate], [a].[ShippedDate],
[a].[Freight], [a].[ShipName], [a].[ShippingAddress.StreetAddress], [a].[ShippingAddress.City], 
[a].[ShippingAddress.Region], [a].[ShippingAddress.PostalCode], [a].[ShippingAddress.Country] 
FROM [dbo].[Order] [a] WHERE ([a].[Customer.Id] = @p16_0) ORDER BY [a].[OrderId] ASC;
',N'@p1_0 nvarchar(5),@p2_0 nvarchar(5),@p3_0 nvarchar(5),@p4_0 nvarchar(5),@p5_0 
nvarchar(5),@p6_0 nvarchar(5),@p7_0 nvarchar(5),@p8_0 nvarchar(5),@p9_0 nvarchar(5),@p10_0 
nvarchar(5),@p11_0 nvarchar(5),@p12_0 nvarchar(5),@p13_0 nvarchar(5),@p14_0 nvarchar(5),@p15_0 
nvarchar(5),@p16_0 nvarchar(5)',

As you see, we execute such subqueries as future queries - i.e. they're performed in batches. This does not mean we materialize the whole query result at once - instead, we process it part by part:
  • When you pull out the first item, we materialize the first 16 items & cache them. If there are subqueries, they're processed as future queries transparently for you.
  • When you pull out the 16th item, we materialize 32 more of them at once in the same fashion.
  • And so on; the maximal size of such a bulk is 1024.
  • Note that we called .ToList() here, so the result was actually fully enumerated at that moment, and thus all the batches were executed during .ToList() processing. But if we'd used it in a foreach loop and broken out of it, only a part of the result would be materialized.
So such a materialization process allows us to optimize the interaction with the RDBMS (reduce the chattiness) transparently for you. The process is fully recursive - so e.g. if a subquery contains other subqueries, they'll be resolved in the same fashion. Moreover, if you select an EntitySet in the final selector, it is prefetched in the same way.
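Assuming the bulk doubles each time (which matches the 16-then-32 progression above) and is capped at 1024, the chunk sizes can be sketched like this - an illustration of the growth pattern, not the actual implementation:

```python
def materialization_windows(total, first=16, cap=1024):
    # Yields the sizes of the chunks materialized as the result is
    # enumerated: 16 items first, then 32, doubling up to the 1024 cap.
    window, produced = first, 0
    while produced < total:
        size = min(window, total - produced)
        yield size
        produced += size
        window = min(window * 2, cap)

assert list(materialization_windows(100)) == [16, 32, 52]
assert list(materialization_windows(30)) == [16, 14]
```

Breaking out of a foreach after the 10th item would thus cost only the first 16-item chunk.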

So this is a good alternative to prefetch API.
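To put rough numbers on the savings: treating each batch as one round trip, with 16 subqueries per batch as in the profiler output above, the arithmetic looks like this. The figures are illustrative - the real batch composition also depends on the materialization windows:

```python
import math

def select_n_plus_1(item_count):
    # Classic ORM behavior: one main query, then one query per subquery.
    return 1 + item_count

def batched_round_trips(item_count, subqueries_per_batch=16):
    # Batched behavior: one main query, then subqueries grouped into batches.
    return 1 + math.ceil(item_count / subqueries_per_batch)

assert select_n_plus_1(90) == 91      # the "Select N+1" case from the text
assert batched_round_trips(80) == 6   # 1 main query + 5 batches of 16
```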

November 24, 2009

Visual tools: model designer & profiler

Finally I decided to surrender and start following the "visual path". Visual tools are what people expect now - frequently it does not matter how good your product / API is: no visual tools = there is nothing to speak about. Or, put differently: if you're so cool, why don't you deliver the visual designers which are commonly expected? And you know, it's impossible to argue with this. It's simply clear we need them.

The next question is what people really expect from us. Currently I see the following weak areas:

Visual model designer

That's what business analysts and beginners expect first of all. We're going to build it over Visual Studio DSL Tools. I always thought this was a pretty hard problem, but after looking at DSL Tools closer, I discovered it must be much simpler than I thought. Moreover, MEScontrol developers successfully use a model designer built by MESware for integrators (it generates v3.9 models) - it is based on DSL Tools. So there is already a kind of prototype we can look at. Let's describe what our designer must do:
  • Design persistent models - add, edit and remove persistent classes (Entities, Structures and EntitySets) and properties there;
  • Support application of all DataObjects.Net attributes, such as [Association]; it must be possible to define mappings there as well.
  • Provide customizable T4 template-based code generation of entity code. Obviously, we'll generate partial classes you can extend with your own code.
  • Support modelling of IUpgradeHandlers, and, likely, even their automatic updates on changes in the model.
  • It must be possible to reference externally defined persistent types there, including custom-typed EntitySets. This will allow building separate models for each part of the application.
  • It should support reverse engineering - a feature allowing to (re)generate the model from existing database.
Query profiler

Possibly, this is an even more important part. We're going to combine the query debugging features of the profiler we have in v3.9 (if you don't know it, it is very similar to LINQPad) and the tracing features of NHProf to make the process of debugging DO4-based applications really simple and productive. We expect it to provide the following features:
  • Possibility to attach a profiler to any remote DataObjects.Net Domain, if profiling is enabled in its configuration.
  • Event tracing. Nearly the same UI as in SQL Server Profiler, although I'd like to see better categorization, filtration and grouping features there.
  • Event analysis. Basically, we must be able to attach integrated & custom analyzers to the event streams we produce, and I hope Rx Framework will help us a lot here. Since results of analysis are event streams as well, their visualization must be similar to event tracing.
  • Custom code execution. Yes, I'd like this to be possible. The DataObjects.Net profiling API must allow the profiler to push C# \ VB.NET code to the server and execute it there, capturing all the events.
  • Result visualization. Such custom code must be able to return results back to the profiler for visualization. I feel this is one of the most important and complex parts there: I'd like to see which properties of persistent objects are loaded and which are not, explore the relations there, support very large collections of entities, and, moreover, I'd like to be able to edit everything there.
  • Likely, later it will support other ORM tools. But our initial goal is to perfectly support just DO4.
So the profiler must act not just as a tracing & debugging tool, but nearly as SQL Server Management Studio: in fact, it allows you to do everything except changing the model.

Timeframe: we're going to start work on both parts at the beginning of December; the visual designer is the #1 priority, so I hope we'll be able to show its alpha by the end of this year. Likely, this will delay some of the planned v4.2 features, but not by much: I hope Alex Ilyin (the LiveUI author; he will join the DO4 team for a few months) will help us a lot with this.

Any ideas and opinions are welcome.

Nightly builds are back again ;)

Likely, you have noticed there were no nightly builds last week. Now the issue is fixed, and they're back again. Moreover, the v4.1 installer is updated to today's nightly build - I assure you it is very stable (~6-7 tests are failing there, depending on configuration - that's normal, they are tests for work in progress and a few known issues).

If you have some time, please download and install the latest build: your feedback would significantly help us to deliver a stable v4.1 release at the end of this week. As you know, the installer was one of the most disappointing parts we had so far, and I hope the current one won't suffer from the problems of its predecessors. Please compare your own installation results with the expected ones.

I know what I suggest is quite similar to "help yourselves" ;) But... I feel there is no other good way to do this. We have a limited set of configurations, and although everything works locally, I still get reports & fix some pretty strange issues there. E.g. a few days ago, one of the DO4 users helped us to identify a serious bug preventing DO4 from installing on 32-bit Vista. I can't imagine why there was a "HKLM\SOFTWARE\Wow6432Node\Microsoft\VisualStudio\9.0" subkey on 32-bit Windows, but it is the reason for the problem. And I suspect this isn't a unique case - a similar issue was described at our support forum earlier, but at that time we were unable to identify the bug. So testing an installer for a complex framework is complex. But I hope we'll accomplish this - at least this way ;)

DataObjects.Net "exit poll": conclusions

If you ever uninstalled DataObjects.Net, you know we run an "exit poll" there. And so far it has constantly highlighted the two most annoying categories of issues:
  • Documentation. Mainly, you indicated there must be a manual you can read sequentially.
  • Installer. It was quite buggy. A month ago there were issues even on 32-bit Vista; users of 64-bit Windows had almost zero chances of getting DO4 samples running.
I'm glad to say both of these issues should disappear with the release of v4.1. As you know, we're working on the Manual - it is missing just a few chapters now. The installer was quite significantly improved during October-November; its current version is free of all the identified bugs. And... I hope you'll help us to test it.

So upcoming v4.1 looks promising. I hope you'll like using it ;)

P.S. Actually we made one more conclusion: we need a visual designer and support for reverse engineering (conversion of generally arbitrary schema to our persistent types). But that's the topic for the next post.

November 20, 2009

Microsoft StreamInsight vs Rx Framework

A few days ago, Microsoft StreamInsight was presented at the Urals .NET User Group. Nikita Samgunov, the author of the presentation, made a really good overview of its features and architecture, so it was really interesting.

But I left it confused: the same day the Rx Framework became available, so I had a chance to look at it much closer - and I was really impressed. It was clear StreamInsight solves an almost identical problem, but there are significant differences:
  • StreamInsight is positioned as a CEP framework. Rx is positioned as a general purpose event processing framework. This must mean StreamInsight should solve some specific problems much better than Rx. But at first glance, this is arguable.
  • StreamInsight uses LINQ-to-CepStream. CepStream is IQueryable (we checked this - e.g. it fails on translation of ToString() \ GetHashCode()), so there is a LINQ translator for it. But Rx uses custom LINQ extension methods over the IObservable monad. So StreamInsight builds a program transforming CepStreams while compiling the expression stored in IQueryable, but Rx does the same "on the fly" - the event crunching machine gets built while LINQ combinators are applied to IObservables one after another.
  • It seems StreamInsight adds implicit conditions in some cases, such as "event intervals must overlap" for joins. I'm not fully sure this is really correct, since I didn't study StreamInsight so closely, but at least it looked so. Of course, the same is possible with Rx, but you must do this explicitly.
But there are many similarities as well:
  • Both frameworks run all the calculations in memory. There is no persistent state.
  • Both frameworks can be hosted inside any application.
  • Both frameworks are ready for concurrent event processing. It seems StreamInsight executes everything in parallel by default; the same is possible in Rx, but what's more important, concurrency is fully controllable there (btw, that's a really impressive advantage of Rx: events are asynchronous by their nature, so concurrency looks much more natural there than e.g. in PLINQ).
After summarizing all this stuff for myself, I came to the following conclusion: Rx is definitely worth studying (it seems I wrote that earlier ;) ), but StreamInsight... is, possibly, a dead evolutionary chain.

First of all, Rx seems much more powerful and generic. I really can't imagine why I should compile the queries. If there is Rx, the approach provided by StreamInsight looks like writing enumerable.ToQueryable().[Your query].ToEnumerable() instead of just enumerable.[Your query].
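A toy model makes the difference tangible: push-based combinators compose "on the fly", with no expression tree or translator in between. Nothing below is the real Rx API - just a minimal sketch of the idea:

```python
class Observable:
    # Toy push-based stream: subscribing wires up the whole combinator chain.
    def __init__(self, subscribe):
        self._subscribe = subscribe  # function taking an on_next callback

    def subscribe(self, on_next):
        self._subscribe(on_next)

    def where(self, predicate):
        # Combinator applied "on the fly": returns a new Observable whose
        # subscription filters events before pushing them downstream.
        return Observable(lambda on_next: self.subscribe(
            lambda value: on_next(value) if predicate(value) else None))

    def select(self, projection):
        return Observable(lambda on_next: self.subscribe(
            lambda value: on_next(projection(value))))

def from_iterable(items):
    # Pushes each item to the subscriber; a stand-in for a real event source.
    def subscribe(on_next):
        for item in items:
            on_next(item)
    return Observable(subscribe)

received = []
(from_iterable(range(10))
    .where(lambda x: x % 2 == 0)
    .select(lambda x: x * x)
    .subscribe(received.append))
assert received == [0, 4, 16, 36, 64]
```

No query is ever "compiled" here: each combinator just wraps the previous subscription, which is exactly the contrast with an IQueryable translator.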

Moreover, I clearly understand how complex this problem is - writing a custom LINQ translator. Most developers are able to write their own extension methods for IEnumerable; many are capable of rewriting the whole LINQ to Enumerable. But there are just a handful of teams that have written their own LINQ translator. So the presence of this layer in StreamInsight looks like unnecessary complexity. If so, it will evolve much slower, will be more complex to extend (e.g. with Rx you can use custom methods without any additional code; the same won't work with StreamInsight), etc.

On the other hand, there is a statement from the leaders of both teams saying that both frameworks are useful.

I think Erik Meijer should simply say: "occasionally we kicked the StreamInsight team's ass" - but certainly, that's not really possible ;) Phrases like "It (Rx) is particularly useful to reify discrete GUI events and asynchronous computations as first class values." look especially funny if you've seen any videos with Erik Meijer on Rx (this one is a very good intro): the Rx application area is much wider than just GUI event processing. In fact, Rx offers a new language (a DSL inside C#, F# and so on) allowing you to describe asynchronous computations much more naturally (or distributed ones, such as Paxos). Taking into account that we can now only increase CPU count (or machine count), but not their frequency, the appearance of Rx seems very important. If you watched the video, you should remember Erik Meijer said that, likely, he can retire right now - he's almost fully sure this is the nicest abstraction he has invented.

So... Study Rx, not StreamInsight ;)

P.S. Amazing how few people (or maybe even just a single person) may change the way we think. And how fast a new technology they built can kill the technology it originated from.

Silverlight 4: binary assembly compatibility with .NET 4!

That's the feature I always wanted to see! This means now we can use the same .csproj files to target both .NET 4 and Silverlight. Earlier it was much more complex.

Here is the full list of new features in Silverlight 4 beta.

November 19, 2009

Status update: v4.1 release is scheduled for the next week, plans for v4.2

We're going to release v4.1 next week.
One more piece of good news is that a part of our team has already switched to the next set of features, scheduled for v4.2:
  • Nested transactions. They'll allow you to roll back a part of the changes you made. As before, all the modifications made to entities will be rolled back transparently.
  • Global cache. Mainly, we're going to provide an API allowing you to plug in any implementation there. Initially we'll support an integrated LRU cache (nearly as in v3.9) + Velocity. A flexible expiration policy, cacheable queries, and version checks only on writes are among the features we're going to implement here. Btw, the most complex part we need here is already done: DisconnectedState utilizes exactly the same API (SessionHandler replacement) to make the Session expose the data it caches.
  • Localization. Initially it will be impossible to use different collations for different localizations, but the API itself will be much better (and much more explicit) than in v3.9.
  • Access control system. As you may expect, we are going to provide an API that will be much more "open". Moreover, the ACL structures there will be fully relational, so you'll be able to utilize them in queries (e.g. to return only the objects you can access).
  • O2O mapping. The idea is fully described, so please see the link. We have finally decided this is the best option. Another idea was to support WCF serialization right on our entities; this approach has a set of disadvantages - you'd be able to marshal entities only "as is", so e.g. you wouldn't be able to expose the same entity differently for different services or versions of an API. The advantage is that no additional coding is necessary if you want to marshal entities. So in general, O2O looks better here: it provides much better flexibility. The same set of server-side entities and various DTOs for different WCF APIs (or their parts - e.g. for CustomerForm, SalesReportForm and so on) seems an almost ideal solution.
So we're going forward, and v4.2 will be one more very important milestone for us. In fact, it will bring almost all the features we had in v3.9 (the only remaining ones are full-text search and partitioning), and will allow using it with both WCF and ADO.NET Data Services. Likely, .NET RIA Services will also be supported after getting O2O mapping done, so we'll provide a complete spectrum of supported communication APIs.
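To illustrate the O2O idea from the list above: one set of server-side entities, different DTO shapes per service. The types and mappers below are hypothetical, just to show the shape of the approach:

```python
from dataclasses import dataclass

@dataclass
class Customer:            # server-side entity (names are made up)
    id: int
    name: str
    credit_limit: float

@dataclass
class CustomerFormDto:     # rich DTO for an editing form
    id: int
    name: str
    credit_limit: float

@dataclass
class SalesReportDto:      # narrow DTO for a report - no sensitive fields
    name: str

def to_form_dto(c: Customer) -> CustomerFormDto:
    return CustomerFormDto(c.id, c.name, c.credit_limit)

def to_report_dto(c: Customer) -> SalesReportDto:
    return SalesReportDto(c.name)

c = Customer(1, "Dmitri", 1000.0)
assert to_form_dto(c) == CustomerFormDto(1, "Dmitri", 1000.0)
assert to_report_dto(c) == SalesReportDto("Dmitri")
```

The same entity is thus exposed differently per service, which is exactly what plain "as is" serialization cannot do.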

November 12, 2009

A bit surprising: KudoRank #9 out of 10 at Ohloh

A few of us got #9 out of 10 KudoRank @ Ohloh almost immediately, although I have just a single Kudo. I think currently it must be based mainly on our commit count.

So... Are we pretty productive from Ohloh's point of view? Anyway, this seems a bit strange.

DataObjects.Net Manual: links to written parts

Oh... Do you know that writing a manual is, likely, what developers hate the most? ;) So making a Russian developer write a manual in English is, well... hell.

Now, seriously: this work is going on, but slower than I hoped. Currently we've written about 70% of it, and maybe 85% of the most important chapters. Many of the written parts require proofreading - the Runglish there is perfect, but it's far from acceptable English.

Now the good news: I discovered Google Code allows browsing files online "as-is", so I can give you links to the mostly finished chapters:

As you see, everything is written in HTML. But producing PDF from HTML is easy. So finally it will be joined into a single PDF book. Our help files will contain it as well.
And the most important part: if (or, more precisely, when) you find a mistake while reading the text, please:
  • Select it
  • Press Ctrl-Enter
  • Press Enter or click "Send".
P.S. Don't waste your time on "Lazy loading" - there are tons of mistakes there. We must review it internally first.

November 11, 2009

Visual Studio .NET project template icons

We just added them – here they are:


A tiny, but pleasant improvement ;)

Weekly bunch of links

Google’s Go: A New Programming Language That’s Python Meets C++
That's even a bit funny: Go does not support exceptions, generics, or inheritance, and it has pretty strange interfaces. It seems there is no functional stuff as well. But there is GC. See its FAQ for details.

So IMHO it isn't worth studying. At least for now.

Version tolerant serialization
Recommended for anyone who deals with serialization on .NET.

If you pay attention to functional stuff, but write mainly on C#, that's a good article for you.

Recommended, if you're interested in F#.

You say Tomato, I say Pomodore
One more description of the well-known Pomodoro technique (time management).

That's simply what I was looking for ;)

Mathematics of elections (in Russian)
That's about elections in Russia. There is very strong statistical evidence showing that major elections in Russia since 2007 were falsified to increase the vote count for United Russia or its particular members.

If you're interested in details, here is Google translation of this article. The original publication (in .PDF) is here (again, in Russian).

LOL ;) That's to make this post more complete ;)

DataObjects.Net and ORMBattle.NET at Ohloh

We have registered DataObjects.Net at Ohloh, so now you can see some statistics.

The commit graph there is wrong - I wrote earlier that I wiped out a part of its development history (the whole of 2007) during the repository migration; in fact, all those changes should be shown as a single #0 revision, and thus there should be a peak at the first commit. But, for some reason, Ohloh shows this differently - maybe because Mercurial support there is still in beta.

Everything else - e.g. the lines of code counts - looks correct. I checked this for C# ;). Even the JavaScript code shown there is not a mistake - actually, this is jQuery from the ASP.NET MVC sample (so some part of the code there isn't our own - but likely, this is related mainly to JavaScript).

ORMBattle.NET is also registered there. Its statistics seem fully correct, but much less impressive ;)

If you use one of these projects, please spend a few minutes on registration there.

A bit faster way to checkout DataObjects.Net

Earlier I wrote how to check out and build the latest version of DataObjects.Net. But since the repository is pretty large, and Mercurial sends you its copy with the whole history, this can be a long process. So here is a way to make this a bit faster and more controllable:
  • Download the two parts of the .rar archive containing the current repository snapshot;
  • Extract the archive;
  • Open command prompt in "DataObjects.Net" folder there;
  • Type "hg pull" to get the most current changes;
  • Type "hg up -C" to update your working copy to the latest revision.
That's it.

P.S. Just measured: the .hg folder in the repository is 210 Mb (that's what is sent via HTTP), and my .rar is 190 Mb. So Mercurial rules ;) Thus it's reasonable to use this way only if you have problems with the standard one.

November 10, 2009

DataObjects.Net and SQL Azure

Right now we're testing DO4 with SQL Azure, and it already works. We've made a few changes related to compatibility with SQL Azure:
  • SQL Azure does not support MARS, but DO4 utilizes this feature to implement on-demand materialization of query results. Fortunately, we had already implemented adapters for non-MARS DbDataReaders while working on future queries. There is a very similar problem there: if you switch the reader to the next result, the previous one becomes inaccessible; but since you must provide access to all of them, the only option is to cache their content. So we used the same solution here, and now DO4 does not require MARS for SQL Server, but utilizes it when it is available.
  • SQL Azure does not support tables without clustered primary keys. As you likely know, DO creates all tables with a clustered primary key, if this option is supported by the database. But, as we discovered, there was one exception: key generator tables. Obviously, it doesn't matter whether the keys there are clustered or not - these tables are always empty. But I'd prefer them to be clustered as well - just to keep things uniform; moreover, in the case of Azure this is really important. So we fixed this as soon as it was discovered.
  • SQL Azure does not support tables without primary keys. Or, more precisely, it allows creating such tables, but does not allow them to contain data. But our schema upgrade layer produces upgrade sequences that may e.g. drop the primary key from some table (containing data) and create it again, if its structure was changed. So in general, it is incompatible with this requirement. That's what we're fixing right now. But if you use the Recreate or Validate schema upgrade mode, this isn't important.
That's all. After the first and second issues were fixed, all our tests (except the schema upgrade tests - they're waiting for the third issue to be fixed) passed on SQL Azure.
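The buffering idea behind the non-MARS adapter can be sketched like this. Note this is a hypothetical illustration of the technique, not DataObjects.Net's actual adapter code; the class and method names are my own:

```csharp
// Sketch: when MARS is unavailable, buffer the rows of the current result
// set into memory so they remain accessible after the reader advances to
// the next result via NextResult(). Hypothetical code, for illustration only.
using System.Collections.Generic;
using System.Data.Common;

public static class NonMarsReaderAdapter
{
  // Reads all rows of the reader's current result set into memory.
  public static List<object[]> BufferCurrentResult(DbDataReader reader)
  {
    var rows = new List<object[]>();
    while (reader.Read()) {
      var row = new object[reader.FieldCount];
      reader.GetValues(row); // copies all column values of the current row
      rows.Add(row);
    }
    return rows; // safe to enumerate even after reader.NextResult()
  }
}
```

The trade-off is obvious: you pay with memory for the cached rows, but the data stays available to the upper layers exactly as if multiple active result sets were supported.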

So you'll be able to connect to SQL Azure using today's nightly build (or the latest repository snapshot). The connection URL must be the same as for SQL Server:
  • "sqlserver://"
Btw, we added a separate "Azure" version of the SQL Server provider for SQL DOM. So in general, we were ready to handle complex differences in the SQL we produce for it. Fortunately, this wasn't necessary ;)

Poll: ConnectionUrl vs connection string

What would you prefer to use going forward: our own unified but non-standard connection URL, or a standard connection string?

Please vote on the front page.

November 6, 2009

ORMBattle.NET scorecard is updated

Check out the newest scorecard. Most important changes:

1. There are LINQ to SQL tests now - thanks to Igor Tkachev, the author of BLToolkit.

2. DataObjects.Net 4 is back in the scorecard. I think it is fully honest now. Unfortunately, we don't lead in the performance tests now - BLToolkit immediately became the fastest player there, and that was predictable. But:
  • IMHO its implementation of our CUD Multiple tests is at least arguable - in fact, its test code does manually what others do automatically there. But this is very natural for it.
  • I think we will be able to show at least comparable results on the CUD Multiple and Materialization tests by New Year. I know our current result is already very good, taking into account the additional levels of abstraction we have; moreover, the optimizations we apply there are fully automatic, so generally any application will benefit from them. But I know there are ways for us to be at least twice as fast there, and I hope we will find some time to implement them.
3. BLToolkit got its LINQ tests. Btw, I know it initially scored ~ 30%, but over one month Igor has raised that to ~ 40%. That's really impressive, since I know this requires a lot of hard work.

4. There are significant changes in the scorecard, but I'm going to comment on them in a separate post a bit later.

5. Our test suite now compiles and runs on any PC without any third-party tools. All you need is to download it, or check it out and compile it. A more complete instruction is here. So if you'd like to add another ORM tool or develop your own test - you're welcome!

P.S. I suspect 5) is not completely true yet: we must update DO4 there to support Win64.

November 3, 2009

Our coding standards and style: links

Since it is possible to push your own code modifications to DataObjects.Net, it's the right time to give a few more links.
Please read at least the following documents before you start modifying our code:

Pushing the changes to DataObjects.Net source code repository

You must complete 2 steps:

1. Request the Project Committer permission by sending us an e-mail from your Google account.

2. Right-click the folder where your repository clone is stored and select "TortoiseHg" - "Repository Settings". Set the options there as shown on the screenshots below.

Set your own user name. The preferable form is "Name Surname", although nicknames are ok as well.

Specify your Google account name and password by double-clicking the "default" alias. You can also select "Fetch after Pull" there - normally this is a desirable option.

The following dialog will appear after double-clicking the "default" alias. The account name and password must be specified there.

After these actions your hgrc file should look like this:

name = DataObjects.Net
description = DataObjects.Net project repository
contact = [email protected]
allow_archive = bz2
allow_push = *
push_ssl = False
encoding = UTF-8

default =[email protected]/hg/

postpull = fetch

Steps for editing its [web] section aren't described here; this section is necessary only if you are going to expose your repository via the web (e.g. using hg serve).

So another way to configure your repository for pushes is to edit your hgrc file directly - e.g. using Notepad. The file is located right in the <YourRepositoryRoot>\.hg folder.

Note: as I wrote earlier, the revisions you see now are produced by one-way sync from our Subversion repository. So if you push your own commits right now, they will be branched aside from the primary branch we update, and it won't be possible to merge them into the primary branch until we have completely migrated to Mercurial. So if you modify the code, please do not push the changes at all until the migration on our side is completed. This will take just about a week.

Btw, Mercurial allows you to merge your changes locally and push them later - it is a truly decentralized system.
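A typical local session might look like this (a sketch of the workflow, assuming you push only after the central migration is finished):

```shell
# Everything below is local until the final push:
hg commit -m "My local change"     # recorded locally, nothing is sent anywhere
hg pull                            # fetch upstream changesets
hg merge                           # merge them with your local commits
hg commit -m "Merge with upstream" # record the merge locally
hg push                            # publish everything when the time comes
```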

DataObjects.Net and 64-bit Windows

In short, DataObjects.Net is now almost fully compatible with 64-bit Windows. "Almost" - because currently there is no 64-bit Oracle provider for SQL DOM, so for now you can't use DataObjects.Net with Oracle on 64-bit Windows.

Of course, this implies that its installers (they were updated today) and all the scripts there, like Install.bat / Build.bat, now work properly in a 64-bit environment.

ASP.NET MVC Sample (NerdDinner port) for DataObjects.Net

I can announce that the NerdDinner port for DataObjects.Net (an ASP.NET MVC application) is already available in our source code repository, so you can try it right now. To run it, you will need:

  • IIS 6.0 or higher
  • SQL Server 2005 or higher
  • DataObjects.Net v4.1 RC or higher
  • ASP.NET MVC 1. You can find the distribution package in the "Install" folder.
  • DO40-Test database on SQL Server.
To download, build, install and run it:
The expected result is shown on this screenshot:

Some pleasant benefits of hosting the sources at Google Code

Google Code provides a few really convenient tools:

Building DataObjects.Net from source code

This is really easy:
  • Install TortoiseHg - the Mercurial client.
  • Create a folder named "DataObjects.Net" at any place you like. The folder name is also up to you.
  • Clone the repository: right-click on it and select "TortoiseHg" - "Clone a Repository", set Source Path to and click "Clone". Wait while TortoiseHg downloads the repository. Its size is about 300Mb, so this can be a long process, although much depends on your connection speed. I just tested this - in my case it took about 12 minutes. Note that on completion you'll have a full replica containing the complete change history.
  • An alternative to the previous step: open command prompt in "DataObjects.Net" folder and type:
    hg clone .
  • Open command prompt in "DataObjects.Net" folder and type:
    Install\InstallAll.bat -b - to get the project built (in Release configuration) and installed. This command will install the samples, project templates and everything else. See this post for a detailed description of the actions performed by Install.bat.
That's all. The copy you'll get will be almost identical to the one shipped via the installer; the only difference is the absence of .HxS and .Chm help files. Currently you can take them from the installer, but shortly we'll be publishing them separately. They must be copied to the "Help" folder - if this is done, InstallAll.bat will integrate them into the Visual Studio .NET help collection. Or you can simply run Install.bat from the Help folder.

To remove everything, run Install\Uninstall.bat and remove "DataObjects.Net" folder.

November 2, 2009

Migration from Subversion to Mercurial: issues and workarounds

If the repository is complex, forget about the "hg convert + hg transplant" way:
  • Luckily, hg convert converts everything. See this post for details.
  • hg transplant (as well as hg export + hg import) will fail in any of the cases listed below.
1. Mercurial on Windows does not "understand" renames changing only case of file or folder name.

To handle this, I wrote a sequence of commands performing such renames "manually".
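The workaround boils down to a two-step rename through a temporary name; a sketch with hypothetical folder names (the real commands are in the :SafeRename subroutine of the script below):

```shell
# Mercurial on Windows won't record a case-only rename like "WPF -> Wpf"
# directly, so go through an intermediate name that differs in more than case:
hg rename "Samples.WPF" "Samples.WPF.tmp"
hg rename "Samples.WPF.tmp" "Samples.Wpf"
hg commit -m "Safe rename: Samples.WPF to Samples.Wpf"
```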

2. Patches produced by hg export --git aren't always properly processed by hg import. As far as I can judge, the import fails if such a patch contains a file rename plus its subsequent modification. So I used export without the --git option to properly migrate such patches.

It's reasonable to ask why I didn't use this approach for every patch. Well, it does not work at all if there are binary modifications.

So if binary modifications were mixed into such patches, this approach wouldn't work.

3. If you're going to push the produced repository to Google Code, note that it won't accept large binary files. By my impression, it does not accept files larger than 100 MB.

In our case there is just a single file of that size: AdventureWorks.vdb3, which is used in the VistaDB provider tests. So my conversion script simply truncated this file to a few bytes in any revision where it was added or moved.

4. Moreover, it seems Google Code simply rejects any sequence of patches larger than 100 MB.

So I implemented a simple script pushing the changes in batches of any desirable size.

That's all. So there are issues, but they can be handled. The scripts I used to convert our repository are provided below.

hg clone[email protected]/hg/ DataObjects.Net

@echo off
set TargetRepo=DataObjects.Net
set Part1Repo=Xtensive1
set Part2Repo=Xtensive2
set OldAdventureWorksVdb3=Xtensive.Sql\Xtensive.Sql.Dom.Tests\VistaDb\AdventureWorks.vdb3
set AdventureWorksVdb3=Xtensive.Sql\Xtensive.Sql.Tests\VistaDb\AdventureWorks.vdb3

rmdir /S /Q %TargetRepo% >nul 2>nul
rmdir /S /Q Diffs >nul 2>nul
mkdir Diffs >nul 2>nul

echo Converting part 1:
rmdir /S /Q %Part1Repo%
hg convert "D:\Users\Common\Repositories\Xtensive" %Part1Repo% --authors Authors.txt --filemap Filemap-Xtensive.txt --config convert.svn.startrev=8156 --rev 11539
echo  Done.

echo Converting part 2:
rmdir /S /Q %Part2Repo%
hg convert "D:\Users\Common\Repositories\Xtensive" %Part2Repo% --authors Authors.txt --filemap Filemap-Xtensive.txt --branchmap Branchmap-Xtensive.txt --config convert.svn.trunk=Trunk --config convert.svn.branches=Empty --config convert.svn.tags=Tags
echo  Done.

echo Migrating part 1:
if not exist "%TargetRepo%" call "Clone-%TargetRepo%-GoogleCode.bat"
for /L %%i IN (0,1,2580) do (
  if "%%i"=="0" (
    if not exist "Diffs\%Part1Repo%-%%i.done" (
      call :MigrateGit %Part1Repo% %%i --no-commit
      pushd %TargetRepo%
        call :Truncate "%OldAdventureWorksVdb3%"
        hg add "%OldAdventureWorksVdb3%"
        if not "%ERRORLEVEL%"=="0" exit
        hg commit -m "Initial import."
        if not "%ERRORLEVEL%"=="0" exit
      popd
      echo Migrated. > "Diffs\%Part1Repo%-%%i.done"
    )
  ) else if "%%i"=="656" (
    call :MigrateNoGit %Part1Repo% %%i
  ) else if "%%i"=="767" (
    call :MigrateNoGit %Part1Repo% %%i
  ) else if "%%i"=="768" (
    call :MigrateNoGit %Part1Repo% %%i
  ) else if "%%i"=="835" (
    if not exist "Diffs\%Part1Repo%-%%i.done" (
      call :SafeRename "Xtensive.Storage.Samples/Xtensive.Storage.Samples.WPF" "Xtensive.Storage.Samples/Xtensive.Storage.Samples.Wpf"
      echo Migrated. > "Diffs\%Part1Repo%-%%i.done"
    )
  ) else if "%%i"=="836" (
    if not exist "Diffs\%Part1Repo%-%%i.done" (
      call :SafeRename "Xtensive.Storage.Samples/Xtensive.Storage.Samples.Wpf/Xtensive.Storage.Samples.WPF.csproj" "Xtensive.Storage.Samples/Xtensive.Storage.Samples.Wpf/Xtensive.Storage.Samples.Wpf.csproj"
      echo Migrated. > "Diffs\%Part1Repo%-%%i.done"
    )
  ) else if "%%i"=="1085" (
    if not exist "Diffs\%Part1Repo%-%%i.done" (
      call :SafeRename "Release.ProjectTemplate/DataObjects.Net 4.0 project/DataObjects.Net 4.0 project.vstemplate" "Release.ProjectTemplate/DataObjects.Net 4.0 project/DataObjects.Net 4.0 Project.vstemplate"
      echo Migrated. > "Diffs\%Part1Repo%-%%i.done"
    )
  ) else call :MigrateGit %Part1Repo% %%i
)
echo  Done.

echo Migrating part 2:
for /L %%i IN (1,1,5000) do (
  if "%%i"=="4" (
    echo Skipping revision %%i - it moves existing files into Trunk from the outside, but they're already there in %Part1Repo%
  ) else if "%%i"=="181" (
    if not exist "Diffs\%Part2Repo%-%%i.done" (
      call :SafeRename "Common\Config.nikolaev.targets" "Common\Config.Nikolaev.targets"
      echo Migrated. > "Diffs\%Part2Repo%-%%i.done"
    )
  ) else if "%%i"=="246" (
    if not exist "Diffs\%Part2Repo%-%%i.done" (
      call :MigrateGit %Part2Repo% %%i --no-commit
      pushd %TargetRepo%
        call :Truncate "%AdventureWorksVdb3%"
        rem hg add "%AdventureWorksVdb3%"
        rem if not "%ERRORLEVEL%"=="0" exit
        hg commit -m "Merged changes from SqlDom branch"
        rem if not "%ERRORLEVEL%"=="0" exit
      popd
      echo Migrated. > "Diffs\%Part2Repo%-%%i.done"
    )
  ) else call :MigrateGit %Part2Repo% %%i
)
echo  Done.
goto :End

:MigrateGit
set GitOption=--git
goto :Migrate

:MigrateNoGit
set GitOption=
goto :Migrate

:Migrate
echo Migrating %2 revision:
set done=..\Diffs\%1-%2.done
if exist "%1\%done%" (
  echo Skipping %2: already migrated.
  goto :End
)
set diff=..\Diffs\%1-%2.diff
pushd %1
  if not exist "%diff%" (
    echo   Exporting %2...
    hg export %2:%2 -o "%diff%" %GitOption%
    if not "%ERRORLEVEL%"=="0" exit
  )
  call :DetectCommentAndTag %2
popd
pushd %TargetRepo%
  echo   Importing %2...
  hg patch "%diff%" %Comment% --import-branch %3 %4 %5 %6 %7 %8 %9
  if not "%ERRORLEVEL%"=="0" exit
  if not "%Tag%"=="" (
    echo   Tagging %2 as %Tag%
    hg up tip
    hg tag -f -m "Tag created: %Tag%" "%Tag%"
    if not "%ERRORLEVEL%"=="0" exit
  )
  echo Migrated > %done%
popd
echo   Done.
goto :End

:SafeRename
pushd %TargetRepo%
  echo   Renaming: '%~1' to '%~2'.
  hg up
  if not "%ERRORLEVEL%"=="0" exit
  hg rename "%~1" "%~1.tmp"
  if not "%ERRORLEVEL%"=="0" exit
  rmdir /S /Q "%~1" >nul 2>nul
  hg rename "%~1.tmp" "%~2"
  if not "%ERRORLEVEL%"=="0" exit
  rmdir /S /Q "%~1.tmp" >nul 2>nul
  hg commit -m "Safe rename: '%~1' to '%~2'"
  if not "%ERRORLEVEL%"=="0" exit
  hg up
  if not "%ERRORLEVEL%"=="0" exit
popd
goto :End

:Truncate
echo   Truncating %1...
echo Truncated. > %1
goto :End

:DetectCommentAndTag
set Comment=-m "No comment."
set Tag=
for /F "tokens=1,2* delims=: eol=" %%i in ('hg log -r %1') do (
  if "%%i"=="summary" call :ResetComment
  if "%%i"=="tag"     call :SetTag "%%j"
)
goto :End

:ResetComment
set Comment=
goto :End

:SetTag
set Tag=%~1
set Tag=%Tag:~9%
goto :End

:End


@echo off
set PushBatchSize=100

pushd DataObjects.Net
  for /L %%i IN (0,%PushBatchSize%,5000) do (
    echo Pushing changes up to revision %%i...
    hg push -r %%i
  )
popd

DataObjects.Net source code is uploading to Google Code

This great day has come: DataObjects.Net is now a true open source product. From this day on, you can find its complete source code at Google Code.

Check out:
I must say it was really tricky to convert a big Subversion repository to Hg. Tomorrow I'll describe all the issues I faced, but for now it's enough to say you'll see all the project updates at Google Code on roughly a weekly basis.

We haven't yet switched from our internal Subversion repository to Mercurial. We'll use Subversion internally until all the infrastructure sensitive to the repository type (e.g. the installer builder and build servers) is migrated to Mercurial. This process will take about one week (speed here isn't really important). After that, Subversion will die completely.

We run our own Mercurial repository as well - for now it just pulls the updates from Subversion and periodically pushes them to Google Code. But in the near future all our developers will simply sync with it.

Subversion is almost dead. Long live Mercurial!