March 16, 2010

Can ORM make your application faster? Part 3: CRUD sequence batching

This is the 3rd part of sequence of posts describing optimization techniques used by ORM tools.

Optimization techniques

2. CRUD sequence batching

This is one of the most widely adopted optimizations, that are automatically employed by ORM tools. Instead of sending each INSERTUPDATE or DELETE command separately, they're sent to database server in batches (normally - by 20...30 commands in each batch).

Batches are either regular SQL commands, where individual statements are separated special separator character (normally ";"), or regular SQL command objects "glued" together with provider-specific API. 

As is was mentioned, the main advantage of this optimization is that it happens completely automatically, so generally, you shouldn't care about this at all to get all the benefits of its presence for free.

This optimization is currently available in most of leading ORM products (although e.g. Entity Framework and LINQ to SQL do not implement this).

Implementing it using plain SQL is pretty complex, if you don't know the exact sequence of CRUD commands that are planned to be executed further (pretty frequent case - normally there some "if"s), and thus you can't build such a batch by a simple template. I'll list just few complexities here:
  • You don't know the size of batch. But there are certain limits you must enforce, e.g. query parameter count limit and query text size limit.
  • If parameters are used in regular SQL batches, you must ensure they're uniquely named there.
  • It's resonable to send neither short nor large batches. Short batches increase chattiness, large batches are executed much later then they could, that's generally bad as well. So you must care about splitting the CRUD sequence to such "average" batches.
  • Finally, you must know the moment until which you may delay execution of the current batch. Normally the batch must be flushed, if it might affect on result of operation, that is planned to be executed now (e.g. query). ORM precisely knows this moment, because all queries go "through" it, and thus it is able to make a decision. But to achieve the same in regular application, you must have a similar single-point query API (ideally, with analyzer ;) ).
Return to the first post of this set (introduction and TOC is there).