Wednesday, March 12, 2008

Implementing N-Tier Change Tracking with Linq to SQL

 

When designing our application data model, we think some of our entities as an in-memory cache of small pieces of data living in a database (MSSQL, Oracle, Xml files, etc.). This data will be jumping between both worlds.

To keep this "cache" synchronized, we need mechanisms to read data from storage, and update changes back, to perform that we create CRUD operations.

The "U" in CRUD is the subject of this entry, update changes back.

Once we read an entity from the db we must track changes on it, in order to reproduce them in the storage.

To facilitate this, ADO.Net brought features as the DataRow.RowState property. Based on that we wrote code, that typically uses the RowState to perform the corresponding creations, updates, or deletes in the db.

What happens when we have Linq to SQL custom classes?

 

Linq to SQL DataContext Object Tracking

When entities are obtained from a DataContext object, this DataContext subscribes to change notification events on every entity, and tracks changes automatically, generating an internal change set. When we want to persist changes, we call the DataContext SubmitChanges() method and all the INSERT, UPDATE and DELETE commands are sent to the SQL Server.

This mechanism works great in a connected scenario, but in an N-tier architecture, modifications of your entities can be done in a machine far far away from the DataContext that created them, they can't (or shouldn't! :)) notify this changes thru the wire.

As we explained in previous posts, we need detached entities.

How do Linq to SQL supports this scenario? It allows you to detach and re-attach entities, but now change tracking is your job.

Detaching entities can be achieved in two ways:

  • serializing-deserializing entities, deserialized entities are detached by nature.
  • setting the DataContext ObjectTrackingEnabled property to false (after that all entities obtained are not tracked).

Notice that the first option would force serialization and deserialization even in a connected Winforms app,  so we'll always use the second.

Once your entity comes back modified to the data access layer you can re-attach it to a new DataContext using the Attach() method in every Linq Table, but you must tell if this entity has been modified/created/deleted (or not).

"Those who don’t know their history are doomed to repeat it."

That means we need to track this information on the client, and send it back to the server. We need to build our own... disconnected change tracking!

 

Disconnected Change Tracking

Of course we are not the first in getting here!, using Linq to SQL in an N-tier scenario is something many people is working on. Actually the same issue is been discussed in the Entity Framework world, because there seems to be no official solution there either (yet).

So I googled a bit to see what others are doing. And most of the solutions can be grouped in this two categories:

Portable Client-DataContext

At an early stage of Linq to SQL Microsoft were planning to ship this with the first release, as explained by Mat Warren, he described it like this:

"The mechanism that does the change tracking on the client is similar to a mini-connectionless DataContext.  It is a type that packages up all the objects, lists of objects and graphs that you want to send to the client.  It serializes itself and everything you've given it automatically.  (It implements IXmlSerializable.)  On the client, the same class also manages change-tracking for all the objects that were serialized with it.  When serialized again (on the trip back) it serializes both the objects and their change information that it logged.  Back on the middle tier, you just make one call to re-attach the whole package to a new DataContext instance and then call SubmitChanges."

Finally, this didn't get to the current release, and many people came up with their own implementation of it, deeper ruminations on this approach can be found in this OakLeaf article.

The best thing here is "entity pureness", entities completely ignore persistence, they can be pure POCOs, without a base class. But be aware that this pureness is not so absolute, entities must implement interfaces an events for property change notification, we just don't note this because the implementation is in auto-generated code.

Also, this means that an entity never knows his own dirty state or original values, to know it you must reference this portable context. This makes rollback changes a complicated task (actually there's no support for rollback to original values in the Linq to SQL DataContext).

This also requires some "packing" and "unpacking" code in the client.

 

Entities with state flags

The other approach implies including in every entity a state field, and eventually original values. An attractive point here is that this is just what our old well known ADO.Net DataSet has been doing all this time (se DataRow.RowState property). It's a pattern we've seeing for years in change tracking.

We can achieve this using an entity base class (this can be set in a Dbml file, but unfortunately this attribute is not visible in the O/R Designer).

To avoid sticking ourselves to an specific implementation we can use an "entity with state" interface.

 

The Tercer Planeta's choice

Matthew Hunter is working on the same problem, and asked in a previous post which way are we taking, that's why I decided to write this entry.

When standing against this fork in the road so crucial in our lives, first of all we breath deeply, and thought why we need this, and decide to list the kind of changes we want to track. And we found at that all we need to track is:

  • Dirty state in every entity
  • Added/Removed/Existent states only in entity collections

Why we don't want Added/Removed/Existent states in single entities? Because that info should be in the business logic, in other words the business logic knows if I'm creating, deleting or updating. We don't want a generic "ApplyChanges" method internally doing an insert, update or delete at will.

This means, if I'm in the ProducEditForm and click on the "save" button, I expect the Form (and not the entity) to tell me which action to perform (create, update or delete).

Based on that we came up with a separation of concerns here.

We chose the path of entities with state, in this path of green hills and crystal clear waters, every entity knows his dirty state thru a boolean flag in their base class. But adds/deletes of children are tracked by a custom collection class (replacing EntitySet).

Optionally, entities can keep a copy of their original values (for change rollback and concurrency checking when there's no timestamp), to allow this we may force ICloneable implementation.

Tracking must be activated explicitly. This doesn't only allow optional "read-only" mode, but also to distinguish new and pre-existent (on the db) entities.

ULinqGen compatibility

To facilitate change tracking the way I described above, a minor surgery is required on the Linq to SQL auto-generated entities... thanks god we are building a code generator custom tool for Dbml!

We are adding some improvements to our ULinqGen tool.

Our code generator must support change tracking but without binding it to our (or any) custom change tracking implementation.

What should we add to auto-generated code?

  • Entity base class, Linq to SQL already has this feature, so no extra work would be necessary. the only drawback is that the EntityBaseClass attribute is global to all the entities in a Dbml, and is not visible from the O/R Designer (you can write it on the Dbml file with notepad). We'll probably add capability to specify per-entity base class.
  • Property change notification, but as we don't have a listening context, we only need a generic "I'm getting dirty!" instance method call, we are requiring an IEntityState interface with this method. The implementation is let to the base class (if any), it could be an extension method of IEntityState!.
  • We want to support optional "original values tracking", to allow this ULinqGen could easily generate an ICloneable implementation based on the entity metadata.

This will keep our code generator tool pretty much "naive" about the specific change tracking system. And read-only mode is still the default.

I'm not showing any code because we're currently discussing much of this, so you'll have to wait for the following ULinqGen releases! ;)

Tuesday, March 11, 2008

Fitting LINQ to SQL in an Application Framework

I've gathered some of the ideas we've discussing on the LINQ to SQL technologies and how to fit them in our own layered application framework and wrote an article for the Level Extreme .NET online magazine.

The topics covered are:

  • What's good in the current LTS implementation.
  • Why it's not ready to fit - as it is - in a layered application framework.
  • Aspects of LTS we need to override.
  • How to tear apart the LTS package, use what we need and provide the rest.

The article concludes with the need of a custom LINQ to SQL generator ( see our CodePlex ULinqGen Project )  and of a custom way to provide disconnected change tracking on the entities.

Read the full article here.

Saturday, March 1, 2008

How to use the Unplugged LINQ to SQL Generator

As Chris Rock pointed out, the first release of our Code Project custom tool for LINQ to SQL code generation lacks any usage documentation. We'll fix this on the next few days, meanwhile these are the basic instructions:

Install the Custom Tool

After building (VS 2008) the ULinqGen project, you may either register the assembly with regasm, or build and run the setup project. Then you should close the IDE (all VS2008 instances) and reopen it to make the new custom tool available.

Code a generic Data Context class in your project

It may be as simple as this:
    public class MyDataContext : DataContext
{

private static System.Data.Linq.Mapping.MappingSource
mappingSource = new AttributeMappingSource();

public MyDataContext(string connection) :
base(connection, mappingSource)
{
}
}

Associate each DBML to the custom tool

On the project explorer, for each model in which you want to use this tool, set the "Custom Tool Name" property to "ULinqToSQLGenerator"

Get tables from your generic data context and you are ready to LINQ

Assuming your DBML has an Invoice entity with InvoiceItem children and a Customer foreign relation, it may look as this:
    MyDataContext dc = new MyDataContext( myConnectionString);
Table invoices = dc.GetTable<Invoice>();

var results = from i in invoices
where i.IsApproved
&& i.Customer.FirstName.StartsWith("John")
&& i.InvoiceItems.Count > 2
order by i.ApprovalDate
select i;
return results.ToArray();

From here it's up to you!