When designing our application data model, we think some of our entities as an in-memory cache of small pieces of data living in a database (MSSQL, Oracle, Xml files, etc.). This data will be jumping between both worlds.
To keep this "cache" synchronized, we need mechanisms to read data from storage, and update changes back, to perform that we create CRUD operations.
The "U" in CRUD is the subject of this entry, update changes back.
Once we read an entity from the db we must track changes on it, in order to reproduce them in the storage.
To facilitate this, ADO.Net brought features as the DataRow.RowState property. Based on that we wrote code, that typically uses the RowState to perform the corresponding creations, updates, or deletes in the db.
What happens when we have Linq to SQL custom classes?
Linq to SQL DataContext Object Tracking
When entities are obtained from a DataContext object, this DataContext subscribes to change notification events on every entity, and tracks changes automatically, generating an internal change set. When we want to persist changes, we call the DataContext SubmitChanges() method and all the INSERT, UPDATE and DELETE commands are sent to the SQL Server.
This mechanism works great in a connected scenario, but in an N-tier architecture, modifications of your entities can be done in a machine far far away from the DataContext that created them, they can't (or shouldn't! :)) notify this changes thru the wire.
As we explained in previous posts, we need detached entities.
How do Linq to SQL supports this scenario? It allows you to detach and re-attach entities, but now change tracking is your job.
Detaching entities can be achieved in two ways:
- serializing-deserializing entities, deserialized entities are detached by nature.
- setting the DataContext ObjectTrackingEnabled property to false (after that all entities obtained are not tracked).
Notice that the first option would force serialization and deserialization even in a connected Winforms app, so we'll always use the second.
Once your entity comes back modified to the data access layer you can re-attach it to a new DataContext using the Attach() method in every Linq Table, but you must tell if this entity has been modified/created/deleted (or not).
"Those who don’t know their history are doomed to repeat it."
That means we need to track this information on the client, and send it back to the server. We need to build our own... disconnected change tracking!
Disconnected Change Tracking
Of course we are not the first in getting here!, using Linq to SQL in an N-tier scenario is something many people is working on. Actually the same issue is been discussed in the Entity Framework world, because there seems to be no official solution there either (yet).
So I googled a bit to see what others are doing. And most of the solutions can be grouped in this two categories:
Portable Client-DataContext
At an early stage of Linq to SQL Microsoft were planning to ship this with the first release, as explained by Mat Warren, he described it like this:
"The mechanism that does the change tracking on the client is similar to a mini-connectionless DataContext. It is a type that packages up all the objects, lists of objects and graphs that you want to send to the client. It serializes itself and everything you've given it automatically. (It implements IXmlSerializable.) On the client, the same class also manages change-tracking for all the objects that were serialized with it. When serialized again (on the trip back) it serializes both the objects and their change information that it logged. Back on the middle tier, you just make one call to re-attach the whole package to a new DataContext instance and then call SubmitChanges."
Finally, this didn't get to the current release, and many people came up with their own implementation of it, deeper ruminations on this approach can be found in this OakLeaf article.
The best thing here is "entity pureness", entities completely ignore persistence, they can be pure POCOs, without a base class. But be aware that this pureness is not so absolute, entities must implement interfaces an events for property change notification, we just don't note this because the implementation is in auto-generated code.
Also, this means that an entity never knows his own dirty state or original values, to know it you must reference this portable context. This makes rollback changes a complicated task (actually there's no support for rollback to original values in the Linq to SQL DataContext).
This also requires some "packing" and "unpacking" code in the client.
Entities with state flags
The other approach implies including in every entity a state field, and eventually original values. An attractive point here is that this is just what our old well known ADO.Net DataSet has been doing all this time (se DataRow.RowState property). It's a pattern we've seeing for years in change tracking.
We can achieve this using an entity base class (this can be set in a Dbml file, but unfortunately this attribute is not visible in the O/R Designer).
To avoid sticking ourselves to an specific implementation we can use an "entity with state" interface.
The Tercer Planeta's choice
Matthew Hunter is working on the same problem, and asked in a previous post which way are we taking, that's why I decided to write this entry.
When standing against this fork in the road so crucial in our lives, first of all we breath deeply, and thought why we need this, and decide to list the kind of changes we want to track. And we found at that all we need to track is:
- Dirty state in every entity
- Added/Removed/Existent states only in entity collections
Why we don't want Added/Removed/Existent states in single entities? Because that info should be in the business logic, in other words the business logic knows if I'm creating, deleting or updating. We don't want a generic "ApplyChanges" method internally doing an insert, update or delete at will.
This means, if I'm in the ProducEditForm and click on the "save" button, I expect the Form (and not the entity) to tell me which action to perform (create, update or delete).
Based on that we came up with a separation of concerns here.
We chose the path of entities with state, in this path of green hills and crystal clear waters, every entity knows his dirty state thru a boolean flag in their base class. But adds/deletes of children are tracked by a custom collection class (replacing EntitySet).
Optionally, entities can keep a copy of their original values (for change rollback and concurrency checking when there's no timestamp), to allow this we may force ICloneable implementation.
Tracking must be activated explicitly. This doesn't only allow optional "read-only" mode, but also to distinguish new and pre-existent (on the db) entities.
ULinqGen compatibility
To facilitate change tracking the way I described above, a minor surgery is required on the Linq to SQL auto-generated entities... thanks god we are building a code generator custom tool for Dbml!
We are adding some improvements to our ULinqGen tool.
Our code generator must support change tracking but without binding it to our (or any) custom change tracking implementation.
What should we add to auto-generated code?
- Entity base class, Linq to SQL already has this feature, so no extra work would be necessary. the only drawback is that the EntityBaseClass attribute is global to all the entities in a Dbml, and is not visible from the O/R Designer (you can write it on the Dbml file with notepad). We'll probably add capability to specify per-entity base class.
- Property change notification, but as we don't have a listening context, we only need a generic "I'm getting dirty!" instance method call, we are requiring an IEntityState interface with this method. The implementation is let to the base class (if any), it could be an extension method of IEntityState!.
- We want to support optional "original values tracking", to allow this ULinqGen could easily generate an ICloneable implementation based on the entity metadata.
This will keep our code generator tool pretty much "naive" about the specific change tracking system. And read-only mode is still the default.
I'm not showing any code because we're currently discussing much of this, so you'll have to wait for the following ULinqGen releases! ;)