Over the past few weeks (and the coming ones :), we've been intensively stress-testing the different LINQ-to-SQL features. Lots of prototypes and architecture sketches were made, trying to reach some conclusions about one question: what role (if any at all) do we want to give LINQ in the architecture of our applications?
First of all, we need to answer a basic question: "Do we want to add LINQ to our model?"
As Jose wrote in the previous article, there's no doubt we love LINQ as a set of language extensions combined with a set of providers (LINQ-to-*) that let us write elegant, strongly typed queries over heterogeneous collections without knowing all their specific APIs.
Considering this, it would be great to have a LINQ "Queryable" data access layer. With that idea, we started to analyze LINQ-to-SQL integration in enterprise applications of different scales.
Two-tier (logically separated) WinForms/WPF Application
This is our simple case: a presentation layer designed to be always physically connected to a business layer that retrieves business entities from a local data source.
Even though there's a logical separation between layers, they share a common Application Domain. This is the case of a simple desktop application that directly accesses a local (or remote) database.
Unit of Work
When we retrieve entities from our database we use a DataContext, which follows the Unit-of-Work pattern.
It stays alive during a single business operation, handling the SQL connection and tracking changes on all the entities associated with it.
Every time we insert, modify or delete entities through a DataContext, it updates an in-memory ChangeSet holding copies of the original and modified values of these entities.
Finally, when we have finished working with them, we tell the DataContext to submit these changes, and all the necessary commands are sent to the database. Then it's ready to be disposed.
This is absolutely great in this connected environment. We query our DataContext, bind the IQueryable result to a BindingSource or a data-bound grid, edit, insert, or delete records, and when we are ready, all we need to do is call MyDataContext.SubmitChanges();
And this won't only save every change we made to the entities; it will also handle foreign keys, concurrency checks and transactions.
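As a rough illustration of this connected unit of work, here is a minimal sketch. The Product entity, its "Products" table and the connection string are hypothetical names made up for the example; the DataContext, Table<T>, ChangeSet and attribute types are the ones in System.Data.Linq.

```csharp
using System;
using System.Data.Linq;
using System.Data.Linq.Mapping;
using System.Linq;

// Hypothetical entity mapped to a hypothetical "Products" table.
[Table(Name = "Products")]
public class Product
{
    [Column(IsPrimaryKey = true, IsDbGenerated = true)]
    public int Id { get; set; }

    [Column]
    public string Name { get; set; }

    [Column]
    public decimal Price { get; set; }
}

public static class ConnectedExample
{
    public static void RaiseCheapPrices(string connectionString)
    {
        // One DataContext per business operation: this is the unit of work.
        using (var db = new DataContext(connectionString))
        {
            Table<Product> products = db.GetTable<Product>();

            // Entities materialized by this query are tracked by the context.
            foreach (Product product in products.Where(p => p.Price < 10m))
            {
                product.Price += 1m;
            }

            // The pending changes can be inspected before they are sent.
            ChangeSet pending = db.GetChangeSet();
            Console.WriteLine("Pending updates: " + pending.Updates.Count);

            // Sends the necessary commands to the database.
            db.SubmitChanges();
        }
    }
}
```

In a real WinForms/WPF application the context would more likely be a designer-generated subclass (the MyDataContext mentioned above) and the query result would be bound to the UI instead of looped over.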
This also means that the entities belong to their DataContext during their whole lifetime. This wiring enables features such as deferred loading (of properties, associated entities and child collections) and db-generated fields.
For a lot of reasons, this seems to be the main scenario for which the current LINQ-to-SQL implementation has been designed.
N-tier (physically separated) Application
Let's try to scale the previous approach to an N-tier application. In this case our Business Layer is exposed through a Service Layer, consumed (via WCF) by a physically remote presentation layer (WinForms/WPF client, ASP.NET website, etc.)
How does LINQ-to-SQL support this scenario?
Initially, we could say that LINQ-to-SQL will remain behind the Business Layer, and won't cross the WCF barrier.
Off-topic: there are a few adventurous developers (here is a project in CodePlex) implementing serialization of Expression Trees. This allows querying a remote collection exposed through a WCF Service: the query (represented as an Expression Tree) is serialized, deserialized on the server, and the results are returned to the client.
But there's something we surely want to move across these layers: entities.
The auto-generated LINQ-to-SQL entities can be decorated (by selecting unidirectional serialization in the O/R Designer) with [DataContract] and [DataMember] attributes, allowing them to travel as parameters or results of a WCF Service Operation.
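To give an idea of what such a decorated entity looks like, here is a hand-written sketch. It is not the designer's actual output (which is longer and uses backing fields); the Customer entity, its columns and the ICustomerService contract are hypothetical names used only for illustration.

```csharp
using System.Data.Linq.Mapping;
using System.Runtime.Serialization;
using System.ServiceModel;

// Hypothetical entity: mapped for LINQ-to-SQL and serializable by WCF.
[Table(Name = "Customers")]
[DataContract]
public class Customer
{
    [Column(IsPrimaryKey = true)]
    [DataMember(Order = 1)]
    public int Id { get; set; }

    [Column]
    [DataMember(Order = 2)]
    public string Name { get; set; }
}

// Hypothetical service contract: entities travel as results and parameters.
[ServiceContract]
public interface ICustomerService
{
    [OperationContract]
    Customer GetCustomer(int id);

    [OperationContract]
    void UpdateCustomer(Customer customer);
}
```

Whatever instance reaches the client is a deserialized copy, so it has no link to the DataContext that loaded the original on the server.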
As expected, this breaks the connected state of these entities, losing all the cool features we had in the previous scenario (change tracking, deferred loading, etc.)
That isn't actually bad news, because having those features would encourage data-centric practices, which run against the SOA model that WCF is based on.
If we look at Fowler's description of the Lazy Load pattern, "An object that doesn't contain all of the data you need but knows how to get it.", we can note that its final words ("knows how to get it") are in deep contradiction with the Persistence Ignorance pattern that we are trying to follow.
One of the reasons for this is that LINQ, and all the new language extensions in C# 3.0 and VB9, ease the handling of POCO entities, a principle that LINQ-to-SQL and the new Entity Framework seem to take advantage of.
In this scenario, having our entities detached from the DataContext is something we want. And by-design, entities get detached when serialized.
When these entities (or collections of entities) come back modified to the Business Layer, they are detached; we just need to attach them to a new DataContext and submit their changes.
As change tracking has been broken, when we re-attach an entity to a DataContext we need to tell it how this entity must be updated, specifying:
- Original and current copies of the entities
or simply:
- Only the current copies, treated as entirely modified
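The two options above correspond to two overloads of Table<T>.Attach. Here is a minimal sketch of both; the Customer entity, its Version column and the connection string are hypothetical. One assumption worth stating loudly: attaching with only the current copy requires the entity to declare a version member (as done here) or to have its update checks disabled, otherwise LINQ-to-SQL rejects the attach.

```csharp
using System.Data.Linq;
using System.Data.Linq.Mapping;

// Hypothetical entity with a version (timestamp) column.
[Table(Name = "Customers")]
public class Customer
{
    [Column(IsPrimaryKey = true)]
    public int Id { get; set; }

    [Column]
    public string Name { get; set; }

    // Assumed rowversion/timestamp column used for concurrency checks.
    [Column(IsVersion = true, IsDbGenerated = true)]
    public Binary Version { get; set; }
}

public static class CustomerUpdates
{
    // Option 1: the client sends back both the original and current copies.
    public static void UpdateWithOriginal(
        string connectionString, Customer current, Customer original)
    {
        using (var db = new DataContext(connectionString))
        {
            // The context compares both copies to find what changed.
            db.GetTable<Customer>().Attach(current, original);
            db.SubmitChanges();
        }
    }

    // Option 2: only the current copy, treated as entirely modified.
    public static void UpdateAsModified(
        string connectionString, Customer current)
    {
        using (var db = new DataContext(connectionString))
        {
            db.GetTable<Customer>().Attach(current, true);
            db.SubmitChanges();
        }
    }
}
```

The first option keeps the UPDATE limited to the columns that actually changed, at the price of carrying two copies over the wire; the second is cheaper to transport but writes every column.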
In a few words, the ability to re-attach entities adds basic N-tier support to LINQ-to-SQL, at the cost of all the magic features (change tracking, deferred loading, etc.) that the connected state gave us.
One-size-fits-all Solution
The previous scenarios seem to be well handled by the current LINQ implementation. But an immediate conclusion we reached while studying them is that they imply different logic behind the business layer.
The connected nature of the first type of application certainly doesn't scale to the second: keeping a DataContext alive through the whole lifetime of an entity is unacceptable in an enterprise application model.
Besides that, it would be a bad choice in an ASP.NET website to keep the DataContext (with its ChangeSet) alive in memory between postbacks.
We want the two tiers in the Two-tier application to be not only logically separated, but "physically separable". That would improve scalability (allowing reuse of the business layer in an N-tier application), and force a better delegation of responsibilities between business and presentation layers.
Disclaimer: Forcing a "one size fits all", "N-tier ready" solution implies some over-engineering for people building simple desktop RAD applications (like in the first scenario), but our main concern is Enterprise Solutions. For the simpler, always-connected Two-tier desktop app, a possible piece of advice could be: use LINQ-to-SQL "as it is".
All this took us to the significant choice of allowing only detached entities outside the business layer.
That implies destroying the DataContext after the entities are retrieved, and re-attaching them to a new DataContext at the moment of submitting changes. Many people got there, and found themselves struggling with the "Attach only when detached" nightmare. Rick Strahl is one of them (or should I say, us).
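The retrieval half of that choice could look like the sketch below. It is only an illustration: the Product entity, the ProductQueries class and the connection string are hypothetical, and turning off ObjectTrackingEnabled is our own assumption about a reasonable way to get read-only, untracked results, not something the approach requires.

```csharp
using System.Collections.Generic;
using System.Data.Linq;
using System.Data.Linq.Mapping;
using System.Linq;

// Hypothetical entity mapped to a hypothetical "Products" table.
[Table(Name = "Products")]
public class Product
{
    [Column(IsPrimaryKey = true)]
    public int Id { get; set; }

    [Column]
    public string Name { get; set; }

    [Column]
    public decimal Price { get; set; }
}

public static class ProductQueries
{
    public static List<Product> GetProductsUpTo(
        string connectionString, decimal maxPrice)
    {
        using (var db = new DataContext(connectionString))
        {
            // Must be set before the first query; the context becomes
            // read-only and deferred loading is switched off as well.
            db.ObjectTrackingEnabled = false;

            // ToList() runs the query while the context is still alive.
            return db.GetTable<Product>()
                     .Where(p => p.Price <= maxPrice)
                     .ToList();
        }
        // The context is disposed here; callers only see plain objects.
    }
}
```

Materializing the results inside the using block matters: returning the IQueryable itself would defer execution until after the context has been disposed.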
Attach only when detached
As explained in Dinesh Kulkarni's blog, detaching and attaching entities was designed for N-tier scenarios only; that's why attaching is allowed only for entities that have been previously serialized and deserialized.
That works great in N-tier, but serializing and deserializing makes no sense within a common application domain.
A workaround that many people have found for this is to bluntly cut all the wires between entities and their DataContext. That can be accomplished by resetting some event handlers, and replacing the deferred-loading-aware types (EntitySet and EntityRef) with a simpler array or List<T>.
Fortunately for us, there's a feature in LINQ-to-SQL that solves these issues in a more elegant way!
LINQ-to-SQL POCO support
Even though the O/R Designer and SqlMetal provide automatic generation of wrapper classes over our data entities, it's perfectly valid to use our own POCOs decorated with the appropriate attributes.
The POCO movement (nicely explained here) is based on the Persistence Ignorance pattern, which enforces delegation of responsibilities; in other words, we don't want our entities to know anything about how they are persisted. The default auto-generated entities pretty much respect this principle: they don't know anything about persistence (part of this info lives in attribute decorations or mapping files).
But they do participate in their persistence mechanism! They are closely associated with a DataContext, not only notifying changes, but also loading deferred values or associated entities from it.
This behavior is mainly achieved through change notification events (declared in the INotifyPropertyChanging and INotifyPropertyChanged interfaces), and the new types EntityRef and EntitySet.
These two types are used in auto-generated entities to load (lazily or not) properties created from foreign keys. EntityRef is used for a single reference (as in Product.Manufacturer.Name), and EntitySet for child collections (as in Manufacturer.Products[2].Price).
They not only contain associated entities; they also hold the logic for deferred loading and for notifying the DataContext about modifications in references and child collections, which enables the change tracking feature.
As we read in LINQ-to-SQL blogs, it's possible to replace these types with simpler, disconnected versions: EntityRef can be replaced by a direct reference, and EntitySet by any ICollection<T>.
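Following that claim, a pair of disconnected entities might be sketched as below. This is an untested illustration of the idea, not the code of our prototypes: the Product and Manufacturer classes, their columns and the key names are hypothetical, and the [Association] mapping on a plain reference and a List<T> relies on the replacement described above.

```csharp
using System.Collections.Generic;
using System.Data.Linq.Mapping;

// Hypothetical POCO: no EntityRef, no change notification events.
[Table(Name = "Products")]
public class Product
{
    [Column(IsPrimaryKey = true)]
    public int Id { get; set; }

    [Column]
    public decimal Price { get; set; }

    [Column]
    public int ManufacturerId { get; set; }

    // A direct reference instead of EntityRef<Manufacturer>.
    [Association(ThisKey = "ManufacturerId", OtherKey = "Id",
                 IsForeignKey = true)]
    public Manufacturer Manufacturer { get; set; }
}

// Hypothetical POCO: a plain list instead of EntitySet<Product>.
[Table(Name = "Manufacturers")]
public class Manufacturer
{
    public Manufacturer()
    {
        Products = new List<Product>();
    }

    [Column(IsPrimaryKey = true)]
    public int Id { get; set; }

    [Column]
    public string Name { get; set; }

    // Any ICollection<T> instead of EntitySet<Product>.
    [Association(ThisKey = "Id", OtherKey = "ManufacturerId")]
    public List<Product> Products { get; set; }
}
```

The trade-off is the one discussed above: without EntityRef and EntitySet these members are no longer loaded on demand and no longer notify a DataContext, so related data has to be requested explicitly when the entities are retrieved.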
Putting the pieces together
With these ideas in mind, we started building prototypes, messing with the code auto-generated by the O/R Designer and SqlMetal.
We put together the pieces we want, and replace or discard the others.
These days we're starting to see the light at the end of the tunnel, with the custom tools and code we have started to write.
More on this on following posts...