Friday, February 29, 2008

Does LINQ to SQL replace the whole DAL?

Probably the balanced answer should be "it depends on the application you are building", but I usually go for a straight "no way" - there's always time for subtleties.

At least in any SQL Server based application that is not trivial (in terms of data model and user concurrency), there's little doubt about the key role that stored procedures should still play. Their functionality may now be complemented with LINQ to SQL direct operations in a mix whose exact proportion depends on the application context.

This may range from air-tight corporate DB environments (no direct access allowed) to more pragmatic scenarios in which a big chunk of the simple data operations is easily implemented as LINQ queries. In almost any case, some stored procedures are still required as the best way of solving complex queries and handling complex or critical updates.

Having established this, there's still a question about the necessity of a "duplicate" data access layer, considering that LINQ to SQL does allow stored procedure calls. In my opinion, several reasons make a separate, full-featured DAL an important component of your application:

  • Full control over the SP invocation: your generic data access code may be customized with the desired exception handling, parameter examination and completion, and any fine-grain control your application requires.
  • Ability to return or fill datasets: in several situations (e.g. reporting) datasets are still a simple and flexible way to carry a list of tabular data thru your application tiers.
  • If you prefer to return custom objects (or you are a TDD / mockable objects fan), you may extend the DAL to materialize a POCO instance or collection from a Data Reader obtained thru a SP call.

The last point - returning a POCO instance or collection from your custom DAL, instead of just calling the SP thru LINQ to SQL - is motivated by the need to establish a clear rule of use:

  • Use the LTS data access for direct LINQ queries and updates
  • Use the custom DAL for all SP calls.

How do you implement the materialization of custom objects from a DataReader? You may write your own code - examining attributes or thru reflection - or you may just delegate this task to the Translate method of an LTS DataContext, inside your own DAL. Assuming the necessary LTS attributes are present in the Customer class, this code works:

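    DbDataReader dr = cmd.ExecuteReader();
    // Translate is lazily evaluated: enumerate the result while the reader is still open
    IEnumerable<Customer> customers = myDataContext.Translate<Customer>(dr);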

NOTES:


  • For a reason unknown to me, the provided DC should hold a valid database connection, even when data is pulled from the already open data reader.


  • And if you are thinking of this as a way to work around LINQ to SQL's ties to SQL Server (after all, a DbDataReader may be obtained from many data sources), forget it: the Translate method will throw an exception when the reader is not a SqlDataReader. Nice try!
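To make the "custom DAL for all SP calls" rule more concrete, here is a minimal sketch of a DAL method that runs a stored procedure and hands the open reader to Translate. The Customer mapping, the dbo.GetCustomersByCountry procedure and its @Country parameter are hypothetical names chosen for illustration; the DataContext is built on the DAL's own connection, in line with the first note above.

```csharp
using System.Collections.Generic;
using System.Data;
using System.Data.Linq;
using System.Data.Linq.Mapping;
using System.Data.SqlClient;
using System.Linq;

// Hypothetical POCO: table and column names are assumptions.
[Table(Name = "Customers")]
public class Customer
{
    [Column(IsPrimaryKey = true)] public int CustomerId { get; set; }
    [Column] public string Name { get; set; }
    [Column] public string Country { get; set; }
}

public static class CustomerDal
{
    // Calls a (hypothetical) stored procedure and materializes its rows as POCOs.
    public static List<Customer> GetByCountry(string connectionString, string country)
    {
        using (var connection = new SqlConnection(connectionString))
        using (var cmd = new SqlCommand("dbo.GetCustomersByCountry", connection))
        using (var dc = new DataContext(connection)) // the DC shares the DAL's connection
        {
            cmd.CommandType = CommandType.StoredProcedure;
            cmd.Parameters.AddWithValue("@Country", country);

            connection.Open();
            using (SqlDataReader dr = cmd.ExecuteReader())
            {
                // Translate is lazy, so the list is built before the reader is disposed.
                return dc.Translate<Customer>(dr).ToList();
            }
        }
    }
}
```

Exception handling and parameter checks are left out on purpose; those are exactly the pieces a full-featured DAL would add around this call.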



Side topic: for an interesting discussion on the typed dataset vs LINQ to SQL objects, you may read this entry in Aaron's Technology Musings blog.

Wednesday, February 27, 2008

Unplugged LINQ to SQL Generator

As several smart guys already pointed out, the LINQ to SQL (LTS) package as it is doesn't always fit well in an N-Tier architecture. On this topic you may read Rick Strahl's early post on "LINQ to SQL and attaching Entities" and Nick Kellet's review on Planet Moss: LINQ To SQL ≠ N-Tier Architecture?

The RTM version offers a way of detaching thru serialization and re-attaching thru the Attach API for the frequent scenario of Web Services / WCF distributed apps. It even offers some capability of serving "non tracked" objects thru the ObjectTrackingEnabled data context property.

But in my opinion this is not good enough for building an "all-terrain" business layer, serving disconnected DTOs which may be either light-weight POCOs or more complex entities with stand-alone change tracking, in the same way the good old dataset allows. A discussion on this topic can be found in Benjamin Eidelman's previous entry.

That said, it's also palpable that many of the tools in the LTS bag are just too good to be ignored. So maybe the key for a successful adoption of this technology is to have a clear understanding of its pieces and combine them in the way that best serves your particular scenario.

With that in mind, we started by manually coding entities to replace the ones generated by the LTS custom tool (MSLinqToSqlGenerator), and we came across a couple of interesting conclusions:

  • The Data Context class does not need to have strongly typed table properties for making the queries: a generic data context may be used, and invoking GetTable<yourEntity>() provides the virtual collections on which LINQ queries are based. This allows us to break the awkward coupling of DC and entities in a single (big) DBML file, and take a more flexible approach.

  • Almost any class decorated with the right LTS attributes can be used as a query entity, even pure POCOs (Plain Old CLR Objects). The default classes, closely tied to the DC by notify events and EntitySet collections, are not mandatory for using the power of the LINQ provider in LTS.
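The two conclusions above can be sketched in a few lines: a plain class carrying only mapping attributes, queried thru a generic DataContext and GetTable. The Product entity, its table and column names and the ProductQueries helper are assumptions made for this example; turning off ObjectTrackingEnabled is optional and only reflects that the results are meant to leave the layer as read-only objects.

```csharp
using System.Collections.Generic;
using System.Data.Linq;
using System.Data.Linq.Mapping;
using System.Linq;

// Hypothetical POCO: no base class, no EntitySet/EntityRef, no change notification.
[Table(Name = "Products")]
public class Product
{
    [Column(IsPrimaryKey = true)] public int ProductId { get; set; }
    [Column] public string Name { get; set; }
    [Column] public decimal Price { get; set; }
}

public static class ProductQueries
{
    public static List<Product> GetCheapProducts(string connectionString, decimal maxPrice)
    {
        // A generic DataContext: no strongly typed table properties, no DBML-generated subclass.
        using (var dc = new DataContext(connectionString))
        {
            dc.ObjectTrackingEnabled = false; // results are served as non-tracked objects

            Table<Product> products = dc.GetTable<Product>();
            return products.Where(p => p.Price <= maxPrice)
                           .OrderBy(p => p.Name)
                           .ToList(); // the query runs here, while the DC is alive
        }
    }
}
```

Because the entity type is not referenced by the DataContext class, entities and data access code can live in different assemblies.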

At that point we decided we wanted to develop our own custom tool for translating the DBML built by the O/R designer into our own entities code. As we intended our classes to be detached from the DC by design, the name "Unplugged LINQ to SQL Generator" was a unanimous decision.

So we are walking that path. The first release of our custom tool, a proof of concept with limited capabilities, was published today as a CodePlex project: http://www.codeplex.com/ULinqGen .

Stay tuned! More to come soon...

Wednesday, February 13, 2008

First thoughts on Designing a LINQ-enabled Application Framework

 

In the previous weeks (and the following ones :), we've been intensively stressing the different LINQ-to-SQL features. Lots of prototypes and architecture sketches were made, trying to reach some conclusions about one question: what role (if any at all) do we want to give to LINQ in the architecture of our applications?

First of all, we need to answer a basic question: "Do we want to add LINQ to our model?"

As Jose wrote in the previous article, there's no doubt we love LINQ as a set of language extensions, combined with a set of providers (LINQ-to-*), allowing us to write elegant, strongly typed queries over heterogeneous collections without knowing all their specific APIs.

Considering this, it would be great to have a LINQ "Queryable" data access layer. With that idea, we started to analyze LINQ-to-SQL integration in enterprise applications of different scale.

 

Two-tier (logically separated) WinForms/WPF Application

This is our simple case: a presentation layer designed to be always physically connected to a business layer retrieving business entities from a local Data Source.

Even though there's a logical separation between layers, they share a common Application Domain. This is the case of a simple desktop application directly accessing a local (or remote) database.

Unit of Work

When we retrieve entities from our database we use a DataContext, which follows the Unit-of-Work pattern.

It stays alive during a single business operation, handling the SQL Connection, and tracking changes on all the entities associated to it.

Every time we insert, modify or delete entities thru a DataContext, it updates an in-memory ChangeSet, with copies of the original and modified values of these entities.

Finally, when we have finished working with them, we tell the DataContext to submit these changes, and all the necessary commands are sent to the database. Then it's ready to be disposed.

The DataContext follows the Unit-of-Work pattern.

This is absolutely great in this connected environment. We query our DataContext, bind the IQueryable result to a BindingSource or a data-bound grid, edit, insert or delete records, and when we are ready, all we need to do is call MyDataContext.SubmitChanges();

And this won't only save every change we made to the entities: it will also handle foreign keys, concurrency checks and transactions.

This also means that the entities belong to their DataContext during their whole lifetime. This wiring enables features such as deferred loading (of properties, associated entities and child collections) and db-generated fields.

For a lot of reasons, this seems to be the main scenario for which the current LINQ-to-SQL implementation has been designed.
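As a rough illustration of this connected scenario, the sketch below runs one business operation inside one DataContext: entities are loaded, changed and inserted in memory, the pending ChangeSet is inspected, and a single SubmitChanges call sends everything to the database. The Product mapping and the ApplyDiscount operation are hypothetical; only the DataContext, Table, ChangeSet and SubmitChanges members come from System.Data.Linq.

```csharp
using System.Data.Linq;
using System.Data.Linq.Mapping;
using System.Linq;

// Hypothetical entity: table and column names are assumptions.
[Table(Name = "Products")]
public class Product
{
    [Column(IsPrimaryKey = true, IsDbGenerated = true)] public int ProductId { get; set; }
    [Column] public string Name { get; set; }
    [Column] public decimal Price { get; set; }
}

public static class PriceOperations
{
    // One business operation = one DataContext = one unit of work.
    public static int ApplyDiscount(string connectionString, decimal maxPrice, decimal factor)
    {
        using (var dc = new DataContext(connectionString))
        {
            Table<Product> products = dc.GetTable<Product>();

            // Entities loaded here are tracked by this DataContext.
            foreach (Product p in products.Where(x => x.Price <= maxPrice).ToList())
                p.Price = p.Price * factor; // in-memory change only

            products.InsertOnSubmit(new Product { Name = "New item", Price = 1m });

            // Pending work can be inspected before anything is sent to the database.
            ChangeSet pending = dc.GetChangeSet();
            int pendingCount = pending.Inserts.Count + pending.Updates.Count + pending.Deletes.Count;

            dc.SubmitChanges(); // inserts and updates are sent together
            return pendingCount;
        }
    }
}
```

In a WinForms app the same DataContext would typically stay alive behind the bound grid instead of inside a single method, which is precisely the coupling discussed in the following sections.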

 

N-tier (physically separated) Application

Let's try to scale the previous approach to an N-tier application. In this case our Business Layer is exposed thru a Service Layer (WCF), consumed by a physically remote presentation layer (WinForms/WPF client, ASP.NET website, etc.).

How does LINQ-to-SQL support this scenario?

Initially, we could say that LINQ-to-SQL will remain behind the Business Layer, and won't cross the WCF barrier.

Off-topic: there are a few adventurous developers (here is a project in CodePlex) implementing serialization of Expression Trees. This allows querying a remote collection exposed thru a WCF Service: the query (represented as an Expression Tree) is serialized, deserialized on the server, and the results are returned to the client.

 

But there's something we surely want to move across these layers: entities.

 

The auto-generated LINQ-to-SQL entities can be decorated (by selecting unidirectional serialization in the O/R Designer) with [DataContract] and [DataMember] attributes, allowing them to travel as parameters or results of a WCF Service Operation.

As expected, this breaks the connected state of these entities, losing all the cool features we had in the previous scenario (change tracking, deferred loading, etc.).

This isn't actually very bad news, because having those features would encourage data-centric practices, as opposed to the SOA model that WCF is based on.

If we look at Fowler's Lazy Load pattern description, "An object that doesn't contain all of the data you need but knows how to get it.", we can note that its closing words ("knows how to get it") are in deep contradiction with the Persistence Ignorance pattern that we are trying to follow.

One of the reasons for this is that LINQ, and all the new language extensions in C# 3.0 and VB9, ease the handling of POCO entities, a principle that LINQ-to-SQL and the new Entity Framework seem to take advantage of.

In this scenario, having our entities detached from the DataContext is something we want. And, by design, entities get detached when serialized.
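The detaching effect of serialization can be observed without any database. The sketch below pushes a [DataContract] object thru a DataContractSerializer round trip, roughly what happens when it crosses a WCF boundary; the ProductDto class and the WireSimulator helper are hypothetical names used only for this illustration.

```csharp
using System.IO;
using System.Runtime.Serialization;

// Hypothetical entity decorated the way the O/R Designer does it
// when unidirectional serialization is selected.
[DataContract]
public class ProductDto
{
    [DataMember(Order = 1)] public int ProductId { get; set; }
    [DataMember(Order = 2)] public string Name { get; set; }
    [DataMember(Order = 3)] public decimal Price { get; set; }
}

public static class WireSimulator
{
    // Serializes and deserializes an object, returning a new, unconnected instance.
    public static T RoundTrip<T>(T entity)
    {
        var serializer = new DataContractSerializer(typeof(T));
        using (var stream = new MemoryStream())
        {
            serializer.WriteObject(stream, entity);
            stream.Position = 0;
            return (T)serializer.ReadObject(stream);
        }
    }
}
```

The returned object carries the same data but is a different instance, so whatever wiring the original had to a DataContext stays behind on the server side.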

When these entities (or collections of entities) return modified to the Business Layer, they are detached; we just need to attach them to a new DataContext and submit their changes.

As change tracking has been broken, when we re-attach an entity to a DataContext we need to tell it how this entity must be updated, specifying:

  • Original and current copies of the entities

or simply:

  • Only the current copies, treating all of them as modified
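Both options map to overloads of Table<T>.Attach. The sketch below assumes a hypothetical Product entity with a rowversion column, because attaching "as modified" without the original values is only accepted when the entity declares a version member or has no update check policy; the ProductService class and its method names are ours.

```csharp
using System.Data.Linq;
using System.Data.Linq.Mapping;

// Hypothetical entity: the RowVersion column is assumed to be a SQL Server rowversion.
[Table(Name = "Products")]
public class Product
{
    [Column(IsPrimaryKey = true)] public int ProductId { get; set; }
    [Column] public string Name { get; set; }
    [Column] public decimal Price { get; set; }
    [Column(IsVersion = true, IsDbGenerated = true)] public Binary RowVersion { get; set; }
}

public static class ProductService
{
    // Option 1: original and current copies; only the differences are updated.
    public static void Save(string connectionString, Product current, Product original)
    {
        using (var dc = new DataContext(connectionString))
        {
            dc.GetTable<Product>().Attach(current, original);
            dc.SubmitChanges();
        }
    }

    // Option 2: only the current copy, treated as fully modified.
    public static void SaveAsModified(string connectionString, Product current)
    {
        using (var dc = new DataContext(connectionString))
        {
            dc.GetTable<Product>().Attach(current, true);
            dc.SubmitChanges(); // the version column drives the concurrency check
        }
    }
}
```

The first option needs the client to keep (or send back) the original copy; the second trades that for an update of every column.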

 

In short, the ability to re-attach entities adds basic N-tier support to LINQ-to-SQL, while cutting off all the magic features (change tracking, deferred loading, etc.) that the connected state gave us.

 

One size fits all Solution

The previous scenarios seem to be well handled by the current LINQ implementation. But an immediate conclusion we reached while studying them is that they imply a different logic behind the business layer.

The connected nature of the first type of application certainly doesn't scale to the second: having a DataContext alive thru the whole lifetime of an entity is unacceptable in an enterprise application model.

Besides that, it would be a bad choice in an ASP.NET website to keep the DataContext (with its ChangeSet) alive in memory between postbacks.

We want the two tiers in the two-tier application to be not only logically separated, but "physically separable". That would improve scalability (allowing reuse of the business layer in an N-tier application), and force a better delegation of responsibilities between business and presentation layers.

Disclaimer: forcing a "one size fits all", "N-tier ready" solution implies some over-engineering for people building simple desktop RAD applications (like in the first scenario), but our main focus is on enterprise solutions.
For this simpler, always-connected two-tier desktop app, a possible piece of advice could be: use LINQ-to-SQL "as it is".

 

All this took us to the significant choice of allowing only detached entities outside the business layer.

That implies destroying the DataContext after the entities are retrieved, and re-attaching them to a new DataContext at the moment of submitting changes. Many people got there, and found themselves struggling with the "Attach only when detached" nightmare. Rick Strahl is one of them (or should I say, us).

Attach only when detached

As explained in Dinesh Kulkarni's blog, detaching and attaching of entities was designed for N-tier scenarios only. That's why attaching is allowed only for entities that have been previously serialized and deserialized.

That works great in N-tier, but serializing and deserializing makes no sense within a common application domain.

A workaround that many people have found for this is to roughly cut all the wires between entities and their DataContext. That can be accomplished by resetting some event handlers, and replacing some deferred-loading-aware types (EntitySet and EntityRef) with a simpler array or List<T>.

Fortunately for us, there's a feature in LINQ-to-SQL that solves these issues in a more elegant way!

LINQ-to-SQL POCO support

Even though the O/R Designer and SqlMetal provide automatic generation of wrapper classes over our data entities, it's perfectly valid to use our own POCOs decorated with the appropriate attributes.

The POCO movement (nicely explained here) is based on the Persistence Ignorance pattern, which ensures delegation of responsibilities: in other words, we don't want our entities to know anything about how they are persisted. The default auto-generated entities pretty much respect this principle; they don't know anything about persistence (part of this info is in attribute decorations or mapping files).

But they do participate in their persistence mechanism, by being closely associated with a DataContext: not only notifying changes, but also loading deferred values or associated entities from it.

This behavior is mainly achieved thru change notification events (declared in the INotifyPropertyChanging and INotifyPropertyChanged interfaces), and the new types EntityRef and EntitySet.

These two types are used in auto-generated entities to load (lazily or not) properties created from foreign keys: EntityRef is used for single references (as in Product.Manufacturer.Name), and EntitySet for child collections (as in Manufacturer.Products[2].Price).

They not only contain associated entities; they also have the logic for deferred loading and for notifying the DataContext about modifications in references and child collections, enabling the change tracking feature.

As we read in LINQ-to-SQL blogs, it's possible to replace these types with simpler, disconnected versions: EntityRef can be replaced by a direct reference, and EntitySet by any ICollection<T>.

 

Putting the pieces together

With these ideas in mind, we started building prototypes, messing with the O/R Designer and SqlMetal auto-generated code, putting together the pieces we want and replacing or discarding others.

These days we're starting to see the light at the end of the tunnel, with the custom tools and code we have started to write.

More on this in the following posts...

Monday, February 11, 2008

LINQ to SQL: Stories of love and hate

So, we are trying to figure out in which ways the LINQ technologies can make our apps better, and our life as developers easier.

As a whole, there's no doubt the new language extensions in .NET 3.5 and the LINQ concepts are a hit. Once you find you can use the same syntax / extensions for every possible collection in your app and happily forget all those boring APIs like DataView.RowFilter or Array.Sort, there's no going back. And this is just the tip of the iceberg, considering the potential of the new language extensions.

The plot thickens, however, when you focus on the LINQ to SQL current implementation (.NET Framework 3.5 and VS 2008 release versions). After spending many days trying to find the place for LINQ to SQL in a layered application architecture, I came up with a mix of happiness and frustration which I will share here - even at the risk of being overly simplistic.

Love:
The magic of expression tree evaluation that turns a multi-step query expression into a single Transact-SQL statement

Hate:
The fact that the query output - when not projecting into new types - is always a collection of items tied to the data context, not able to be detached and sent up and down the layers of a multi-tier app. And this is by design, according to Dinesh Kulkarni's post.

Love:
The inclusion of an O/R designer in the VS IDE, with good enough capabilities for mapping table schemas to a DBML markup.

Hate:
The O/R designer's inability to figure out the stored procedure return schema in all but the most trivial cases (single fixed resultset)

------------

Am I being unfair if I don't mention the wonders of the data context SubmitChanges behavior?

Well... being focused on enterprise architectures, the concept of a single happy application committing changes into the DB all the way from the front-end to the back-end is not my preferred scenario. That's why I'm more concerned about interacting with stored procedures, and using LINQ to SQL to get detached entities or DTOs.

So maybe the problem is that I'm trying to inject this technology in a scenario for which it has not been built?

Let's see. Some portions of LINQ to SQL definitely may fit in a layered application architecture. But the whole combo (O/R designer + code generation tool + System.Data.Linq classes) is too large a bite, and the key may be to take its pieces apart.

Time to go back to the lab! More on this to come soon...

Friday, February 8, 2008

About Tercer Planeta

Tercer Planeta is a small software firm in Buenos Aires, Argentina.

http://www.tercerplaneta.com/

We focus on software architecture, development and implementation targeted at solving our customers' specific needs.

We have a strong commitment to professional quality combined with common sense, giving our clients the best possible context-aware guidance.

To achieve that, we put special effort into keeping ourselves a step ahead by taming upcoming Microsoft technologies, improving the development processes and participating in the Microsoft community.

Why “Tercer Planeta”?

The fast trends of our society can give us a distorted perspective about the role of technology in the service of people.

“Tercer Planeta” (Third planet) refers to our home planet. Putting ourselves in a universe perspective keeps us aware of the relative scale of all the technology buzz.

Thursday, February 7, 2008

Brief Introduction

"You are what you blog"?

There you are. Just another gross overreaction to the technology and social trends.

Exactly the kind of pitfall we try to avoid, and a good starting point for defining what and why we blog.

As a group (small group) of technology consultants at the southern tip of the Americas, we try to give our customers the most up-to-date-yet-wise advice available at any given moment. And that means a lot of research and reading to be done, coffee to be drunk and sheets of paper to be trashed. Not to mention the countless code prototypes to be built and discarded.

When dust is settling down and a concept -even a small one- seems to emerge, it's the right moment for it to be tested by exposing it as a blog entry. And there we go.

What are our blog subjects? Mostly software architecture and Microsoft development technologies, generally revolving around our main concern: the best way of building enterprise software applications.