Pampa Notes on Software Trends

Thursday, October 25, 2012

Nested (recursive) submodules in Git

Git submodules

Git submodules are the way to implement dependencies between projects.

Let's say you have a common library 'mylib' shared between the apps 'myapp1' and 'myapp2'; you will then keep just 3 repositories (one for each of them).

Both 'myapp1' and 'myapp2' will have 'mylib' declared as a submodule, which is cloned from / pushed to its own repository. And the repository of each parent app will only keep a reference to the commit to use from the submodule repository.

Nested Git submodules

Extending the same principle, 'mylib' could as well be dependent on 'somebaselib', declared as a submodule and hosted elsewhere.

There's plenty of documentation about how to clone or update such repositories (once they exist), using

   git clone ... --recursive
   git submodule update --recursive

But how to build one of those is not so intuitive.

Adding nested submodules: the problem

You may have used this procedure to add a submodule into the 'mylib' repository:

   cd ~/mylib
   git submodule add git@someurl/somebaselib baselib

As the baselib folder does not exist, the previous command does a clone of the specified repository into that folder, and then adds it as a submodule.
You may think the same would work to add the library, with its nested submodules, to the superproject:

   cd ~/myapp
   git submodule add git@someurl/mylib lib

but unfortunately this does not work... because the cloning fails.

Adding nested submodules: the solution

The workaround is to manually clone the nested submodules into the subfolder with the --recursive option, before doing the 'submodule add'

   cd ~/myapp
   git clone git@someurl/mylib lib --recursive
   git submodule add git@someurl/mylib lib

Now, as the lib folder does exist, the 'submodule add' command does not try to clone it and just adds it as a submodule. You are all set!
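The two steps above can be wrapped into a small shell function. This is only a sketch: add_nested_submodule is a hypothetical helper name (not a Git command), and the final status call is an extra check that is not part of the procedure described above.

```shell
# Hypothetical helper wrapping the workaround; run it from the superproject root.
add_nested_submodule() {
    url=$1
    path=$2
    # Clone the library together with its own submodules first...
    git clone --recursive "$url" "$path" || return 1
    # ...then register the already-populated folder as a submodule.
    git submodule add "$url" "$path" || return 1
    # List every submodule, nested ones included, with its pinned commit.
    git submodule status --recursive
}
```

From ~/myapp, it would be called as add_nested_submodule git@someurl/mylib lib. The changes still have to be committed in the superproject afterwards.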

Wednesday, October 3, 2012

Git labeller, branches and dynamic values

After some use of the Git labeller for CCNET described in my previous post, we were faced with an unexpected problem.

On a project following the basic workflow described here, we had the following scenario:

At some point of time, release tag 'v1.1.200.0' was placed on 'qa' branch and that version went into production
Later, while development continued on the 'master' and 'qa' branches, a fix was needed for the previous release, so we created a branch 'release1.1.200' from the last release tag, committed the fixes into it, and placed a new tag 'v1.1.200.1' on the last commit of the release branch - pushing both the commits and the tags to the origin.
On the CCNET automated build, when getting the code using the Git source control plugin, we specified 'release1.1.200' as the branch to use, and expected to get '1.1.200.1' as the CcNetLabel

But instead, we kept getting a '1.1.200.0' label. Why?

By following the CCNET server log, we've found out that the sequence in which the Git source control plugin works is:

repository is cloned or updated
labeller is invoked
branch specified in CCNET configuration is checked out (!)

So when the labeller tries to get the last tag by invoking git describe, the release branch has not been checked out yet, and it receives something like 1.1.200.0-xxx, where xxx identifies the commits made after that tag.

Once the problem was clear, the solution was to add to the labeller configuration an (optional) element specifying a branch to check out, and to perform that checkout within the labeller before issuing git describe.
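The corrected order of operations (check out first, describe afterwards) can be sketched in plain shell. The labeller itself is a CCNET plugin, so this is not its code; label_for_branch is a hypothetical name used only for illustration.

```shell
# Hypothetical helper: print the label for the latest tag reachable from a branch.
label_for_branch() {
    branch=$1
    # Check out the requested branch BEFORE asking for the last tag.
    git checkout --quiet "$branch" || return 1
    # --tags also considers lightweight tags; --abbrev=0 prints only the tag name.
    tag=$(git describe --tags --abbrev=0) || return 1
    # Strip the leading "v" so that v1.1.200.1 becomes 1.1.200.1.
    printf '%s\n' "${tag#v}"
}
```

In the scenario above, label_for_branch release1.1.200 would report 1.1.200.1, while running git describe without the checkout keeps reporting the tag reachable from whatever branch happens to be checked out.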

As a final touch, we've also upgraded the labeller to support dynamic parameters, so the name of the release branch can be specified dynamically when launching the build. The updated labeller is available at its GitHub project.

Friday, July 20, 2012

Git tag labeller plugin for CCNET

When implementing the basic Git workflow described in my previous post, we needed a way to read the last version tag and other related info and pack it as the "build label" within CruiseControl.Net, the integration platform we use.

More precisely, we needed a valid version identifier to be built in two different scenarios:

Release builds: take the release tag from Git (i.e. "v1.0.4.0"), skip the "v" prefix and return the rest as the version ID.
QA builds: we do not tag each QA build, so we need to take into account the last tag and the number of commits ahead of it, then merge them into something like "1.0.4.105", meaning "5 commits ahead of release 1.0.4.0". Why 105 and not 5? Well, we are reserving a block of numbers (1.0.4.1 - 1.0.4.100) to accommodate eventual production fixes on a dedicated release branch.
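The arithmetic behind both scenarios can be sketched as a small shell function working on the output of git describe. This is not the plugin's code (the plugin is a CCNET component); qa_label is a hypothetical name, and the sketch assumes tags of the form v<major>.<minor>.<build>.0 plus the reserved block of 100 numbers mentioned above.

```shell
# Hypothetical helper: turn `git describe` output into a build label.
qa_label() {
    described=${1#v}                 # drop the "v" prefix
    case $described in
        *-*-g*)
            rest=${described%-g*}    # strip the trailing -g<hash>
            commits=${rest##*-}      # number of commits ahead of the tag
            tag=${rest%-*}           # the tag itself, e.g. 1.0.4.0
            ;;
        *)
            commits=0                # HEAD sits exactly on the tag
            tag=$described
            ;;
    esac
    if [ "$commits" -eq 0 ]; then
        printf '%s\n' "$tag"         # release build: the tag is the version
    else
        # QA build: skip the 100 numbers reserved for production fixes
        printf '%s.%s\n' "${tag%.*}" "$((100 + commits))"
    fi
}
```

With these assumptions, qa_label v1.0.4.0-5-gabc1234 prints 1.0.4.105, and qa_label v1.0.4.0 prints 1.0.4.0.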

Enter Git tag labeller plugin for CCNet - we built a plugin to be used as a block in a CCNet configuration script and posted it on GitHub, you can find it here.

Besides the functionality we needed, we added a couple of features someone else may use, like a simple concatenation of the existing git tag with the commits ahead count; i.e. "ReleaseCandidate.6".

See the GitHub project for the most up-to-date documentation and code.

Monday, July 16, 2012

Get Going with a Minimalistic Git Workflow

Yeah, I know. The workflow in Git, as in any DVCS, is something to be taken seriously. Once your whole team is fluent in branching, merging and rebasing you are ready for a full-blown Git workflow like git-flow, or the GitHub flow described by Scott Chacon.

But what if your project is small, and / or you've just migrated from SVN and are still trying to figure it all out, and / or you just don't want a model so complicated?

What would be a good starting point? If you need to, let's say:

commit or merge code into a main branch with CI
regularly set milestones for deployment in a QA or staging server
from time to time release a production version - that's what you are paid for, after all
do some maintenance and bug fixing on the released code.

After some research and some practice we chose the following minimalistic workflow to solve those four needs.

1. The master branch is the place where all developers integrate their code.

This may be done:

By committing their code directly into the master branch - when working in small teams or for small incremental changes.
On a separate feature or develop branch that is later merged into master - when the circumstances require it.

CI tests should be running on this branch.

2. A fast-forward-only qa branch acts as a "sliding tag" following closely the master head.

This allows you to set temporary marks that will be used for deploying in a QA or staging environment. Some of them will later turn into releases - see the following section.

3. A series of annotated version tags signal each release that will go into production.

The version tag for a release (i.e. v1.0.2.0) is always placed on the commit at the current head of the qa branch (not from the master branch, which may be ahead of the last quality assured build).

Note that in this model, only an annotated tag is initially required for deploying a release into production. Only later, if required, a dedicated branch will be created from this tag.

4. Dedicated branches are created for each release that requires maintenance or fixes.

In the early stages of application development, releases may occur quite frequently and most of the time they will not be updated with fixes - the fix will be included in the next full release from the master / qa branches.
In these cases, the version tag mentioned in section 3 will be enough.

However, when the app or site is in production and releases tend to be more sporadic, the requirements for fixes on the released version are quite common. On the first occasion that a fix is required on a newly created version - initially defined only by a tag - a dedicated branch for that specific release should be created (i.e. release1.0.2)

Each fix release for this major version will be created on the same dedicated branch, and will get a separate tag (i.e. v1.0.2.1, v1.0.2.2, ...)

Changes on the release branch may be merged back - as a whole or cherry picked - into the master branch; never into the qa branch that is following the master in a fast-forward way.

The next full release (i.e. v1.0.3.0) will be created as a version tag on the main qa branch, and later branched from there, if necessary.
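The core moves of this workflow map onto a handful of Git commands. The sketch below wraps them in shell functions; the function names are hypothetical, and the branch names master and qa are the ones used in this post.

```shell
# Move the qa "sliding tag" up to master; refuses anything but a fast-forward.
promote_to_qa() {
    git checkout --quiet qa && git merge --quiet --ff-only master
}

# Place an annotated version tag on the current head of qa,
# e.g. cut_release v1.0.2.0
cut_release() {
    git tag -a "$1" -m "Release $1" qa
}

# Open a maintenance branch from a release tag, only when a fix is needed,
# e.g. start_fix_branch release1.0.2 v1.0.2.0
start_fix_branch() {
    git checkout --quiet -b "$1" "$2"
}
```

Because the merge uses --ff-only, promote_to_qa fails instead of creating a merge commit if qa ever diverges from master, which is what keeps qa behaving like a sliding tag.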

Credits

The fast-forward qa branch and release tag concepts are well explained in the two last sections of this post from Rein Henrichs.

My friend Benjamin Eidelman was a key participant in the discussion and contributed many of the basic ideas behind this post.

Wednesday, July 4, 2012

Git, Nuget packages and Windows line-endings

The problem

So, you've done your homework and gathered a top set of tools for your web development on a Windows environment: Visual Studio 201x, Git for distributed source control, and a handful of best-of-breed components, i.e.:

jQuery and jQuery.UI
Bootstrap on 'less' as CSS base framework
RequireJs for script loading and dependency handling.

You have added all these third-party components to your solution as Nuget packages.

When setting up your Git repository, you've followed the widely recommended practice for a Windows environment: CRLF conversion to LF when committing to the repository, and reverse conversion when getting files back into your working area.

So far, so good... or not?

The problems arise when a new version of a package is available and you try to update it.

During the process, the existing package is uninstalled prior to installing the new version, and then a long list of warnings is shown:

Skipping 'Scripts\less.min.js' because it was modified.
Skipping 'Content\less\accordion.less' because it was modified.
Skipping 'Content\less\bootstrap.less' because it was modified.
...

Following that, the installation of the new package fails to update those files:

'Scripts\less.min.js' already exists. Skipping...
'Content\less\accordion.less' already exists. Skipping...
'Content\less\bootstrap.less' already exists. Skipping...
...

Maybe you did modify a few of those files - i.e. 'variables.less' which is meant to be customized - and that is the expected behavior for that file. But for the rest of them, which you never intended to modify, the development workflow is definitely not working.

Let's try to isolate the problem from a fresh start. Assuming you don't have a certain package in your solution (nor any of its files), if you open the Package Manager Console and do:

PM> Install-Package Twitter.Bootstrap.Less -Version 2.0.3

This will add a bunch of files, in this case .less and .js. After that, if you open a Git console and start tracking the new files you'll notice the following:

$ git add .
warning: LF will be replaced by CRLF in ... Scripts\less.min.js
warning: LF will be replaced by CRLF in ... Content\less\accordion.less
...

This smells like trouble... and it is.
If you then commit these changes, check out another branch, and switch back to the current one (causing a refresh of your working folder):

$ git commit -m "Package added, v 2.0.3"
$ git checkout elsewhere
$ git checkout master

This leaves your working directory with an altered copy of the original files (CRLFs instead of the original LFs). If you then try to update the package

PM> Update-Package Twitter.Bootstrap.Less -Version 2.0.4

You'll get the ugly warnings shown at the beginning of this post, and none of your files will be updated.
Obviously, this blind "safe behavior" that cannot tell your modified files from the rest is not what you want for your project.

Workaround

When the conflict has already gotten to this point, you have no choice other than:

Uninstall the package
Manually delete each of its remaining files - except for the ones you did willingly modify
Install the new version.

But you may prevent it from happening again on the next update. And this is where you should revisit the decision factors on the line-ending behavior of your Git repository.

The Right Way: Then what?

In almost every current Windows developer tool, LF-ended lines are no longer a problem.

The shameful exception is Notepad - are you still using it for development tasks? Seriously? Come on...

Notepad++ is one among the many good replacements you can find out there.

In my opinion, if you decide to use certain js / css / less Nuget packages (built for Visual Studio and the Windows environment) because you trust their individual or collective authors, why wouldn't you abide by the line-ending conventions they have chosen?

At a quick glance, all of the components mentioned at the start of this post are using LF line-endings for their text files ( .js, .css, .less)

That's enough motive for me to use the same convention for those types of files. So, while keeping the CRLF line-ending for specific Windows-generated text files (i.e. .sln and .csproj files), we are switching our Git repositories to not apply the CRLF to LF and reverse conversions for the extensions .js, .css and .less.

This is done with a .gitattributes file placed at the root of each repository - look at this help file for an explanation of the use of such a file and its advantages. For the issue discussed in this post, this is all you need in your .gitattributes file:

# Set default behaviour, in case users don't have core.autocrlf set.
* text=auto

# Explicitly declare files we do not want to be converted 
*.js -text
*.css -text
*.less -text

And you are done! All those files will keep their original LF line-endings, both in your working folder and in the repository.
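To double-check which rules Git actually applies to a given file, git check-attr can be queried. The wrapper below is only a sketch; show_eol_attrs is a hypothetical name.

```shell
# Hypothetical helper: report the line-ending attributes Git resolves for each path.
show_eol_attrs() {
    # Prints one "path: attribute: value" line per attribute and path.
    git check-attr text eol -- "$@"
}
```

With the .gitattributes file shown above, show_eol_attrs Scripts/less.min.js should report the text attribute as unset, meaning that no line-ending conversion is applied to that file.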

Going further, you may opt to enforce the conversion to LF line-endings for those file types even when you accidentally create a file of your own with CRLFs. This would result in better compliance with the "LF only" recommended practice for Git repositories.

To do that, change the .gitattributes file as follows:

# Set default behaviour, in case users don't have core.autocrlf set.
* text=auto

# Explicitly declare files handled as LF-ended (converting them if necessary)
*.js text eol=lf
*.css text eol=lf
*.less text eol=lf

WARNING: If you apply any of the preceding changes on an existing repository, action must be taken to prevent Git from detecting all affected files as modified on the next commit. You should follow the steps outlined in the Re-normalizing a repo section at the end of the previously mentioned help file.
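Newer Git versions (2.16 and later) also offer git add --renormalize, which may serve as a shortcut for this step. It is not the procedure from the help file, so treat the following as a sketch; renormalize_repo is a hypothetical name.

```shell
# Hypothetical helper: re-apply the line-ending rules to every tracked file.
renormalize_repo() {
    # Re-run the check-in conversion on all tracked files (Git 2.16 or later).
    git add --renormalize . || return 1
    # Show what changed so the normalization can be reviewed before committing.
    git status --short
}
```

It is meant to be run once, right after committing or staging the new .gitattributes file, and followed by a dedicated commit that contains only the line-ending changes.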

Acknowledgments

This post from Tim Clem was very enlightening about the handling of line endings in Git. Thanks!

Wednesday, April 22, 2009

HTTP Error 500 in ASP.NET AJAX UpdatePanel

(a nasty combination)

When using ASP.NET 3.5 AJAX extensions in a web application, server errors that are thrown within an UpdatePanel request are hard to debug - even "hard to visualize". All you get, if you are lucky (*) is something like this:

(*) NOTES:

The actual message may vary depending on your browser.
This option to debug will only show if you've selected "debug javascript" in the browser options, otherwise you'll only get an error icon somewhere in the screen.

Debugging with Visual Studio 2008

Assuming you have Visual Studio available, I'll show how to get to the actual server error information with the IDE debugger. If you do not, you may follow a similar procedure with the built-in IE 8.0 debugger, or another "javascript-enabled" debugger of your choice.

If the error is thrown when you are not running the web application in debug mode from the VS IDE, a message similar to the previous image will appear. Clear the "built-in debugger" checkbox, answer "Yes", and choose to debug in Visual Studio 2008.
If you've started the web application in debug mode from the VS IDE, you will be taken to the exception code directly (no message box)

In either case, you'll be looking at something like this:

After selecting "Break", you may get the actual HTML of the server error in two ways:

Paste the following expression in the Watch window:
_this._webRequest._executor._xmlHttpRequest.responseText

Select "_this._webRequest" in the code window, add it to the Watch window, and expand the expression as shown below.

In either case, the final step (in VS 2008) is to select the "HTML Visualizer" to look at the server response in a human-readable format.

Step by step instructions:

Add _this._webRequest to the Watch window

Expand the expression and select the HTML Visualizer

You'll get something like this:

Friday, April 25, 2008

Object Materialization

From the moment we included Linq to SQL in the Data Access Layer of our prototype applications, as a nice side-effect we started using POCOs for most entities.

Linq to SQL (as any ORM) enables the use of POCOs. POCO entities have many appeals, most of them based on Persistence Ignorance. We delegate the persistence responsibility to our Data Access Layer, where we have code to materialize (read from a data source) and persist (save to a data source) entities (as in-memory objects).

Object Materialization, when working with ADO.Net, means projecting Object Collections from DataReaders, populating the properties of a custom object with the fields of a data record.

Linq to SQL performs this task internally, but it's limited to Linq queries over MS SQL databases. Now that we have POCO entities, it would be nice to obtain them from different data sources, like *-SQL, ODBC, Excel spreadsheets, CSV, or anything implementing an IDataReader interface.

As Jose Marcenaro anticipated here, and here, we want to design a Data Access Layer where Linq to SQL and Enterprise Library DAAB can live in harmony, which means transparently sharing the same entity model.

Our primary objective is a custom stored procedure invocation based on Enterprise Library, projecting the same class of entities living in Linq to SQL Dbml files. Enterprise Library (or pure ADO.Net) queries deliver DataReaders (or DataSets), so the piece that's missing here is Object Materialization.

Like an alchemist seeking a process to turn lead into gold, I began my quest for a Linq-compatible ADO.Net Object Materializing mechanism.

I'm going to show 3 different attempts, and at the end of this post you can download a simple benchmarking app with all the different mechanisms I tried.

First Attempt, FieldInfo.SetValue()

First of all, I noticed that an object materializer is a necessary part of any Linq Provider, and found an "official sample" in Matt Warren's blog, in a series of posts about "Building an IQueryable Provider".

http://blogs.msdn.com/mattwar/archive/2007/07/31/linq-building-an-iqueryable-provider-part-ii.aspx (below the title "The Object Reader")

He shows a simple Object Reader (aka Materializer), described as:

"The job of the object reader is to turn the results of a SQL query into objects. I’m going to build a simple class that takes a DbDataReader and a type ‘T’ and I’ll make it implement IEnumerable<T>. There are no bells and whistles in this implementation. It will only work for writing into class fields via reflection. The names of the fields must match the names of the columns in the reader and the types must match whatever the DataReader thinks is the correct type."

Basically, what this Object Reader does is:

Use Reflection over the target object type, to obtain the collection of FieldInfos.
Map the names in the FieldInfos with the DataReader field names.
While iterating the DataReader, use the FieldInfo.SetValue() method to populate the new target object type instances.

When working with Reflection, performance is the first thing we worry about. As he warns, this is not a real-world implementation: using Reflection to set the field values turned out to be very expensive.

Just to make it more Linq-compatible, I modified this object reader to look at properties instead of fields, setting private fields when it's specified in a ColumnAttribute, like this:

private int _OrderID; 

[Column(Storage="_OrderID", AutoSync=AutoSync.OnInsert, DbType="Int NOT NULL IDENTITY", IsPrimaryKey=true, IsDbGenerated=true)]
public int OrderID
{
      ...
}

This is what Linq to SQL does.

The performance remained almost unchanged, because the cost of the initial lookup of Property/FieldInfo is negligible compared to the (NumberOfFields * NumberOfRecords) SetValue invocations.

This option performs very poorly when reading more than 500 rows. It's intended for didactic purposes only.

Dynamic Translator Lambda Expression

My second attempt was the most fun and educational prototype I have written with Linq. As the previous attempt showed, using Reflection to populate fields must be avoided. The simplest and fastest way to do this job is:

    // inside an iterator method that returns IEnumerable<Pet>
    while (dataReader.Read()) {
        Pet pet = Translate(dataReader);
        yield return pet;
    }

    // elsewhere in the same class
    Pet Translate(IDataRecord dr) {
        return new Pet {
            Id = (int)dr.GetValue(0),
            Name = (string)dr.GetValue(1),
            Birthdate = (DateTime)dr.GetValue(2)
        };
    }

But life isn't that easy: I don't know the field names and positions until runtime and, furthermore, I want generic code, independent of the entity type (e.g. Pet).

Note that the Translate function above contains only one Object Initializer (new in C# 3.0), so it could be written as a Lambda Expression (new in C# 3.0 too):

    Func<IDataRecord, Pet> Translate = dr => new Pet {
        Id = (int)dr.GetValue(0),
        Name = (string)dr.GetValue(1),
        Birthdate = (DateTime)dr.GetValue(2)
    };

Again, this code can't be hardcoded. How can we create this Lambda Expression dynamically at runtime? With Expression Trees (yes, new in C# 3.0 too!)

In C# 3.0 we can programmatically build Expression Trees and compile them later into function delegates. We're going to use the Reflection info to build the above Func<IDataRecord,*>. Once compiled, it is (almost) as fast as directly getting values from the DataReader as shown above.

The code looks a little scary because it uses (and abuses of) Linq to Objects over the PropertyInfo collection to build the Expression Tree: it's like "Linq to Linq". I found big Lambda Expressions a little difficult to indent (don't worry you can download the source files at the end :))

using System; 
using System.Collections.Generic; 
using System.Linq; 
using System.Text; 
using System.Data; 
using System.Reflection; 
using System.Linq.Expressions; 
namespace ObjectMaterializer 
{ 
    public static class TranslatorBuilder 
    {

/// <summary> 
/// Dynamically creates a Translator Lambda Expression to project T instances from the IDataRecord 
/// </summary> 
/// <typeparam name="T">The projected type</typeparam> 
/// <param name="record">A source data record</param> 
/// <returns></returns> 
public static Expression<Func<IDataRecord, T>> CreateTranslator<T>(IDataRecord record) 
{ 
    // get properties info from the output type 
    Dictionary<string, PropertyInfo> propInfos = typeof(T).GetProperties().ToDictionary(pi => pi.Name); 

    // get field names in the DataRecord 
    var fieldMapping = new Dictionary<int, PropertyInfo>(); 
    for (int i = 0; i < record.FieldCount; i++) 
    { 
        string name = record.GetName(i); 
        if (propInfos.ContainsKey(name)) 
            fieldMapping[i] = propInfos[name]; 
    } 

    // prepare method info to invoke GetValue and IsDBNull on the IDataRecord 
    MethodInfo rdrGetValue = typeof(IDataRecord).GetMethod("GetValue"); 
    MethodInfo rdrIsDBNull = typeof(IDataRecord).GetMethod("IsDBNull"); 
    // prepare reference to the IDataRecord rdr parameter 
    ParameterExpression rdrRef = Expression.Parameter(typeof(IDataRecord), "rdr"); 

    /** builds the translator Lambda Expression 
     * 
     * assign each property to its matching field, e.g.: 
     *   new T { 
     *          PropertyName1 = (ProperCast1)rdr.GetValue(ordinal1), 
     *          PropertyName2 = rdr.IsDBNull(ordinal2) ? (ProperCast2)null : (ProperCast2)rdr.GetValue(ordinal2), 
     *          ... 
     *     } 
     * 
     * Note that null values on non-nullable properties will throw an Exception on assignment 
     * 
     * **/ 
    Expression<Func<IDataRecord, T>> proj = (Expression.Lambda<Func<IDataRecord, T>>( 
         Expression<T>.MemberInit(Expression<T>.New(typeof(T)), 
            fieldMapping 
                .Select(fm => 
                    Expression.Bind(fm.Value, 

                            ((!fm.Value.PropertyType.IsValueType) || (fm.Value.PropertyType.IsGenericType && fm.Value.PropertyType.GetGenericTypeDefinition() == typeof(Nullable<>))) ? 
                        //accepts nulls, test IsDbNull 
                                  (Expression)Expression.Condition(Expression<bool>.Call(rdrRef, rdrIsDBNull, Expression<int>.Constant(fm.Key)), 
                                             Expression.Convert(Expression.Constant(null), fm.Value.PropertyType) // value is System.DbNull, assign null 
                                        , 
                                           Expression.Convert( 
                                                (fm.Value.PropertyType == typeof(System.Data.Linq.Binary)) ? // convert byte[] to System.Data.Linq.Binary 

                                                    (Expression)Expression.New(typeof(System.Data.Linq.Binary).GetConstructor(new Type[] { typeof(byte[]) }), 
                                                        Expression.Convert(Expression.Call(rdrRef, rdrGetValue, Expression<int>.Constant(fm.Key)), typeof(byte[]))) 

                                                : 
                                                   (Expression)Expression.Call(rdrRef, rdrGetValue, Expression<int>.Constant(fm.Key)) // value is not-null, assign 
                                           , fm.Value.PropertyType) 
                                    ) 

                            : 
                        // doesn't accept nulls, direct assign 
                                (Expression)Expression.Convert(Expression.Call(rdrRef, rdrGetValue, Expression<int>.Constant(fm.Key)), fm.Value.PropertyType) 

                     ) as MemberBinding 
                 ) 
        ) 
        , rdrRef 
    )); 

            return proj; 
        } 

    } 
}

Note that timestamps, returned as byte[] by ADO.Net, are transformed into System.Data.Linq.Binary by Linq to SQL, so I added support for that.

Now we can use this TranslatorBuilder like this:

// generic method
public IEnumerable<T> ReadAllObjects<T>(IDataReader reader){

    Expression<Func<IDataRecord,T>> translatorExpression = TranslatorBuilder.CreateTranslator<T>(reader);
    Func<IDataRecord,T> translator = translatorExpression.Compile();

    while (reader.Read())
    {
        T instance = translator(reader);
        yield return instance;
    }
}
public IEnumerable<Pet> ReadAllPets(IDataReader reader) {
    return ReadAllObjects<Pet>(reader);
}

This is pretty elegant and performs great... but is not enough for us.

The Linq Object Materializer normally sets the private fields to avoid invoking the public property setters. This is not only to avoid a performance overhead, but because public property setters are often used for change tracking. With a Lambda Expression (or any C# expression) we can't access private fields.

Here I almost surrendered. How can I set private fields without using Reflection?

If the Microsoft guys can, we can! My boss (Jose Marcenaro) told me about an advanced and mysterious .Net feature, introduced with the .Net 2.0 Framework: LCG, Lightweight Code Generation.

Googling around I found that LCG is what the Linq to SQL team used to build their Object Materializer, used on the DataContext.Translate() function.

Wait! Why not just use the DataContext.Translate() function? Because:

It works only for Microsoft SQL Server databases
It requires an open db connection as a parameter
It requires .Net 3.5 (LCG is in .Net 2.0). Of course using an Object Materializer in .Net 2.0 if you don't have Linq to Objects may not sound so interesting.

Lightweight Code Generation

Since .Net 2.0, the System.Reflection.Emit namespace has included a couple of classes that let you programmatically generate dynamic methods from MSIL (Microsoft Intermediate Language) instructions. It's like adding a little piece of pre-compiled code at runtime.

Using this, we can build fast methods at runtime to set or get a field or property (even private ones). Here you may think "IL instructions??? I don't want to learn a low-level programming language for this!!". Relax, you only need 5 MSIL instructions.

Here's a helper class that generates a field setter:

using System; 
using System.Collections.Generic; 
using System.Text; 
using System.Reflection; 
using System.Reflection.Emit; 

namespace ObjectMaterializer 
{ 
    public static class AccessorBuilder 
    {

    public delegate void MemberSet<T>(T obj, object value);

    public static MemberSet<T> CreateFieldSet<T>(FieldInfo fi) 
    { 
        Type type = typeof(T); 
    
        DynamicMethod dm = new DynamicMethod("Set" + fi.Name, null, new Type[] { type, typeof(object) }, type); 

        ILGenerator il = dm.GetILGenerator(); 
        // load the target object instance (argument 0) in the stack 
        il.Emit(OpCodes.Ldarg_0); 

        // load the new value (argument 1) in the stack 
        il.Emit(OpCodes.Ldarg_1); 

        if (fi.FieldType.IsValueType) 
            // if field contains a value type, we need to unbox it 
            il.Emit(OpCodes.Unbox_Any, fi.FieldType); 
        else 
            // if field contains a non-value type, we need to cast it 
            il.Emit(OpCodes.Castclass, fi.FieldType); 

        // set fi object's field value from the stack 
        il.Emit(OpCodes.Stfld, fi); 

        // return to the caller (the method is void, so the stack is empty here) 
        il.Emit(OpCodes.Ret); 

        return (MemberSet<T>)dm.CreateDelegate(typeof(MemberSet<T>)); 

    } 

    }

}

Lightweight Code Generation and MSIL are advanced subjects; you can find a lot of samples by googling around.

Using these field setters, I improved my first attempt, replacing FieldInfo.SetValue() with these dynamically generated methods, which once compiled into delegates perform as fast as conventional methods.

Later, I added a Field Setters Cache, to avoid building these dynamic methods again on every query.

Some benchmarking shows that this approach is (almost) as fast as the DataContext.Translate() function. There's still a small performance difference. Why? If anyone can tell me, I'll be glad to update this post! :)

Anyway, our primary objective (in bold at the very beginning of this post) is achieved! (It wasn't to beat the Linq to SQL Object Materializer performance.)

The code

I put all these mechanisms in a simple benchmarking app that you can download here

First Attempt (using FieldInfo.SetValue): SimpleObjectReader.cs
Dynamic Translator Lambda Expression: TranslatorObjectReader.cs
LCG setters: LinqObjectReader.cs
re-using the DataContext.Translate() function: LTSObjectReader.cs

To run this you will need Visual Studio 2008, the .Net 3.5 Framework and a Northwind db, whose connection string you can set in the app.config.
