Wednesday, December 19, 2007

Generic Custom NHibernate Collections - A Second Swing

I talked about custom collections for WPF and NHibernate back here, but I wanted to mention that I have since made an alternative solution that has fewer lines of code and is apparently easier for other people to understand.

A quick recap: We want to harness the powerful databinding features of WPF. To optimize the two-way binding functionality, our objects need to implement INotifyPropertyChanged. No problem there, but our collections need to implement INotifyCollectionChanged, which is problematic, because our collections are commonly IList(T)s.

Why does NHibernate use an IList? When declaring a transient (new) object, we always write something such as the following:

private IList<InnerType> m_InnerItems = new List<InnerType>();

A "transient" collection is a new or unsaved collection that was created in your code, and the concrete implementation of "IList(T)" is a "List(T)". A persistent (saved) object is not built by your code; it is built by NHibernate. It is still an "IList(T)", but the concrete implementation is a PersistentGenericBag(T).

The PersistentGenericBag class has no default constructor; it requires a session (an ISessionImplementor) as a constructor parameter to support the "Lazy-Loading" magic. Since PersistentGenericBag has no default constructor, it wasn't designed for us to use in transient collections. Besides, why would we want to use an NHibernate-implementation-specific type inside of our domain objects? That would couple our domain objects too tightly to NHibernate-specific implementation, in my opinion.

What to do? We need to make a new interface (I defined mine to extend INotifyCollectionChanged for my uses, but this could extend anything you need for your purposes):

using System.Collections.Generic;
using System.Collections.Specialized;

namespace NotifyingCollectionDemo.Library.Collections
{
public interface IDomainCollection<T>:INotifyCollectionChanged, IList<T>
{
}
}

We need to define a new "Transient" collection type for our interface:

using System.Collections.Generic;
using System.Collections.Specialized;

namespace NotifyingCollectionDemo.Library.Collections
{
public class TransientDomainCollection<T>:List<T>, IDomainCollection<T>
{
#region INotifyCollectionChanged Members
public event NotifyCollectionChangedEventHandler CollectionChanged;
/// <summary>
/// Fires the <see cref="CollectionChanged"/> event to indicate an item has been
/// added to the end of the collection.
/// </summary>
/// <param name="item">Item added to the collection.</param>
protected void OnItemAdded(T item)
{
if (this.CollectionChanged != null)
{
this.CollectionChanged(this, new NotifyCollectionChangedEventArgs(
NotifyCollectionChangedAction.Add, item, this.Count - 1));
}
}
/// <summary>
/// Fires the <see cref="CollectionChanged"/> event to indicate the collection
/// has been reset. This is used when the collection has been cleared or
/// entirely replaced.
/// </summary>
protected void OnCollectionReset()
{
if (this.CollectionChanged != null)
{
this.CollectionChanged(this, new NotifyCollectionChangedEventArgs(
NotifyCollectionChangedAction.Reset));
}
}
/// <summary>
/// Fires the <see cref="CollectionChanged"/> event to indicate an item has
/// been inserted into the collection at the specified index.
/// </summary>
/// <param name="index">Index the item has been inserted at.</param>
/// <param name="item">Item inserted into the collection.</param>
protected void OnItemInserted(int index, T item)
{
if (this.CollectionChanged != null)
{
this.CollectionChanged(this, new NotifyCollectionChangedEventArgs(
NotifyCollectionChangedAction.Add, item, index));
}
}
/// <summary>
/// Fires the <see cref="CollectionChanged"/> event to indicate an item has
/// been removed from the collection at the specified index.
/// </summary>
/// <param name="item">Item removed from the collection.</param>
/// <param name="index">Index the item has been removed from.</param>
protected void OnItemRemoved(T item, int index)
{
if (this.CollectionChanged != null)
{
this.CollectionChanged(this, new NotifyCollectionChangedEventArgs(
NotifyCollectionChangedAction.Remove, item, index));
}
}
#endregion

#region IList<T> members
// We need to re-implement the IList<T> methods to support observability.
public new void Add(T item)
{
base.Add(item);
this.OnItemAdded(item);
}
public new void Clear()
{
base.Clear();
this.OnCollectionReset();
}
public new void Insert(int index, T item)
{
base.Insert(index, item);
this.OnItemInserted(index, item);
}
public new bool Remove(T item)
{
int index = this.IndexOf(item);
bool result = base.Remove(item);
// Only notify when something was actually removed.
if (result)
{
this.OnItemRemoved(item, index);
}
return result;
}
public new void RemoveAt(int index)
{
T item = this[index];
base.RemoveAt(index);
this.OnItemRemoved(item, index);
}
#endregion
}
}

We need to define a new "Persistent" collection type for our interface:

using System.Collections.Generic;
using System.Collections.Specialized;
using NHibernate.Collection.Generic;
using NHibernate.Engine;

namespace NotifyingCollectionDemo.Library.Collections
{
public class PersistentDomainCollection<T>:PersistentGenericBag<T>, IDomainCollection<T>
{
#region constructors
public PersistentDomainCollection(ISessionImplementor session, IList<T> coll) : base(session, coll)
{
}
public PersistentDomainCollection(ISessionImplementor session) : base(session)
{
}
#endregion

#region INotifyCollectionChanged Members
public event NotifyCollectionChangedEventHandler CollectionChanged;
/// <summary>
/// Fires the <see cref="CollectionChanged"/> event to indicate an item has been
/// added to the end of the collection.
/// </summary>
/// <param name="item">Item added to the collection.</param>
protected void OnItemAdded(T item)
{
if (this.CollectionChanged != null)
{
this.CollectionChanged(this, new NotifyCollectionChangedEventArgs(
NotifyCollectionChangedAction.Add, item, this.Count - 1));
}
}
/// <summary>
/// Fires the <see cref="CollectionChanged"/> event to indicate the collection
/// has been reset. This is used when the collection has been cleared or
/// entirely replaced.
/// </summary>
protected void OnCollectionReset()
{
if (this.CollectionChanged != null)
{
this.CollectionChanged(this, new NotifyCollectionChangedEventArgs(
NotifyCollectionChangedAction.Reset));
}
}
/// <summary>
/// Fires the <see cref="CollectionChanged"/> event to indicate an item has
/// been inserted into the collection at the specified index.
/// </summary>
/// <param name="index">Index the item has been inserted at.</param>
/// <param name="item">Item inserted into the collection.</param>
protected void OnItemInserted(int index, T item)
{
if (this.CollectionChanged != null)
{
this.CollectionChanged(this, new NotifyCollectionChangedEventArgs(
NotifyCollectionChangedAction.Add, item, index));
}
}
/// <summary>
/// Fires the <see cref="CollectionChanged"/> event to indicate an item has
/// been removed from the collection at the specified index.
/// </summary>
/// <param name="item">Item removed from the collection.</param>
/// <param name="index">Index the item has been removed from.</param>
protected void OnItemRemoved(T item, int index)
{
if (this.CollectionChanged != null)
{
this.CollectionChanged(this, new NotifyCollectionChangedEventArgs(
NotifyCollectionChangedAction.Remove, item, index));
}
}
#endregion

#region IList<T> members
// We need to re-implement the IList<T> methods to support observability.
public new void Add(T item)
{
base.Add(item);
this.OnItemAdded(item);
}
public new void Clear()
{
base.Clear();
this.OnCollectionReset();
}
public new void Insert(int index, T item)
{
base.Insert(index, item);
this.OnItemInserted(index, item);
}
public new bool Remove(T item)
{
int index = this.IndexOf(item);
bool result = base.Remove(item);
// Only notify when something was actually removed.
if (result)
{
this.OnItemRemoved(item, index);
}
return result;
}
public new void RemoveAt(int index)
{
T item = this[index];
base.RemoveAt(index);
this.OnItemRemoved(item, index);
}
#endregion
}
}

Finally, we need an implementation of IUserCollectionType to tie this all together and use it in the mapping files. Notice how I treat this as a factory class:

using System.Collections;
using System.Collections.Generic;
using NHibernate.Collection;
using NHibernate.Engine;
using NHibernate.Persister.Collection;
using NHibernate.UserTypes;

namespace NotifyingCollectionDemo.Library.Collections
{
public class DomainCollectionFactory<T> :IUserCollectionType
{
#region IUserCollectionType Members
public IPersistentCollection Instantiate(ISessionImplementor session, ICollectionPersister persister)
{
return new PersistentDomainCollection<T>(session);
}
public IPersistentCollection Wrap(ISessionImplementor session, object collection)
{
return new PersistentDomainCollection<T>(session,collection as IList<T>);
}
public object Instantiate()
{
return new TransientDomainCollection<T>();
}
public IEnumerable GetElements(object collection)
{
return (IEnumerable) collection;
}
public bool Contains(object collection, object entity)
{
return ((IList) collection).Contains(entity);
}
public object IndexOf(object collection, object entity)
{
return ((IList) collection).IndexOf(entity);
}
public object ReplaceElements(object original, object target, ICollectionPersister persister,
object owner, IDictionary copyCache, ISessionImplementor session)
{
IList result = (IList) target;
result.Clear();
foreach (object o in ((IEnumerable) original))
{
result.Add(o);
}
return result;
}
#endregion
}
}

How to use this? In your mapping file, something such as:

<bag name="Items" inverse="true" cascade="all-delete-orphan" generic="true" lazy="true"
collection-type=
"NotifyingCollectionDemo.Library.Collections.DomainCollectionFactory`1[[NotifyingCollectionDemo.Library.DomainModel.ListItem, NotifyingCollectionDemo.Library]], NotifyingCollectionDemo.Library">
<key column="ListContainerID" />
<one-to-many class="ListItem" />
</bag>

In the code:

private IDomainCollection<ListItem> _items = new TransientDomainCollection<ListItem>();

public IDomainCollection<ListItem> Items
{
get { return this._items; }
set { this._items = value; }
}
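
To see the notification fire, here is a minimal sketch. It uses trimmed stand-ins for the interface and the transient collection above (only Add is re-implemented) so that it compiles on its own; the NotifyDemo class is a hypothetical name used only for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Collections.Specialized;

// Trimmed stand-ins for the types defined above, so this sketch compiles on its own.
public interface IDomainCollection<T> : INotifyCollectionChanged, IList<T>
{
}

public class TransientDomainCollection<T> : List<T>, IDomainCollection<T>
{
    public event NotifyCollectionChangedEventHandler CollectionChanged;

    public new void Add(T item)
    {
        base.Add(item);
        if (CollectionChanged != null)
        {
            CollectionChanged(this, new NotifyCollectionChangedEventArgs(
                NotifyCollectionChangedAction.Add, item, Count - 1));
        }
    }
}

// Hypothetical demo class, not part of the original solution.
public static class NotifyDemo
{
    // Adds two items through the interface and returns the actions that were raised.
    public static List<NotifyCollectionChangedAction> Run()
    {
        var seen = new List<NotifyCollectionChangedAction>();
        IDomainCollection<string> items = new TransientDomainCollection<string>();
        items.CollectionChanged += (sender, e) => seen.Add(e.Action);

        // Because the class lists IDomainCollection<T> itself, the interface
        // call resolves to the notifying Add declared with "new".
        items.Add("first");
        items.Add("second");
        return seen;
    }
}
```

One caveat with the "new" keyword: a caller holding the collection through a plain List(T) reference or the non-generic IList gets the base class methods, so no event is raised on that path. Keeping the property typed as IDomainCollection(T) avoids this in user code.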

And you should be in business.
I like this code, because the NHibernate-specific stuff is only accessible from the NHibernate-specific factory. The user code never references a PersistentDomainCollection, which makes for a clean cut. Again thanks to Billy McCafferty and Damon Carr, since my solutions are "cannibalizations" of their more original works. Any thoughts?


Tuesday, December 18, 2007

Feature Complexity vs. Feature Value

Sometimes, as engineers, we become guilty of pouring large amounts of time and energy into a feature that yields little or no value to our users in the long run.

Sometimes, as engineers, we overlook some trivial detail in usability that would have made a world of difference to someone who needs to use it for a living.

I'm no usability expert, but I really like watching someone using my program:
  • I watch the way their eyes scan the screen when it first appears
  • Are there any repetitive motions?
  • Does the user seem to be hitting dead ends anywhere? Are they searching for some button or display feature that doesn't exist?
One of my clients was a very slow typist, and he was mentioning some occasions where he lost his work on a web form due to session expiration. Instead of extending the session timeout, I added these four lines of code to the HEAD tag of his web form:
function AlertUserOfTimeout()
{
    alert("You will be logged off of this page in ten minutes, save your changes while you can!");
}
window.setTimeout(AlertUserOfTimeout, <%=(Session.Timeout-10)*60*1000%>);
Basically ten minutes before the server disposes of his session, he has a popup window reminding him to save his work.
Today he told me that these four lines of code saved him hours of work! Now that is some code with real value!

Sunday, December 09, 2007

Linq to SQL vs NHibernate Part 1: What do they have in common?

Choosing a technology such as object persistence is one of the first steps in any major project, and it's a tough call to make. We spent some time at my company trying to figure out if Linq to SQL was a better ORM than NHibernate. After some experiments, I came to the engineer's conclusion: It depends. As Linq gains popularity, people will be asking the same questions, so I'm writing a few unbiased posts to sort out their differences (just as a warning in advance, my expertise is with NHibernate).

Crash Course:
Linq is the query syntax added to the C# language for 3.0. Trees, relational data, objects, XML, etc. can all be queried using the common Linq syntax reminiscent of SQL. It is easy and flexible, strongly typed, and compiled. Linq to SQL is a natural extension of Linq into an ORM; it is touted as a "lightweight" data mapper, and it was heavily hyped by Microsoft in previous beta versions. Linq to SQL is built specifically for SQL Server 2005 and above.

NHibernate is a mature open source project designed specifically to solve ORM problems. It is an extremely flexible and configurable ORM, and it's been battle-proven on many enterprise projects. It is database agnostic, and supports a wide array of different database brands. Like many active open-source projects, it is undergoing constant evolution, which makes good documentation hard to find.

What do they have in common?

Mapping syntax
Any object/data mapping system is going to need object definitions and a corresponding DDL. Both libraries are extremely flexible with their initial configuration, but there is a way to use both of them in a similar fashion.

Just like any good ORM, your objects are simply plain old objects that happen to be persistable as an afterthought. The object definition is any class file. The XML mapping gives the ORM library the links between objects and their tables. It defines the mappings between an object's properties and the columns in the database. It defines the relationships between objects (collections and encapsulation) and the corresponding data relations (many-to-many, many-to-one, one-to-one).

Both NHibernate and Linq accept these mapping files as arguments to their initialization.

Persistent Object Lifecycles
Scoping - Within a common scope, the loading of two instances of the same database row should yield two references to the same object. This scope in NHibernate is called a "Session", in Linq, it is called a "DataContext", and both guarantee reference equality between two instances of the same data under the same scope.
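
The scoping guarantee is commonly described as an identity map. A minimal sketch of the idea follows, assuming a hypothetical IdentityScope class standing in for a Session or DataContext and a delegate standing in for the database read; neither library is implemented this way line for line.

```csharp
using System;
using System.Collections.Generic;

public class Account
{
    public int Id;
    public string Name;
}

// Hypothetical stand-in for a Session / DataContext: one cache per scope.
public class IdentityScope
{
    private readonly Dictionary<int, Account> cache = new Dictionary<int, Account>();
    private readonly Func<int, Account> loadRow; // pretend database read

    public IdentityScope(Func<int, Account> loadRow)
    {
        this.loadRow = loadRow;
    }

    public Account Get(int id)
    {
        Account found;
        if (!cache.TryGetValue(id, out found))
        {
            // First request for this row in this scope: hit the "database" once.
            found = loadRow(id);
            cache[id] = found;
        }
        return found;
    }
}

public static class IdentityDemo
{
    // Loading the same row twice under one scope yields the same reference.
    public static bool SameScopeSameReference()
    {
        var scope = new IdentityScope(id => new Account { Id = id, Name = "row " + id });
        return object.ReferenceEquals(scope.Get(7), scope.Get(7));
    }

    // Two separate scopes each materialize their own instance of the row.
    public static bool DifferentScopesDifferentReferences()
    {
        Func<int, Account> load = id => new Account { Id = id, Name = "row " + id };
        Account first = new IdentityScope(load).Get(7);
        Account second = new IdentityScope(load).Get(7);
        return !object.ReferenceEquals(first, second);
    }
}
```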

Version Management - Once you have loaded objects under a scope, you should be able to efficiently synchronize the database with the updated state of your objects. Linq's DataContext exposes a SubmitChanges() method for this very purpose, and NHibernate has a Flush() method.

Adjustable Fetching Schemes - Loading objects from the database is a bit of a catch-22: you don't want to load the entire database into an object graph, but you don't want a round trip to the database every time you need a new object. Both of the libraries support highly configurable lazy and eager fetching schemes. Both of them use lazy loading by default, and both use left-outer-join as a default behavior when eagerly fetching peripheral objects.

Concurrency Concerns - Enterprise data is volatile, and we need the ability to recognize and manage scenarios where data is changed by external forces. By default, NHibernate and Linq both use optimistic concurrency: they load rows without locking them, and throw an exception if the rows you are saving have changed since you loaded them. Both libraries have multiple means of customizing concurrency behavior.
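
As an illustration of the optimistic approach, here is a minimal in-memory sketch that uses a version number per row. VersionedTable, Row and StaleDataException are hypothetical names; the real libraries offer several strategies (version columns, timestamps, comparing original values), and this only shows the general shape.

```csharp
using System;
using System.Collections.Generic;

public class Row
{
    public int Version;
    public string Value;
}

// Hypothetical exception type for this sketch.
public class StaleDataException : Exception
{
    public StaleDataException(string message) : base(message) { }
}

// In-memory stand-in for a table with a version column.
public class VersionedTable
{
    private readonly Dictionary<int, Row> rows = new Dictionary<int, Row>();

    public void Insert(int id, string value)
    {
        rows[id] = new Row { Version = 1, Value = value };
    }

    // Reads hand back a copy and take no lock on the stored row.
    public Row Load(int id)
    {
        Row stored = rows[id];
        return new Row { Version = stored.Version, Value = stored.Value };
    }

    // Similar in spirit to: UPDATE ... WHERE id = @id AND version = @loadedVersion
    public void Save(int id, Row edited)
    {
        Row stored = rows[id];
        if (stored.Version != edited.Version)
        {
            throw new StaleDataException("Row " + id + " was changed by someone else.");
        }
        rows[id] = new Row { Version = edited.Version + 1, Value = edited.Value };
    }
}

public static class ConcurrencyDemo
{
    // Two users load the same row; the second save is rejected.
    public static bool SecondWriterIsRejected()
    {
        var table = new VersionedTable();
        table.Insert(1, "original");

        Row mine = table.Load(1);
        Row theirs = table.Load(1);

        theirs.Value = "their edit";
        table.Save(1, theirs);      // succeeds, stored version becomes 2

        mine.Value = "my edit";
        try
        {
            table.Save(1, mine);    // still carries version 1
            return false;
        }
        catch (StaleDataException)
        {
            return true;
        }
    }
}
```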

Custom Database Objects - There are some operations that are simply better off left to the database to perform, such as large scale "en-masse" updates and reporting. These operations are easily implemented as stored procedures or indexed views. Both libraries support the ability to interface with custom database objects. Linq has very strong integration with stored procedures and views, but it only works with SQL Server 2005 and above. NHibernate is database agnostic, but its code that references database objects is string-based, which makes the connection brittle in comparison.

Code Generators
Personally, I am not a fan of any code generator related to something as important as your DDL, but there seems to be a very big demand for code generation, and there are convenience tools for both ORMs.
Linq Visual Designer - Linq comes with a built-in Visual Studio designer for Linq objects. It looks just like the visual dataset designer, because it was built by the same guy who made the visual dataset designer. IMO, this designer is a great way to get you nowhere in 30 seconds. It is only useful for the most trivial of object graphs, it uses partial classes to separate the mapping code from your user code, and it produces brittle code at best.
Linq SQLMetal - In terms of codegens, this is Linq's saving grace right here. Given a database connection, sqlmetal can generate clean code for the objects, mappings, or both, with an array of options for fine-tuning the output code.
MyGeneration - A free 3rd-party codegen that has DDL "templates" for both NHibernate and Linq (among many others). This is a great way to generate code if you have an existing database schema.
NHibernate SchemaExport - All of the codegens above deal with the conversion of an existing database schema into object and mapping code. SchemaExport goes in the other direction, building a database schema from the mappings. I spoke on SchemaExport in the past; I am a very big fan of this one.

Integration
Since the two technologies are not necessarily in direct competition, there is currently a push to harness the power of Linq-style querying in NHibernate. More about Linq for NHibernate can be found here, here and here.

This post covers some of the commonalities, and in the next few days, I'll be comparing some of the more important factors such as performance, flexibility, and usability.


UPDATE:
PerpetuumSoft is a 3rd-party company that has filled the dire need for a database synchronization tool for Linq to SQL. Given your object model definitions in Linq to SQL, their Database Restyle application is a royalty-free component that gives you the essential ability to synchronize a schema from a changing object design, so you can design from a truly object-centric point of view.


Friday, December 07, 2007

C# Perversion: Generic Constructors using Reflection

This MIGHT be useful in the future, but right now I was getting my hands dirty with some reflection code and it intrigued me: generic constructors.
// Requires: using System; using System.Diagnostics; using System.Reflection;
public AnyType DefaultConstructor<AnyType>()
{
    return ExtensibleConstructor<AnyType>(new Type[] {}, new object[] {});
}

public AnyType ExtensibleConstructor<AnyType>(Type[] argTypes, object[] args)
{
    Debug.Assert(argTypes.Length == args.Length, "Constructor argument lengths must match");
    Type theType = typeof (AnyType);
    // Look up the public instance constructor whose parameters match argTypes.
    ConstructorInfo theConstructor = theType.GetConstructor(argTypes);
    Debug.Assert(theConstructor != null,
        "This type of class [" + theType + "] doesn't have a matching constructor signature");
    return (AnyType) theConstructor.Invoke(args);
}

This code will indirectly invoke any publicly exposed constructor of AnyType:

public void Test()
{
StringBuilder _stringBuilder = DefaultConstructor<StringBuilder>();
_stringBuilder = ExtensibleConstructor<StringBuilder>(new Type[] {typeof (string)}, new object[] {""});
}
I understand the reflection concept, but I am one of those guys that uses it as a last resort, not because of the performance risk, but because I fear it can obfuscate the code.
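
For comparison, the framework already covers both cases with less code. This is a sketch of an alternative and not the approach shown above: the new() generic constraint handles the default constructor at compile time, and Activator.CreateInstance resolves a matching constructor from the arguments at run time. The Construct and ConstructDemo class names are hypothetical.

```csharp
using System;
using System.Text;

// Hypothetical helper class for this sketch.
public static class Construct
{
    // Compile-time checked: T must have a public parameterless constructor.
    public static T Default<T>() where T : new()
    {
        return new T();
    }

    // Run-time checked: the framework picks the constructor that matches args
    // and throws MissingMethodException when none does.
    public static T With<T>(params object[] args)
    {
        return (T) Activator.CreateInstance(typeof(T), args);
    }
}

public static class ConstructDemo
{
    public static string Run()
    {
        StringBuilder empty = Construct.Default<StringBuilder>();
        StringBuilder seeded = Construct.With<StringBuilder>("abc");
        return empty.ToString() + "|" + seeded.ToString();
    }
}
```

The trade-off is the same one as before: the first method fails at compile time when the constructor is missing, while the second only fails when it runs.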


Thursday, December 06, 2007

Unit testing Persistent objects in ORMs such as NHibernate

Even though I am writing this with NHibernate-specific examples, these concepts apply to any ORM technology, so use your imagination a little on this one.

So you're using persistent classes, and you need to make unit tests. Here is a fundamental concern to test with any persistent class:

I want to create an object, then "persist" it. I want to test that the object loaded from persistence matches the object I originally created. Then I have tested the correctness of the mappings.

How do you test to make sure that two objects match each other? I see two concerns:
  1. The immediate values inside of the object are matching (primitive properties, for example)
  2. The "neighbors" that my class has references to also match (many-to-one or one-to-many associated classes).
Consider this scenario: You have a target class X, and you want to test its persistence. Imagine that X inherits from an abstract class such as PersistentObject:
public abstract class PersistentObject<T>
{
    #region members
    private int id = 0;
    #endregion

    #region properties
    public int Id
    {
        get { return id; }
    }
    public bool IsSaved
    {
        get { return id != 0; }
    }
    #endregion

    #region methods
    public abstract bool Matches(T t);
    #endregion
}
This is a very simplistic example, but in this case every PersistentObject has an integer identity primary key. Anything that inherits from PersistentObject must also implement a "Matches" method, which compares objects of a common type to see if the properties match.
I could have overridden the object.Equals method here, but I feel this would obfuscate the meaning of "Equals", so I made a new method.

This "Matches" method is the perfect hook to test the equality of immediate properties (persistent object testing concern #1).

Just because the immediate properties of an object match, does this mean that the objects are the same? No, we must make sure that the "neighboring" objects also match.

For concern #2, we could use a recursive approach and call the "Matches" method on all of the parents and all of the children, but I argue this is overkill.

If the IDs of the target object's parents match, and the ID collections of the target object's children match, then we have effectively proven that the objects match.

If you have a test for every type of persistent object, then there is no need to call a "Matches" method recursively, because you will be testing the same thing many times over, and the code will be unnecessarily complex.

One important quality of NHibernate to remember: NHibernate guarantees reference equality between two objects representing the same row under the same session. This is a very simple, but powerful quality.

Here is the approach I've adopted for unit testing persistence:
  1. Create your database schema from the mappings using the schema export tool.
  2. Use a factory to create some bulk data. If you can, create a rich graph of all of your persistent objects in a realistic scenario.
  3. Using a new session, persist this object graph to the database.
  4. For each type of persistent object,
  • try to load a new copy of the object under a brand new session using the ID of one of your originally created objects.
  • Call the "Matches" method to compare the two objects. (concern #1)
  • Verify that the IDs of the parents/children of the original object equal the IDs of the parents/children of the loaded objects (concern #2)
Finally, tear down the database at the end.
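
The two checks in step 4 can be sketched as follows. This is an illustration under assumptions, not my actual test code: the base class is simplified so that Id is a public field (no ORM is needed to set it), and Customer, Order, OrderLine and PersistenceChecks are hypothetical names. In a real test the "loaded" object would come from a brand new session and would not be built by hand.

```csharp
using System.Collections.Generic;

// Simplified base class: Id is a public field so the sketch needs no ORM to set it.
public abstract class PersistentObject<T>
{
    public int Id;
    public abstract bool Matches(T other);
}

public class Customer : PersistentObject<Customer>
{
    public string Name;
    public override bool Matches(Customer other)
    {
        return other != null && Name == other.Name;
    }
}

public class OrderLine : PersistentObject<OrderLine>
{
    public int Quantity;
    public override bool Matches(OrderLine other)
    {
        return other != null && Quantity == other.Quantity;
    }
}

public class Order : PersistentObject<Order>
{
    public string Reference;
    public Customer Owner;                                  // parent
    public List<OrderLine> Lines = new List<OrderLine>();   // children

    // Concern #1: compare immediate values only.
    public override bool Matches(Order other)
    {
        return other != null && Reference == other.Reference;
    }
}

public static class PersistenceChecks
{
    // Concern #2: compare neighbors by ID instead of recursing into them.
    public static bool NeighborsMatch(Order original, Order loaded)
    {
        if (original.Owner.Id != loaded.Owner.Id) return false;

        List<int> expected = original.Lines.ConvertAll(line => line.Id);
        List<int> actual = loaded.Lines.ConvertAll(line => line.Id);
        if (expected.Count != actual.Count) return false;

        // A bag has no guaranteed order, so sort before comparing.
        expected.Sort();
        actual.Sort();
        for (int i = 0; i < expected.Count; i++)
        {
            if (expected[i] != actual[i]) return false;
        }
        return true;
    }

    // Builds an order by hand; a real test would load the second copy from a new session.
    public static Order Build(string reference, int ownerId, params int[] lineIds)
    {
        var order = new Order { Reference = reference, Owner = new Customer { Id = ownerId } };
        foreach (int lineId in lineIds)
        {
            order.Lines.Add(new OrderLine { Id = lineId });
        }
        return order;
    }
}
```

A test for Order would then assert both original.Matches(loaded) and PersistenceChecks.NeighborsMatch(original, loaded), leaving the contents of Customer and OrderLine to their own tests.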


Monday, December 03, 2007

Top three "DUH" features of SQL 2008...

I'm very excited to get my hands dirty with SQL 2008.

Don't get me wrong, there is much to love with this current 2005 version, but some of these features I hear coming down the pipeline have me wondering why we didn't do things like this in the first place.
  1. The MERGE command. Is this data a new row, or is it changes to an existing row? How many times do we write SaveOrUpdate(..) methods in our code with this "IF it's new, insert it, otherwise update it" code? This line of thinking became a subconscious action, and we got used to writing this logic over and over. MERGE allows you to synchronize two sets of data in an "en masse" approach. Suppose you have a dataset, XML, or some other input called "src" that is the data you want to merge into a table called "dest":
    MERGE dest USING src
    ON (src.key = dest.key)
    WHEN MATCHED THEN
    UPDATE [all dest values to update... regular update command here]
    WHEN NOT MATCHED THEN
    INSERT [a simple insert command from src to dest!]
    C'mon, tell me that is not cool! You can wrap this in a transaction and a try..catch block to make it better. Duh! We needed this years ago!
  2. Hierarchical Data. Tree-structured data is a fact of life. When you need it, you need it, but SQL Server support for this scenario has been very shaky in the past, and everyone has needed to get creative to store trees in the relational database. Some ways are complicated, some ways are non-performant, and all of them expose some ugly compromise. Finally, SQL 2008 supports clustered indexing in a hierarchical fashion (breadth-first or depth-first ordering). Consider this table:
    CREATE TABLE ParentChildOrg
    (
    EmployeeID int PRIMARY KEY,
    ManagerId int REFERENCES ParentChildOrg(EmployeeID),
    EmployeeName nvarchar(50)
    ) ;
    GO
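    That is the traditional parent/child layout. Here is some sample code from the docs that runs an optimized insert with the new data type (note that it targets a different table, HumanResources.EmployeeDemo, whose OrgNode column is a hierarchyid):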
    DECLARE @Manager hierarchyid
    SELECT @Manager = CAST('/3/1/' AS hierarchyid)
    INSERT HumanResources.EmployeeDemo (OrgNode, LoginID, Title, HireDate)
    VALUES
    (@Manager.GetDescendant(NULL, NULL),
    'adventure-works\FirstNewEmployee', 'Application Intern', '3/11/07') ;
    Granted, I'm still trying to learn the good and bad things that come with the syntax of the hierarchyid data type, but this is much needed support for SQL Server. Is there a need for native support for trees? Duh!
  3. Native file support for BLOBS. I have learned some serious hard lessons about this one in the past. You need to store blobs. If you store them directly in your database, then you are setting yourself up for a scalability nightmare. If you are storing URLs to the blob data, then you need to manually synchronize your data with SQL Server. Finally, our 2008 version allows us to store blobs as a file structure INTERNAL to SQL Server, while exposing them as filestreams and not large unwieldy hunks of heavy data that wreck your I/O. We've all been treating SQL Server like a file server in the past, because we need it to serve files, not just chunks of data. DUH!
It's like a cell phone, or email... I mean, once you have it, you wonder how you ever got anything done without it, and those are the signs of good ideas.
