Architecture confusion with ORM
I read a most interesting post this morning on ActiveRecord vs. Objects, written by Bob Martin. It was linked on InfoQ by Sadek Drobi. I personally think Bob was pretty far off the mark with his article, but I think I can explain where his conclusion goes wrong.
First, some context. ActiveRecord is a design pattern described by Martin Fowler which suggests one way to leverage ORM: every object knows how to save, update, and delete itself. In the .NET realm, Castle ActiveRecord is a very compelling option to consider for persistence technology. As of the other week, it has been built on top of NHibernate 1.2, which means it is very fast, flexible, and supports generics. I'm currently playing around with ActiveRecord to build a sample app for an upcoming ACM lecture at UNH.
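To make "every object knows how to save, update, and delete itself" concrete, here is a minimal sketch of the pattern. It is written in Python against an in-memory SQLite database purely for illustration; the `Customer` class, the `customers` table, and the method names are all hypothetical, and none of this is the Castle ActiveRecord API.

```python
import sqlite3

# Hypothetical in-memory database standing in for a real one.
connection = sqlite3.connect(":memory:")
connection.execute(
    "CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
)


class Customer:
    """One instance wraps one row of the customers table."""

    def __init__(self, name, id=None):
        self.id = id
        self.name = name

    def save(self):
        """Insert the row if it is new, otherwise update it."""
        if self.id is None:
            cursor = connection.execute(
                "INSERT INTO customers (name) VALUES (?)", (self.name,)
            )
            self.id = cursor.lastrowid
        else:
            connection.execute(
                "UPDATE customers SET name = ? WHERE id = ?",
                (self.name, self.id),
            )

    def delete(self):
        """Remove the row and forget the primary key."""
        connection.execute("DELETE FROM customers WHERE id = ?", (self.id,))
        self.id = None

    @classmethod
    def find(cls, id):
        """Load a single row by primary key, or return None."""
        row = connection.execute(
            "SELECT id, name FROM customers WHERE id = ?", (id,)
        ).fetchone()
        return cls(name=row[1], id=row[0]) if row else None
```

Note that the object exposes its data (`name`) and its behavior (`save`, `delete`) side by side, which is exactly the combination the quoted article objects to.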
Anyways, I think I understand where Bob Martin is coming from. I once had a similar confusion, and I would like to clarify the incorrect points of this article. For starters, let's get to the heart of the problem:
The Active Record pattern is a way to map database rows to objects.
This statement is true. There is a dichotomy between relational structures and object-oriented structures. Relational structures are stacks upon stacks of data, while objects can have intelligence and encapsulate the data. This is where the confusion begins:
From the beginning of OO we learned that the data in an object should be hidden, and the public interface should be methods.
In other words: objects export behavior, not data.
An object has hidden data and exposed behavior.
This is bizarre. Since when can't objects expose data AND methods?
In languages like C++ and C# the struct keyword is used to describe a data structure
with public fields.
If there are any methods, they are typically navigational.
They don’t contain business rules.
So, according to Bob, none of your data structures can have any business-logic intelligence.
Thus, data structures and objects are diametrically opposed.
They are virtual opposites.
One exposes behavior and hides data, the other exposes data and has no behavior.
OK now this sounds strange to me, but I think I understand where Bob was coming from when he came to this conclusion. In any data-oriented architecture you have the following inevitable concerns that need to be addressed:
- Domain/Entity definition (Bob would call this the definition of the data structures... you need to define the persistent objects)
- Data Access definition (Where do we actually implement "Save", "Delete", etc.?)
- Business Intelligence (Where do we define the procedures that manipulate these entities/domain objects?)
In this UML diagram, the folders represent assemblies, and the arrows represent dependent references. Remember we cannot have two assemblies depend on each other, right?
The first assembly defines the persistent objects. The second assembly encapsulates the data access logistics. Since it depends on the object definition, it references the first assembly. The third assembly handles the business logic. It will need object definition and persistence functionality, so it depends on the other two assemblies.
Using this design, your objects truly become nothing but dumb data structures. This is what I think Bob was talking about. This should smell like an antipattern to any designer.
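A rough sketch of this first layout shows why the entities end up dumb. It is a Python illustration rather than .NET code; the three "assemblies" are marked with comments, and `Account`, `AccountDao`, and `AccountService` are made-up names with an in-memory dictionary standing in for the database.

```python
from dataclasses import dataclass


# "Assembly" 1: entity definitions. Pure data, no behavior.
@dataclass
class Account:
    id: int
    balance: float


# "Assembly" 2: data access. Depends only on the entity definitions.
class AccountDao:
    def __init__(self):
        self._rows = {}  # in-memory stand-in for a database table

    def save(self, account):
        # Store a copy so callers cannot mutate the "table" directly.
        self._rows[account.id] = Account(account.id, account.balance)

    def load(self, account_id):
        row = self._rows[account_id]
        return Account(row.id, row.balance)


# "Assembly" 3: business logic. Depends on both of the layers above.
class AccountService:
    def __init__(self, dao):
        self._dao = dao

    def withdraw(self, account_id, amount):
        account = self._dao.load(account_id)
        # The rule lives here, outside the object whose data it guards.
        if amount > account.balance:
            raise ValueError("insufficient funds")
        account.balance -= amount
        self._dao.save(account)
        return account
```

Nothing stops another caller from setting `balance` directly and skipping the overdraft rule, because the rule is not part of `Account` at all.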
OK, we want our objects to have intelligence. Let's take a second swing. Imagine the business logic is inside of the persistent objects, but the data layer is in its own assembly.

Since the data layer depends on the object layer and vice versa, we will have to use some fancier tricks here. Consider defining the data layer interface inside of the entity layer, such that the entity layer can rely on an interface. Then, the data layer will implement the interfaces defined on the entity layer (hence the solid upward arrow). At runtime, the dependency on the data layer can be dynamically bound using dependency injection (hence the dotted arrow). The implementation specifics can be found in Billy McCafferty's article on NHibernate best practices.
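The trick described above can be sketched as follows. This is a simplified Python illustration, not the code from the referenced article: `AccountStore`, `InMemoryAccountStore`, and `open_account` are hypothetical names, and a plain constructor argument stands in for a real dependency injection container.

```python
from typing import Protocol


# --- Entity layer: owns the interface and the business logic ---
class AccountStore(Protocol):
    """Persistence contract, declared next to the entities that need it."""

    def persist(self, account: "Account") -> None: ...


class Account:
    def __init__(self, id: int, balance: int, store: AccountStore):
        self.id = id
        self.balance = balance
        self._store = store  # injected; the entity never names a concrete class

    def withdraw(self, amount: int) -> None:
        # The business rule now lives with the data it protects.
        if amount > self.balance:
            raise ValueError("insufficient funds")
        self.balance -= amount

    def save(self) -> None:
        self._store.persist(self)


# --- Data layer: implements the interface defined above ---
class InMemoryAccountStore:
    def __init__(self):
        self.rows = {}  # stand-in for a database table

    def persist(self, account: Account) -> None:
        self.rows[account.id] = account.balance


# --- Composition root: binds the two layers together at runtime ---
def open_account(id: int, balance: int, store: AccountStore) -> Account:
    return Account(id, balance, store)
```

The entity layer only ever sees `AccountStore`, so the reference from entities to a concrete data layer exists only at runtime, which is what the dotted arrow represents.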
Now you have objects and their intelligence in the same place, and the data persistence implementation is cleanly separated, so persistence-related concerns don't start to encroach on your business logic. Still, this is not good enough for me. These layers look nice on paper, but in practice, there is a lot of work to keep the interfaces and the implementation correctly aligned. I think that unless your application is monolithic or database agnostic, the cost of this layering outweighs the benefits.
The third approach is more of a free-for-all. Everything is defined within the same assembly so there are no dependencies. It then becomes your responsibility as a programmer to correctly encapsulate and abstract away persistence-oriented specifics from your business logic. Are you up to the challenge? Perhaps this scenario will not work for everybody, but it seems to be a fair compromise between risk mitigation and simplicity to me.
Architecture can be confusing with ORM, since objects are closely related to their persistence concerns, but it CAN work! Don't decouple like a drunken sailor. Ask yourself: layers and abstraction are cool, but is this extra vestige really helping me more than it is hurting me?
Labels: NHibernate


5 Comments:
Yeah.
Interesting article, thanks for that. Currently, we have three problems implementing ActiveRecord in the way you described last (the free-for-all way):
- Inheritance: Thus far I didn't succeed in subclassing AR classes while still saving them in the same DB table. And no, I would not like to create different DB tables for each subclass (I like the free-for-all approach, thank you :) )
- Encapsulation: Did you ever try to make an AR property private? In other words, is it possible to save data in the DB that is only managed by the AR class itself? I didn't succeed in this either.
- Implementing business rules (like validation) is also quite hard, because when you call the Save function of an AR object, it only performs data validations, which are quite different than business validations. And so we have to override every single Save function of AR, 'injecting' our own business rules.
(ironic:on)And why not put the controller and the view in the same assembly? (ironic:off)
Ironic?
Celebrado, you bring up a very good point: lumping the controller and view logic together while separating the models is definitely the best solution 99% of the time.
Unless of course you need to reuse the controller logic across different view implementations (e.g. use the same controllers for a web-based version as well as a thick client). In this case, some DI using an IoC container will solve all of your problems.
This article was written before MVC.NET, but after the MonoRail and Ruby on Rails frameworks gained popularity.
Roelf:
Sorry for my late reply:
1. Over the past year or so, I've learned that the ActiveRecord pattern is an elegant solution for simple persistence models, while a repository pattern requires more code but proves to be effective when dealing with more complex structures.
2. I ALWAYS place my controller and persistence logic underneath interfaces, and use something such as StructureMap, Windsor, or custom service locators to put the pieces together. Once you get the hang of using DI, it is not painful, it enforces loose coupling, and eliminates all of the dependency dilemmas I've mentioned in the article.
Other notes: You can sub-class an AR class within the same table. I usually have an abstract parent, and multiple child classes. I use a discriminator column to determine the class type of any given row.
I have some example code if you would like me to send it to you.
Pay attention to the discriminator attribute on this page: http://www.castleproject.org/activerecord/documentation/trunk/usersguide/typehierarchy.html
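For anyone who just wants the idea behind the discriminator column, here is a language-neutral sketch in Python. It is not the Castle ActiveRecord attribute syntax (see the page above for that); the `Animal`, `Dog`, and `Cat` classes and the `type` column name are made up for illustration, and a list of dictionaries stands in for the shared table.

```python
class Animal:
    """Abstract parent; subclasses register themselves by discriminator value."""

    registry = {}
    discriminator = None

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        # Map the discriminator value stored in the row to the concrete class.
        Animal.registry[cls.discriminator] = cls

    def __init__(self, name):
        self.name = name

    def to_row(self):
        # Every subclass writes to the same table, tagged with its type.
        return {"type": self.discriminator, "name": self.name}

    @staticmethod
    def from_row(row):
        # The discriminator column decides which class to instantiate.
        return Animal.registry[row["type"]](row["name"])


class Dog(Animal):
    discriminator = "dog"

    def speak(self):
        return "Woof"


class Cat(Animal):
    discriminator = "cat"

    def speak(self):
        return "Meow"
```

All rows live in one table, and the value in the discriminator column is the only thing that tells the loader whether a given row comes back as a `Dog` or a `Cat`.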
You CAN make an AR property private; NHibernate supports this. The trick is to set the access for that property to "field". For more information, look at the access modifier here:
http://www.castleproject.org/activerecord/documentation/v1rc1/usersguide/generics.html
In terms of validation, I recommend looking at the castle validator component, which allows you to write your own validation method or choose from a stack of existing validators. More information:
http://hammett.castleproject.org/?p=114
Again, my apologies for the slow response. If I can help any more, let me know.