Software Factories - Another Unlearned EJB Lesson
Many software projects experience disappointing productivity. Often productivity problems are inherent to the way we build software. Developers often find themselves churning out unproductive, non-business-related code as a consequence of inappropriate architectural choices. Common examples are the use of verbose data transfer objects in applications that do not benefit from distribution, and the use of cumbersome low-level APIs such as ADO.NET. These problems are inherent to our orthodox approach to software architecture, which often produces unnecessary complexity. There are solutions to many of these productivity problems, and anything we can do to remove them reduces project costs and improves our ability to focus on business problems.
A common answer to productivity issues is to argue that they are irrelevant because tools let us ignore them. Jeremy Miller recently described two different philosophies that are relevant to this. I belong to the camp of developers who strive to make development easier by using the right tools, frameworks and methodologies. The other camp seeks to do away with programming and use tools to generate the code for us. Software factories, at least the ones available today, are built around the latter philosophy. If you set aside the general written guidance, current software factories, such as the Web Service Software Factory, are code generators.
"Increased productivity. It includes automated code and configuration file generation for Visual Studio 2005. With this automation, developers can easily apply guidance in consistent and repeatable ways. Developers can also effectively use the .NET Framework without having to devote significant time to learning the necessary APIs."
From Microsoft Patterns & Practices - Web Service Software Factory
I do not oppose code generation in general. For instance, I use ReSharper templates a lot, but I'm addressing a different form of code generation here. ReSharper's code generators and templates are no different from macros. You save typing, and the generated code can be maintained by hand. There are no round-tripping issues with this kind of code generation.
There are a number of issues with the software factory approach to code generation as a prime solution to productivity problems. Even if the architecture prescribed by the guidance is sound, it is very easy to implement bad architectures or bad code with code generation. The software factories only implement one form of an architecture. This might be good for middle-of-the-road solutions, but it is often too much for simple projects and not enough for more complex scenarios. Some of the bad practices we see in generated code are code duplication and over-engineering. Code duplication is a code smell that can be avoided by better design, such as abstracting common code into frameworks. The DRY principle is so important that we should observe it in our entire solution, not just in the code we write by hand.
Over-engineering can be attributed to code generators having to cover advanced scenarios. For instance, the Web Service Software Factory prescribes using a layer of Data Transfer Objects (DTOs) on top of the domain model. This is often a good approach, but in many scenarios using an exposed domain model is better. Most of the services I see have an internal structure that maps directly to the service interface. The reason for prescribing a layer of DTOs is rooted in the "Services Share Contract, Not Class" tenet. You can still adhere to this tenet without having separate classes representing the types exposed through your service interface. After all, the fact that there is a class behind the interface is abstracted away by XML. Minor differences, such as different names for class properties and schema elements, can be aligned by applying XML serialization attributes to the type. There is no need to introduce an entire layer of DTOs to overcome subtle differences.
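As a rough sketch of that attribute-based alignment, the class below keeps its own property names while the serialized XML uses different element names. The schema names ("Member", "Zip", "Phone") and the InternalRank property are made up for illustration; only ZipCode and PhoneNumber echo names from the hands on lab, and the ContractDemo helper is hypothetical.

```csharp
using System.IO;
using System.Xml.Serialization;

// A domain type exposed directly through the service interface.
// "Member", "Zip" and "Phone" are hypothetical schema names.
[XmlRoot("Member")]
public class ClubMember
{
    // Called ZipCode in code, serialized as <Zip> in the contract.
    [XmlElement("Zip")]
    public string ZipCode { get; set; }

    // Called PhoneNumber in code, serialized as <Phone> in the contract.
    [XmlElement("Phone")]
    public string PhoneNumber { get; set; }

    // Hypothetical internal detail that must not leak into the contract.
    [XmlIgnore]
    public int InternalRank { get; set; }
}

public static class ContractDemo
{
    // Serializes a member to an XML string using the standard XmlSerializer.
    public static string ToXml(ClubMember member)
    {
        var serializer = new XmlSerializer(typeof(ClubMember));
        using (var writer = new StringWriter())
        {
            serializer.Serialize(writer, member);
            return writer.ToString();
        }
    }
}
```

Calling ContractDemo.ToXml on a member yields a Member element with Zip and Phone children, and no trace of InternalRank. Whether you want serialization attributes on the domain model at all is a separate design decision.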
Studies show that up to 70% of the lifecycle cost of software is maintenance. This is an important issue, because even if you did not have to write the generated code, you will still have to maintain it. If you use code generation, there will certainly be times when this code needs to be examined closely. At such times the initial productivity gain of generated code disappears. Round-tripping is another serious issue with code generation. If you have modified the generated code, these modifications are often lost when you regenerate the code. There are some clever workarounds, such as C# partial classes and subclassing generated code, but they do not solve the underlying problem. The maintenance issues with generated code still prevail. Another maintenance issue is that you often become dependent on the code generation tool you are using. Upgrading to a later version can be cumbersome, and switching to a tool from another vendor is virtually impossible.
As I mentioned in an earlier post, the Enterprise Java Beans architecture was very cumbersome to work with. To make development easier, many IDEs added sophisticated code generation support to iron out all the EJB-wrinkles. Examples were generation of XML-generation code for domain objects, data access classes and so forth. The generated code made EJB development a little easier, but this was often at the expense of maintainability. The type of code generation offered is very similar to what the Web Service Software Factory does.

Above you can see the source tree for the completed Web Service Software Factory Hands on Labs. With the client application and tests excluded, the project consists of 32 types. That is an awful lot of code, considering that the only business capability this solution has is to register a new club member. Only six of the classes in the solution are business related; the rest is infrastructure code. If we look at some of the infrastructure code, it becomes apparent that this code is not easy to maintain. The following snippet is taken from the ClubMembershipInsertFactory class.
    if (clubMember.AddressLine1 != null)
    {
        db.AddInParameter(command, "vch_AddressLine1",
            DbType.String, clubMember.AddressLine1);
    }
    if (clubMember.AddressLine2 != null)
    {
        db.AddInParameter(command, "vch_AddressLine2",
            DbType.String, clubMember.AddressLine2);
    }
    if (clubMember.City != null)
    {
        db.AddInParameter(command, "vch_City", DbType.String,
            clubMember.City);
    }
    if (clubMember.ClubExpiration != null)
    {
        db.AddInParameter(command, "dte_ClubExpiration",
            DbType.DateTime, clubMember.ClubExpiration);
    }
    if (clubMember.PhoneNumber != null)
    {
        db.AddInParameter(command, "chr_PhoneNumber",
            DbType.String, clubMember.PhoneNumber);
    }
    if (clubMember.State != null)
    {
        db.AddInParameter(command, "vch_State", DbType.String,
            clubMember.State);
    }
    if (clubMember.ZipCode != null)
    {
        db.AddInParameter(command, "chr_ZipCode", DbType.String,
            clubMember.ZipCode);
    }
The snippet is just a fraction of the code, but the remainder is similar to this. The class has a Cyclomatic Complexity of 40. The code is generated by the factory, but it has to be maintained by hand because the factory does not generate a fully usable class. Maintaining mundane, repetitive code like the example above is very error-prone, and it has virtually no relevance to the business capability of registering a new club member.
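To illustrate how much of that repetition is incidental, here is a small sketch that moves the null check into one place. ParameterWriter is a hypothetical helper, not part of the factory or of Enterprise Library, and the callback merely stands in for a call such as db.AddInParameter(command, name, type, value).

```csharp
using System;
using System.Data;

// Hypothetical helper: adds a parameter only when its value is not null,
// so the null check is written once instead of once per column.
public sealed class ParameterWriter
{
    private readonly Action<string, DbType, object> addInParameter;

    // Number of parameters actually forwarded to the callback.
    public int Count { get; private set; }

    public ParameterWriter(Action<string, DbType, object> addInParameter)
    {
        if (addInParameter == null) throw new ArgumentNullException("addInParameter");
        this.addInParameter = addInParameter;
    }

    // Forwards the parameter when a value is present; returns this for chaining.
    public ParameterWriter AddIfNotNull(string name, DbType type, object value)
    {
        if (value != null)
        {
            addInParameter(name, type, value);
            Count++;
        }
        return this;
    }
}

// Assumed usage inside the insert factory (db, command, clubMember as in the snippet above):
//   new ParameterWriter((n, t, v) => db.AddInParameter(command, n, t, v))
//       .AddIfNotNull("vch_City", DbType.String, clubMember.City)
//       .AddIfNotNull("vch_State", DbType.String, clubMember.State);
```

A helper like this only trims the symptom. The hand-written column-to-property mapping is still there, which is why a real data access framework is the better fix.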
Another ability of the Web Service Software Factory is mapping between the Data Transfer Objects used by the service interface and the domain types used by the business layer. The factory has a recipe that creates the mapping logic, but the features are limited to simple mappings between coherent primitives. There is no type coercion, even in simple scenarios where one side of the map is a numeric type while the other side is string. Neither can object graphs be mapped by the recipe. You need to write the mapping code for this.
// Create mapping for property WinePreference
// Create mapping for property CreditCard
Most of the transformation and aggregation done in DTOs could have been achieved by introducing composite objects and through inheritance. However, composition and inheritance cannot be used here, because they affect the reflected schema and because there are limitations on how you can redefine the XML serialization of inherited members.
Because they only contain data and have no behavior, DTOs aren't "real" objects. Placing mapping responsibility in DTOs helps justify their existence, so from my point of view it's better to do mapping and type coercion in the DTOs than to introduce helper classes with no relevance to the business domain to bridge the gap between the schema and the domain model.
    public CreditCardDTO(CreditCard domainObject)
    {
        expirationDate = domainObject.ExpirationDate;
        cardIssuer = domainObject.CardIssuer;
        cardNumber = domainObject.CardNumber;
        nameOnCard = domainObject.NameOnCard;
    }

    public static implicit operator CreditCard(CreditCardDTO dto)
    {
        CreditCard domainObject = new CreditCard();
        domainObject.CardIssuer = dto.cardIssuer;
        domainObject.CardNumber = dto.cardNumber;
        domainObject.ExpirationDate = dto.ExpirationDate;
        domainObject.NameOnCard = dto.nameOnCard;
        return domainObject;
    }
The snippet above is from my light-weight implementation of the Membership Service from the Web Service Software Factory HoL. The constructor enables creation of a DTO from a domain object, and the implicit cast operator allows the DTO to be cast to the domain object. This approach cannot be used in all scenarios, but it is sufficient for the majority of mappings needed in an average service.
A benefit of using casting to map from the DTO to the domain object is that the DTO can be passed as an argument to the domain layer without the explicit conversion code in the service facade.
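A stripped-down sketch of what this looks like at the call site follows. The CreditCard and CreditCardDTO types are reduced to two members, and CardPolicy and MembershipFacade are hypothetical stand-ins for a domain class and a service facade; none of this is the actual code from either solution.

```csharp
using System;

// Domain type, reduced to two members for the example.
public class CreditCard
{
    public string CardNumber { get; set; }
    public DateTime ExpirationDate { get; set; }
}

// DTO that knows how to turn itself into the domain type.
public class CreditCardDTO
{
    public string CardNumber { get; set; }
    public DateTime ExpirationDate { get; set; }

    public static implicit operator CreditCard(CreditCardDTO dto)
    {
        // A null DTO converts to a null domain object instead of throwing.
        if (dto == null) return null;
        return new CreditCard
        {
            CardNumber = dto.CardNumber,
            ExpirationDate = dto.ExpirationDate
        };
    }
}

// Hypothetical domain-layer class that only knows about CreditCard.
public class CardPolicy
{
    public bool IsExpired(CreditCard card, DateTime today)
    {
        return card.ExpirationDate < today;
    }
}

// Hypothetical service facade that receives DTOs from the wire.
public class MembershipFacade
{
    private readonly CardPolicy policy = new CardPolicy();

    public bool IsCardExpired(CreditCardDTO dto, DateTime today)
    {
        // The DTO is passed as-is; the compiler inserts the conversion.
        return policy.IsExpired(dto, today);
    }
}
```

The facade contains no mapping code at all. The trade-off is that the conversion is invisible at the call site, so it should stay cheap and free of side effects.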
I just revealed that I have written an alternative implementation of the Coho membership service. Below you can see the source tree for this solution.
To make the comparison easy, I haven't made any significant changes to the domain model. The service interface is 100% compatible with the Coho membership service from the hands on lab. Still there are substantial differences between these solutions. Below you can see a side-by-side comparison of key application metrics for the two solutions.
Coho Membership Service:
- Number of IL instructions: 2098
- Number of lines of code: 319
- Number of classes: 25
- Number of types: 32
- Number of abstract classes: 0
- Number of interfaces: 7
- Number of value types: 0
Coho Light Membership Service:
- Number of IL instructions: 909
- Number of lines of code: 162
- Number of classes: 11
- Number of types: 14
- Number of abstract classes: 1
- Number of interfaces: 2
- Number of value types: 1
The lighter version has only about half as much code as the original version, yet it has more features. For instance, adding an advanced search capability to the domain model of the light version would require virtually no code.
    public Member[] FindAllExpiredMembers()
    {
        return ClubMember.FindAll(Expression.Lt("Expiration",
            DateTime.Now));
    }
The original example would require considerably more code to do the same.
This is possible because I've chosen to use the most appropriate frameworks for the different parts of the solution. Castle Active Record is used for data access. By making this choice, all my data access needs are satisfied and I can omit writing my own data access layer. I've also chosen to take a simple approach to mapping between domain and data transfer objects. The solution doesn't need the number of DTOs it has, but I generally think it is a good idea to avoid applying XML serialization attributes to the domain model. Still, the DTOs in my version are much lighter, and the mapping code has strong cohesion with the DTOs, making it much easier to keep them in sync.
Key non-functional qualities of any architecture are that it is maintainable and extensible. Let's imagine that the Coho Wine club wanted to log member registrations and present a KPI dashboard for this. Using the code produced by the Web Service Software Factory, this could be done by adding the logging code to the RegistrationManager class in the business logic assembly, or we could add it to the ClubMembershipService implementation class. Both of these classes contain business code, but is KPI logging a core business capability of the solution? And what if other similar requirements surfaced, would these be implemented in the same places? If so, the business logic would suddenly be responsible for doing many things that aren't part of the core domain. This would affect the maintainability of the solution. A better solution is to externalize this from the service, either by using a decorator on the service or by using Aspect Oriented Programming.
    public class KpiAspect : IMethodInterceptor
    {
        private IKpiService kpiService;

        public object Invoke(IMethodInvocation invocation)
        {
            ClubMember clubMember = invocation.Arguments[0]
                as ClubMember;
            if (clubMember != null)
            {
                kpiService.NotifyRegistration(clubMember);
            }
            return invocation.Proceed();
        }
    }
Above you can see an example of a typical logging aspect. The lightweight service uses Castle Windsor to resolve dependencies. In combination with Aspect#, this aspect can be applied by changing the application's configuration.
Using the right frameworks for the specific tasks is the best way to ensure productive developers and maintainable, extensible solutions. The main problem with the Web Service Software Factory is that it aims to solve too many problems at once.
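The decorator alternative mentioned earlier can be sketched in a few lines as well. The IClubMembershipService interface, its Register method and the two in-memory classes are assumptions made for this example; only IKpiService.NotifyRegistration mirrors the call used in the aspect.

```csharp
using System;
using System.Collections.Generic;

public class ClubMember
{
    public string Name { get; set; }
}

// Hypothetical service contract; the real interface may look different.
public interface IClubMembershipService
{
    void Register(ClubMember member);
}

public interface IKpiService
{
    void NotifyRegistration(ClubMember member);
}

// Decorator: same contract as the service it wraps, plus KPI notification.
public class KpiMembershipServiceDecorator : IClubMembershipService
{
    private readonly IClubMembershipService inner;
    private readonly IKpiService kpiService;

    public KpiMembershipServiceDecorator(IClubMembershipService inner, IKpiService kpiService)
    {
        if (inner == null) throw new ArgumentNullException("inner");
        if (kpiService == null) throw new ArgumentNullException("kpiService");
        this.inner = inner;
        this.kpiService = kpiService;
    }

    public void Register(ClubMember member)
    {
        // Business logic runs first and stays unaware of KPI logging.
        inner.Register(member);
        // Only reached when registration did not throw.
        kpiService.NotifyRegistration(member);
    }
}

// In-memory stand-ins so the sketch can be exercised without a database.
public class InMemoryMembershipService : IClubMembershipService
{
    public readonly List<ClubMember> Members = new List<ClubMember>();
    public void Register(ClubMember member) { Members.Add(member); }
}

public class RecordingKpiService : IKpiService
{
    public readonly List<string> Notified = new List<string>();
    public void NotifyRegistration(ClubMember member) { Notified.Add(member.Name); }
}
```

Unlike the aspect, which notifies before proceeding, this sketch notifies only after the wrapped call returns, so failed registrations are not counted. A container can hand out the decorator wherever the service interface is requested, which keeps the business class untouched.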
"Web service applications are more than just the technologies that send and receive SOAP messages and expose WSDL contracts—they also include the functionality that is needed to fulfill the service's behavior."
The above statement is true, but it does not mean that the Web Service Software Factory needs to address all of these challenges. For instance, it is not a given that a web service has its own database. Neither is it a given that a service has its own domain model; in fact, most services are built on top of existing solutions with an existing domain model.
The Web Service Software Factory seeks to be a one-stop shop for designing ASMX and WCF messages and interfaces, applying exception shielding and exception handling, translating messages to and from business entities, designing, building and invoking the data access layer, validating conformance of the service implementation and applying security to WCF. For most of these challenges there are better solutions than the ones prescribed by the Web Service Software Factory. AOP is a superior solution for exception shielding and validation, and real ORMs are superior to the data access support in the factory.
Instead of attempting to boil the ocean, the focus should be on the challenges with web services. I could use a factory that helps me design messages and service interfaces, and I'd be really happy if I had a tool that would help me apply and configure security to service endpoints, but I don't want an entire architecture stack along with that.
Microsoft recently announced the Policy Injection Application Block. This block is more or less an AOP extension for the Object Builder and it can become a valuable tool for keeping cross-cutting concerns like exception shielding out of the business layer. Still it will require a substantial investment to reengineer an application built with the Web Service Software Factory to use the Policy Injection Application Block because the design does not use the Object Builder to resolve dependencies.
"Increased flexibility. The Service Factory is carefully designed according to proven practices, but it is also open and highly customizable. Architects and development leads can customize the factory to include the conventions, policies, and practices specific to a team or organization."
The official answer to my frustrations with software factories is that I can customize them to meet my needs. The economic model for a software factory is built on the assumption that you build a series of similar solutions and hence can get a good return on investment for the time required to build or customize a factory. Whenever I find myself doing the same thing over and over again, I look for opportunities to introduce a framework to do the repetitive work. This helps me keep the codebase manageable and increases the maintainability of a solution. It also makes the development process repeatable, because the framework is a new asset to our development process. Identifying and reusing assets is a core discipline within the software factories methodology. The problem with using code generation to increase productivity is that the opportunities to create an asset are easily overlooked.
Most of the issues that the Web Service Software Factory addresses can be solved with frameworks rather than code generation. As mentioned earlier, there are numerous AOP frameworks that can be used for cross-cutting concerns. There are many mature data access frameworks and ORMs available that provide good abstractions over the low-level ADO.NET APIs. Spring.NET has the ability to export a class as a web service, without the need to define an ASMX. If we had an XML marshalling framework similar to Castor XML for the .NET platform, we could easily do away with most of the DTOs needed to transform domain objects to service contracts.
Most successful tools are built to do one thing and do that thing well. The Web Service Software Factory tries to address too many concerns at once. Other factories, such as the Web Client Software Factory, appear to be much more focused on the problem at hand. Still, I believe that there is too much code generation going on, and this is a design smell. Popular web application frameworks like Ruby on Rails or Spring Web Flow have the same capabilities as the Web Client Software Factory, and they require less code to be written.
Maintainability is the enabler of all the other "-ilities" of an architecture. Experience shows that typical EJB applications were often hard to maintain even if code generation made them easy to develop. The IDE-driven code generation for EJB paved the way for vendor lock-in. The software factories of today will be part of Visual Studio "Orcas", but I would not count on this to lock users into the IDE and the Microsoft Patterns & Practices way. Even with million-dollar investments in high-end application servers, the enterprise Java community has largely moved from EJBs to light-weight frameworks. This has proven to be an economically healthy choice because of improved productivity and flexibility.
(End note: If you wonder what the first EJB lesson is; read: http://hammett.castleproject.org/?p=117)