Blazing Fast Reflection

Published 13 June 07 06:20 AM | andersnoras 

[Update: To those of you who've followed a link from Javablogs; this post is on .NET. There is a Java version of it here.]

On Friday I wrote a short post on different alternatives for implementing the ICloneable interface. From my point of view the most important consideration is usually maintainability, but in some cases you really need the best performance you can get. For maintainability there is nothing like using serialization to clone an object. The downside of serialization is that it is rather slow. If performance is what you're after, nothing beats hard-coded cloning. Still, there is a middle-of-the-road option which has acceptable performance and does not require developers to maintain cloning logic by hand: optimized reflection.

You have a couple of options for doing optimized reflection on .NET. In general you will use the CodeDOM to generate the optimizers on .NET 1.x, while you'll use Lightweight Code Generation (LCG) on .NET 2.0. LCG bridges the gap between purely dynamic invocations and statically bound calls by providing the means to generate dynamic methods at runtime. I haven't checked this, but I suspect that the Iron* languages make heavy use of LCG beneath the covers. One important difference between .NET 1.x AssemblyBuilders and LCG is that LCG's dynamic methods are reclaimable by the garbage collector, while dynamic assemblies stay in memory. Another is that LCG supports delegate / callback style invocation out of the box.
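
To make the LCG idea concrete, here is a minimal sketch of a dynamic method that is generated at runtime and invoked through a delegate. It is not taken from NHibernate; the Customer class, the Creator delegate and the BuildCreator helper are names made up for this illustration, and the type is assumed to have a public parameterless constructor.

```csharp
using System;
using System.Reflection;
using System.Reflection.Emit;

// Hypothetical sample type, used only for this illustration.
public class Customer
{
    public string Name = "unnamed";
}

public static class LcgSketch
{
    // Hypothetical delegate type; any delegate with a matching signature works.
    public delegate object Creator();

    public static Creator BuildCreator(Type type)
    {
        ConstructorInfo ctor = type.GetConstructor(Type.EmptyTypes);
        if (ctor == null)
        {
            throw new ArgumentException("Type needs a public parameterless constructor.", "type");
        }

        // The owner type ties the dynamic method to 'type'; skipVisibility is true
        // so the emitted IL may also touch non-public members of that type.
        DynamicMethod method = new DynamicMethod(
            "Create" + type.Name, typeof(object), Type.EmptyTypes, type, true);

        ILGenerator il = method.GetILGenerator();
        il.Emit(OpCodes.Newobj, ctor); // push a new instance onto the stack
        il.Emit(OpCodes.Ret);          // and return it

        // The delegate is the statically bound entry point to the generated code.
        return (Creator) method.CreateDelegate(typeof(Creator));
    }

    public static void Main()
    {
        Creator create = BuildCreator(typeof(Customer));
        Customer customer = (Customer) create();
        Console.WriteLine(customer.Name); // prints "unnamed"
    }
}
```

The IL is generated once per type; after that, every call goes through the delegate, so the delegate should be cached and reused.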

For this example I'll be using NHibernate's bytecode enhancement support to demonstrate how you can speed up common reflection operations using reflection optimizers. NHibernate supports both .NET 1.x-style CodeDOM-based optimizations and .NET 2.0 LCG with delegate invocation. While my performance comparisons used the LCG to dynamically implement the entire deep-clone operation, I'll stick to a shallow clone algorithm here to make it easier to follow.

To clone any object a client can do the following:

Customer original=customers["ALFKI"];
Customer clone=Cloner.Clone(original);

The Cloner.Clone operation is equally simple:

public static T Clone<T>(T obj)
{
    IReflectionOptimizer optimizer = 
      GetReflectionOptimizer(typeof(T));
    T clone = 
      (T) optimizer.InstantiationOptimizer.CreateInstance();
    optimizer.AccessOptimizer.SetPropertyValues(
      clone, optimizer.AccessOptimizer.GetPropertyValues(obj)
    );
    return clone;
}

An NHibernate.Bytecode.IReflectionOptimizer implementation is retrieved from a local cache of optimizers (each type gets its own optimizer). Then the instantiation optimizer is used to create a new instance. This optimizer calls the default constructor on the type, which is faster than Activator.CreateInstance. Once we have an instance of T, we copy all of the field values via the access optimizer's GetPropertyValues and SetPropertyValues methods.
Don't be misled by the method names here: NHibernate defaults to invoking property getters and setters, hence the naming. Let's make things clearer by looking at how we create the reflection optimizer instance.

private static Dictionary<Type,IReflectionOptimizer> reflectionOptimizers=new Dictionary<Type, IReflectionOptimizer>();
private static IReflectionOptimizer GetReflectionOptimizer(Type type)
{
    if (!reflectionOptimizers.ContainsKey(type))
    {
        FieldInfo[] fields = FlattenInheritanceHierarchy(type);
        IGetter[] getters = 
          new IGetter[fields.Length];
        ISetter[] setters = 
          new ISetter[fields.Length];
        for (int i = 0; i < fields.Length; i++)
        {
            getters[i] = new FieldGetter(
               fields[i], 
               type, 
               fields[i].Name);
            setters[i] = new FieldSetter(
               fields[i], 
               type, 
               fields[i].Name);
        }
        BytecodeProviderImpl bytecodeProvider = new BytecodeProviderImpl();
        reflectionOptimizers.Add(
            type, 
            bytecodeProvider.GetReflectionOptimizer(
               type, 
               getters, 
               setters)
        );
    }
    return reflectionOptimizers[type];
}
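
The FlattenInheritanceHierarchy helper called above is not shown in this post. A plausible sketch, assuming its job is to collect every instance field declared on the type and on its base types, could look like the following. The FieldCollector, Person and Employee names are invented for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;

public static class FieldCollector
{
    // Assumed behaviour: return all instance fields (public and non-public)
    // declared anywhere in the inheritance chain, excluding System.Object.
    public static FieldInfo[] FlattenInheritanceHierarchy(Type type)
    {
        const BindingFlags flags = BindingFlags.Instance
                                 | BindingFlags.Public
                                 | BindingFlags.NonPublic
                                 | BindingFlags.DeclaredOnly;

        List<FieldInfo> fields = new List<FieldInfo>();

        // Reflection on a derived type does not return private fields of its
        // base types, so each level is queried separately with DeclaredOnly.
        for (Type current = type;
             current != null && current != typeof(object);
             current = current.BaseType)
        {
            fields.AddRange(current.GetFields(flags));
        }
        return fields.ToArray();
    }
}

// Hypothetical sample types.
public class Person
{
    private string name = "n";
    public string GetName() { return name; }
}

public class Employee : Person
{
    public int Salary;
}

public static class FieldCollectorDemo
{
    public static void Main()
    {
        FieldInfo[] fields = FieldCollector.FlattenInheritanceHierarchy(typeof(Employee));
        Console.WriteLine(fields.Length); // prints 2 (Salary and the private name field)
    }
}
```

Note that auto-implemented properties would contribute their compiler-generated backing fields to this list, which is what a field-by-field clone needs.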

NHibernate comes with a number of different getter and setter implementations, and here we use the FieldGetter and FieldSetter types. These types do what the box says; they either get or set the value of a field on a type. They also implement the IOptimizableGetter and IOptimizableSetter interfaces, which generate the IL code required to access the field values directly without reflecting over the type. Below you can see the IL-generation method for the FieldSetter class:

public void Emit(ILGenerator il)
{
	il.Emit(OpCodes.Stfld, field);
}

This method works in concert with the ReflectionOptimizer, which prepares the target object and the value for assignment and places them on top of the stack. Below you can see the code required to create the lightweight method used to set the field values. Parts of the original implementation have been elided for brevity; please refer to the NHibernate source code for the full version.

private SetPropertyValuesInvoker GenerateSetPropertyValuesMethod(
    IGetter[] getters, ISetter[] setters)
{
	System.Type[] methodArguments = 
        new System.Type[]
            {
                typeof(object), 
                typeof(object[]), 
                typeof(SetterCallback)
            };
	DynamicMethod method = 
            CreateDynamicMethod(null, methodArguments);

	ILGenerator il = method.GetILGenerator();

	// Declare a local variable used to store the object reference (typed)
	LocalBuilder thisLocal = il.DeclareLocal(typeOfThis);
	il.Emit(OpCodes.Ldarg_0);
	EmitCastToReference(il, mappedType);
	il.Emit(OpCodes.Stloc, thisLocal.LocalIndex);

	for (int i = 0; i < setters.Length; i++)
	{
		// get the member accessor
		ISetter setter = setters[i];
		System.Type valueType = getters[i].ReturnType;

		IOptimizableSetter optimizableSetter = setter as IOptimizableSetter;
		// load 'this'
		il.Emit(OpCodes.Ldloc, thisLocal);

		// load the value from the data array
		il.Emit(OpCodes.Ldarg_1);
		il.Emit(OpCodes.Ldc_I4, i);
		il.Emit(OpCodes.Ldelem_Ref);

		// If this is a value type, we need to unbox it
		if (valueType.IsValueType)
		{
			// if (object[i] == null), create a new instance 
			Label notNullLabel = il.DefineLabel();
			Label nullDoneLabel = il.DefineLabel();
			LocalBuilder localNew = il.DeclareLocal(valueType);

			il.Emit(OpCodes.Dup);
			il.Emit(OpCodes.Brtrue_S, notNullLabel);

			il.Emit(OpCodes.Pop);
			il.Emit(OpCodes.Ldloca, localNew);
			il.Emit(OpCodes.Initobj, valueType);
			il.Emit(OpCodes.Ldloc, localNew);
			il.Emit(OpCodes.Br_S, nullDoneLabel);

			il.MarkLabel(notNullLabel);

			il.Emit(OpCodes.Unbox, valueType);

			// Load the value indirectly, using ldobj or a specific opcode
			object specificOpCode = typeToOpcode[valueType];
			if (specificOpCode != null)
			{
				il.Emit((OpCode) specificOpCode);
			}
			else
			{
				il.Emit(OpCodes.Ldobj, valueType);
			}

			il.MarkLabel(nullDoneLabel);
		}
		else
		{
			if (valueType != typeof(object))
			{
				il.Emit(OpCodes.Castclass, valueType);
			}
		}

		// using the setter's emitted IL
		optimizableSetter.Emit(il);
	}

	// Setup the return
	il.Emit(OpCodes.Ret);

	return (SetPropertyValuesInvoker)
		method.CreateDelegate(typeof(SetPropertyValuesInvoker));
}

Unless you're fluent in IL, this code might be a little cryptic, but the whole thing is really rather simple. First, a dynamic method is defined. Then the values to be assigned to the individual fields are retrieved from an array. Each of these values is prepared for assignment (unboxed, cast to the appropriate type, etc.) and left on the stack, and the setter is asked to emit the IL that stores it in the field. Finally, a delegate that can be used to invoke the dynamic method is created and returned to the caller.
This delegate is used by the AccessOptimizer to set the property values (see the Cloner.Clone method at the start of this article).
The AccessOptimizer uses a combination of a delegate and a callback for the best possible performance.

public void SetPropertyValues(object target, object[] values)
{
	// The setDelegate is the delegate for the light-weight
	// method.
	setDelegate(target, values, setterCallback);
}
private void OnSetterCallback(object target, int i, object value)
{
	setters[i].Set(target, value);
}
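
For readers who want to experiment without pulling in NHibernate, the steps described above (load the typed target, load each value from the array, unbox or cast it, store it in the field) can be condensed into a self-contained sketch. This is a simplification and not NHibernate's code: it uses Unbox_Any instead of the unbox/ldobj pair, skips the null handling for value types and leaves out the callback. The Order class, the ValuesSetter delegate and the Build helper are hypothetical names.

```csharp
using System;
using System.Reflection;
using System.Reflection.Emit;

// Hypothetical sample type with one value-type and one reference-type field.
public class Order
{
    public int Quantity;
    public string Product;
}

public static class SetterSketch
{
    public delegate void ValuesSetter(object target, object[] values);

    // Assumes 'type' is a class and 'values' holds one non-null entry per
    // value-type field (a null there would fail on unbox in this sketch).
    public static ValuesSetter Build(Type type, FieldInfo[] fields)
    {
        DynamicMethod method = new DynamicMethod(
            "SetValues", typeof(void),
            new Type[] { typeof(object), typeof(object[]) }, type, true);
        ILGenerator il = method.GetILGenerator();

        for (int i = 0; i < fields.Length; i++)
        {
            Type fieldType = fields[i].FieldType;

            il.Emit(OpCodes.Ldarg_0);         // load the target
            il.Emit(OpCodes.Castclass, type); // and cast it to its real type

            il.Emit(OpCodes.Ldarg_1);         // load values[i]
            il.Emit(OpCodes.Ldc_I4, i);
            il.Emit(OpCodes.Ldelem_Ref);

            if (fieldType.IsValueType)
            {
                il.Emit(OpCodes.Unbox_Any, fieldType); // object -> raw value
            }
            else
            {
                il.Emit(OpCodes.Castclass, fieldType); // object -> field type
            }

            il.Emit(OpCodes.Stfld, fields[i]); // target.field = value
        }

        il.Emit(OpCodes.Ret);
        return (ValuesSetter) method.CreateDelegate(typeof(ValuesSetter));
    }

    public static void Main()
    {
        FieldInfo[] fields = new FieldInfo[]
        {
            typeof(Order).GetField("Quantity"),
            typeof(Order).GetField("Product")
        };
        ValuesSetter setValues = Build(typeof(Order), fields);

        Order order = new Order();
        setValues(order, new object[] { 3, "Chai" });
        Console.WriteLine(order.Quantity + " " + order.Product); // prints "3 Chai"
    }
}
```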

That's it. A lot of IL voodoo magic, but that's what makes it fast. So just how much faster is it than naïve schoolbook reflection?

[Image: the Customer class]

Assigning the values of the Customer class shown in the image to the same instance one million times takes about a minute on my laptop using pure reflection. Using bytecode-enhanced reflection it takes a mere ~175 milliseconds to do the same thing. So maybe reflection isn't that slow after all?
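
If you want to get a feel for the difference on your own machine, a rough timing harness along these lines may help. It is not the benchmark behind the numbers above: it compares plain FieldInfo.SetValue with a single LCG-generated setter on a made-up Sample class, and names such as SetterDelegate and Time are assumptions of this sketch. Absolute timings will vary with hardware and runtime version.

```csharp
using System;
using System.Diagnostics;
using System.Reflection;
using System.Reflection.Emit;

// Hypothetical sample type.
public class Sample
{
    public string Name;
}

public static class ReflectionTiming
{
    public delegate void SetterDelegate(object target, object value);

    // Builds a setter for a reference-type field on a class.
    public static SetterDelegate BuildSetter(FieldInfo field)
    {
        DynamicMethod method = new DynamicMethod(
            "Set" + field.Name, typeof(void),
            new Type[] { typeof(object), typeof(object) },
            field.DeclaringType, true);
        ILGenerator il = method.GetILGenerator();
        il.Emit(OpCodes.Ldarg_0);
        il.Emit(OpCodes.Castclass, field.DeclaringType);
        il.Emit(OpCodes.Ldarg_1);
        il.Emit(OpCodes.Castclass, field.FieldType);
        il.Emit(OpCodes.Stfld, field);
        il.Emit(OpCodes.Ret);
        return (SetterDelegate) method.CreateDelegate(typeof(SetterDelegate));
    }

    // Runs the action once as a warm-up, then times the given number of calls.
    // Both candidates pay the same Action invocation overhead.
    public static long Time(Action action, int iterations)
    {
        action();
        Stopwatch watch = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            action();
        }
        watch.Stop();
        return watch.ElapsedMilliseconds;
    }

    public static void Main()
    {
        const int iterations = 1000000;
        FieldInfo field = typeof(Sample).GetField("Name");
        SetterDelegate setter = BuildSetter(field);
        Sample sample = new Sample();

        long reflectionMs = Time(delegate { field.SetValue(sample, "ALFKI"); }, iterations);
        long dynamicMs = Time(delegate { setter(sample, "ALFKI"); }, iterations);

        Console.WriteLine("FieldInfo.SetValue: " + reflectionMs + " ms");
        Console.WriteLine("Dynamic method:     " + dynamicMs + " ms");
    }
}
```

Run it in a release build without a debugger attached; otherwise the measurements say more about the debugger than about reflection.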

Comments

# Erik Ulven said on June 14, 2007 4:38 AM:

Useful post, thanks!

# andersnoras said on June 14, 2007 5:06 AM:

Thanks, Erik. I'm glad you liked it.

# Hani@formicary.net said on June 14, 2007 3:10 PM:

Please only aggregate your java category to javablogs, many of your posts are .net related which isn't very useful for java people. Thanks!

# andersnoras said on June 14, 2007 3:17 PM:

@Hani

Sorry about that. It was already fixed by the time you clicked the link. Unfortunately javablogs hadn't updated its main feed yet, so that's why it's still there. I've just written a small sample that does the same thing using Hibernate's reflection optimizer and I'll post a blog post on that to repair the damage done.

# Jonas Follesø said on June 14, 2007 3:41 PM:

Excellent post and sample. I'll link back to you tomorrow.

After pointing out the performance problem in the app I mentioned in my post they've started doing more profiling and performance tuning. I'm sure the team will find this post useful when re-visiting their existing cloning code.

# Anders Norås' Blog : Blazing Fast Reflection - For Java said on June 14, 2007 3:46 PM:

PingBack from http://andersnoras.com/blogs/anoras/archive/2007/06/14/blazing-fast-reflection-for-java.aspx

# Anders Norås' Blog said on June 16, 2007 2:04 PM:

With my posts on reflection optimization for .NET and Java , I unconsciously proved a point from an earlier

# Matt Warren said on June 19, 2007 1:40 PM:

Fast IL generated accessors are great, but boxing is a large part of the cost of reflection. You should il generate a method that makes a copy of the object directly, instead of a bunch of methods that read/write the individual fields. Now that would be fast!

# andersnoras said on June 20, 2007 4:32 AM:

@Matt;

You're absolutely right that boxing is expensive. I deliberately showed how to use NHibernate's reflection optimizers to keep the example simple and not to mention immediately available for others to use. In my ICloneable Revisited post (http://andersnoras.com/blogs/anoras/archive/2007/06/07/icloneable-revisited.aspx) I used LCG to emit the actual cloning code at runtime, however this code is a bit buggy so I won't publish it before I sort things out.

# Steinar Dragsnes said on September 29, 2007 12:51 PM:

Hi!

Great article! I think I'll stop calling Activator.CreateInstance from now on. I'll also discuss with the team if we should do something about our cloning logic, because currently it is hand coded... ;)

# Zoran said on February 19, 2008 12:09 AM:

Did you achieve any improvements on deep copy? (This code, and also NHibernate, performs shallow copy.) Also, it is not working on collections (and dictionaries).

# uTILLIty said on February 21, 2008 1:27 AM:

Great article! I would like to know if there is a complete engine out there which can be downloaded and used.
