Blazing Fast Reflection
[Update: To those of you who've followed a link from Javablogs: this post is about .NET. There is a Java version of it here.]
On Friday I wrote a short post on different alternatives for implementing the ICloneable interface. From my point of view the most important consideration is usually maintainability, but in some cases you really need the best performance you can get. For maintainability there is nothing like using serialization to clone an object. The downside of serialization is that it is rather slow. If performance is what you're after, nothing beats hard-coded cloning. Still, there is a middle-of-the-road option which has acceptable performance and does not require developers to maintain cloning logic by hand: optimized reflection.
You have a couple of options for doing optimized reflection on .NET. In general you will use the CodeDOM to generate the optimizers on .NET 1.x, while you'll use Lightweight Code Generation (LCG) on .NET 2.0. LCG bridges the gap between purely dynamic invocations and statically bound calls by providing the means to generate dynamic methods at runtime. I haven't checked this, but I suspect that the Iron* languages make heavy use of LCG beneath the covers. One important difference between .NET 1.x AssemblyBuilders and LCG is that LCG's dynamic methods are reclaimable by the garbage collector, while dynamic assemblies stay in memory. Another is that LCG supports delegate / callback style invocation out of the box.
For this example I'll be using NHibernate's bytecode enhancement support to demonstrate how you can speed up common reflection operations using reflection optimizers. NHibernate supports both .NET 1.x-style CodeDOM-based optimizations and .NET 2.0 LCG with delegate invocation. While my performance comparisons used the LCG to dynamically implement the entire deep-clone operation, I'll stick to a shallow clone algorithm here to make it easier to follow.
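To make the idea of a dynamic method concrete, here is a minimal sketch of LCG on its own, without NHibernate. The names LcgDemo, BuildAdder and Adder are made up for this illustration; only DynamicMethod, ILGenerator and OpCodes come from the framework.

```csharp
using System;
using System.Reflection.Emit;

// Hypothetical delegate type matching the method we are about to generate.
public delegate int Adder(int a, int b);

public static class LcgDemo
{
    // Emits the IL for "return a + b;" at runtime and wraps it in a delegate.
    public static Adder BuildAdder()
    {
        DynamicMethod method = new DynamicMethod(
            "Add",                                    // name, used for diagnostics only
            typeof(int),                              // return type
            new Type[] { typeof(int), typeof(int) }); // parameter types

        ILGenerator il = method.GetILGenerator();
        il.Emit(OpCodes.Ldarg_0); // push the first argument
        il.Emit(OpCodes.Ldarg_1); // push the second argument
        il.Emit(OpCodes.Add);     // pop both, push their sum
        il.Emit(OpCodes.Ret);     // return the value on top of the stack

        // The delegate is the "out of the box" invocation style mentioned above.
        return (Adder) method.CreateDelegate(typeof(Adder));
    }
}
```

Calling LcgDemo.BuildAdder() once and keeping the returned delegate around is the intended usage; generating the method is the expensive part, invoking it is cheap.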
To clone any object a client can do the following:
Customer original = customers["ALFKI"];
Customer clone = Cloner.Clone(original);
The Cloner.Clone operation is equally simple:
public static T Clone<T>(T obj)
{
    IReflectionOptimizer optimizer =
        GetReflectionOptimizer(typeof(T));
    T clone =
        (T) optimizer.InstantiationOptimizer.CreateInstance();
    optimizer.AccessOptimizer.SetPropertyValues(
        clone, optimizer.AccessOptimizer.GetPropertyValues(obj)
    );
    return clone;
}
An NHibernate.Bytecode.IReflectionOptimizer implementation is retrieved from a local cache of optimizers (each type gets its own optimizer). Then the instantiation optimizer is used to create a new instance. This optimizer calls the default constructor on the type, which is faster than Activator.CreateInstance. Once we have an instance of T, we copy all of the field values via the access optimizer's GetPropertyValues and SetPropertyValues methods.
Don't be misled by the method names here: NHibernate defaults to invoking property setters and getters, hence the naming. Let's make things clearer by looking at how we create the reflection optimizer instance.
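As a rough idea of what an instantiation optimizer can look like, the sketch below emits a dynamic method that calls a type's parameterless constructor directly. This is not NHibernate's code; InstantiationSketch, BuildFactory and CreateInstanceInvoker are hypothetical names, and the sketch assumes a reference type with a parameterless constructor.

```csharp
using System;
using System.Reflection;
using System.Reflection.Emit;

// Hypothetical delegate type for the generated factory method.
public delegate object CreateInstanceInvoker();

public static class InstantiationSketch
{
    public static CreateInstanceInvoker BuildFactory(Type type)
    {
        // Look up the parameterless constructor, public or not.
        ConstructorInfo ctor = type.GetConstructor(
            BindingFlags.Instance | BindingFlags.Public | BindingFlags.NonPublic,
            null, Type.EmptyTypes, null);
        if (ctor == null)
            throw new ArgumentException("Type has no parameterless constructor.", "type");

        // Owning the method by 'type' and skipping visibility checks lets the
        // generated code call a non-public constructor.
        DynamicMethod method = new DynamicMethod(
            "Create" + type.Name, typeof(object), Type.EmptyTypes, type, true);

        ILGenerator il = method.GetILGenerator();
        il.Emit(OpCodes.Newobj, ctor); // allocate and run the constructor
        il.Emit(OpCodes.Ret);          // return the new instance

        return (CreateInstanceInvoker) method.CreateDelegate(typeof(CreateInstanceInvoker));
    }
}
```

The reflection lookup happens once per type; after that, each call through the delegate is a plain constructor call with no reflection involved.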
private static Dictionary<Type, IReflectionOptimizer> reflectionOptimizers =
    new Dictionary<Type, IReflectionOptimizer>();

private static IReflectionOptimizer GetReflectionOptimizer(Type type)
{
    if (!reflectionOptimizers.ContainsKey(type))
    {
        FieldInfo[] fields = FlattenInheritanceHierarchy(type);
        IGetter[] getters =
            new IGetter[fields.Length];
        ISetter[] setters =
            new ISetter[fields.Length];
        for (int i = 0; i < fields.Length; i++)
        {
            getters[i] = new FieldGetter(
                fields[i],
                type,
                fields[i].Name);
            setters[i] = new FieldSetter(
                fields[i],
                type,
                fields[i].Name);
        }
        BytecodeProviderImpl bytecodeProvider = new BytecodeProviderImpl();
        reflectionOptimizers.Add(
            type,
            bytecodeProvider.GetReflectionOptimizer(
                type,
                getters,
                setters)
        );
    }
    return reflectionOptimizers[type];
}
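The FlattenInheritanceHierarchy helper called above is not shown in this post. One plausible way to write it, assuming it simply collects every instance field declared on the type and its base types, is sketched here (the FieldCollector class name is made up):

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;

public static class FieldCollector
{
    // Assumption: "flattening" means gathering the instance fields declared at
    // every level of the hierarchy, since private fields of a base class are
    // not returned when you ask the derived type alone.
    public static FieldInfo[] FlattenInheritanceHierarchy(Type type)
    {
        List<FieldInfo> fields = new List<FieldInfo>();
        for (Type current = type;
             current != null && current != typeof(object);
             current = current.BaseType)
        {
            fields.AddRange(current.GetFields(
                BindingFlags.Instance | BindingFlags.Public |
                BindingFlags.NonPublic | BindingFlags.DeclaredOnly));
        }
        return fields.ToArray();
    }
}
```

DeclaredOnly keeps each level from reporting inherited fields a second time, so every field ends up in the array exactly once.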
NHibernate comes with a number of different getter and setter implementations, and here we use the FieldGetter and FieldSetter types. These types do what it says on the box; they either get or set the value of a field on a type. They also implement the IOptimizableGetter and IOptimizableSetter interfaces, which generate the IL code required to access the field values directly without reflecting over the type. Below you can see the IL-generation method for the FieldSetter class:
public void Emit(ILGenerator il)
{
    il.Emit(OpCodes.Stfld, field);
}
This method works in concert with the ReflectionOptimizer, which prepares the value for assignment and places it on top of the stack. Below you can see the code required to create the lightweight method used to set the field values. Parts of the original implementation have been elided for brevity; please refer to the NHibernate source code for the full version.
private SetPropertyValuesInvoker GenerateSetPropertyValuesMethod(
    IGetter[] getters, ISetter[] setters)
{
    System.Type[] methodArguments =
        new System.Type[]
        {
            typeof(object),
            typeof(object[]),
            typeof(SetterCallback)
        };
    DynamicMethod method =
        CreateDynamicMethod(null, methodArguments);
    ILGenerator il = method.GetILGenerator();

    // Declare a local variable used to store the object reference (typed)
    LocalBuilder thisLocal = il.DeclareLocal(typeOfThis);
    il.Emit(OpCodes.Ldarg_0);
    EmitCastToReference(il, mappedType);
    il.Emit(OpCodes.Stloc, thisLocal.LocalIndex);

    for (int i = 0; i < setters.Length; i++)
    {
        // get the member accessor
        ISetter setter = setters[i];
        System.Type valueType = getters[i].ReturnType;
        IOptimizableSetter optimizableSetter = setter as IOptimizableSetter;

        // load 'this'
        il.Emit(OpCodes.Ldloc, thisLocal);

        // load the value from the data array
        il.Emit(OpCodes.Ldarg_1);
        il.Emit(OpCodes.Ldc_I4, i);
        il.Emit(OpCodes.Ldelem_Ref);

        // If this is a value type, we need to unbox it
        if (valueType.IsValueType)
        {
            // if (object[i] == null), create a new instance
            Label notNullLabel = il.DefineLabel();
            Label nullDoneLabel = il.DefineLabel();
            LocalBuilder localNew = il.DeclareLocal(valueType);
            il.Emit(OpCodes.Dup);
            il.Emit(OpCodes.Brtrue_S, notNullLabel);
            il.Emit(OpCodes.Pop);
            il.Emit(OpCodes.Ldloca, localNew);
            il.Emit(OpCodes.Initobj, valueType);
            il.Emit(OpCodes.Ldloc, localNew);
            il.Emit(OpCodes.Br_S, nullDoneLabel);
            il.MarkLabel(notNullLabel);
            il.Emit(OpCodes.Unbox, valueType);

            // Load the value indirectly, using ldobj or a specific opcode
            object specificOpCode = typeToOpcode[valueType];
            if (specificOpCode != null)
            {
                il.Emit((OpCode) specificOpCode);
            }
            else
            {
                il.Emit(OpCodes.Ldobj, valueType);
            }
            il.MarkLabel(nullDoneLabel);
        }
        else
        {
            if (valueType != typeof(object))
            {
                il.Emit(OpCodes.Castclass, valueType);
            }
        }

        // using the setter's emitted IL
        optimizableSetter.Emit(il);
    }

    // Setup the return
    il.Emit(OpCodes.Ret);
    return (SetPropertyValuesInvoker)
        method.CreateDelegate(typeof(SetPropertyValuesInvoker));
}
Unless you're fluent in IL, this code might be a little cryptic, but the whole thing is really rather simple. First of all, a dynamic method is defined. Then, for each field, the value to be assigned is retrieved from the array and prepared for assignment (unboxed if it is a value type, otherwise cast to the appropriate type). With the target object and the prepared value on the stack, the setter is asked to emit the IL that stores the value. Finally, a delegate that can be used to invoke the dynamic method is created and returned to the caller.
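If the NHibernate version is hard to follow, a stripped-down sketch of the same idea may help. It is not the NHibernate implementation: it drops the callback argument and the null handling, and uses the single Unbox_Any opcode in place of the unbox / load sequence. FieldSetterSketch, SetValuesInvoker and SamplePoint are hypothetical names used only for this illustration.

```csharp
using System;
using System.Reflection;
using System.Reflection.Emit;

// Hypothetical delegate and sample type, not part of NHibernate.
public delegate void SetValuesInvoker(object target, object[] values);

public class SamplePoint
{
    public int X;
    public string Label;
}

public static class FieldSetterSketch
{
    // Generates: ((T) target).field0 = (F0) values[0]; ((T) target).field1 = ...
    // Assumes 'type' is a reference type and values[i] matches fields[i].
    public static SetValuesInvoker Build(Type type, FieldInfo[] fields)
    {
        DynamicMethod method = new DynamicMethod(
            "SetValues", typeof(void),
            new Type[] { typeof(object), typeof(object[]) },
            type, true);
        ILGenerator il = method.GetILGenerator();

        for (int i = 0; i < fields.Length; i++)
        {
            il.Emit(OpCodes.Ldarg_0);                        // load target
            il.Emit(OpCodes.Castclass, type);                // cast it to its real type
            il.Emit(OpCodes.Ldarg_1);                        // load the values array
            il.Emit(OpCodes.Ldc_I4, i);                      // load the index
            il.Emit(OpCodes.Ldelem_Ref);                     // values[i]
            il.Emit(OpCodes.Unbox_Any, fields[i].FieldType); // unbox or cast
            il.Emit(OpCodes.Stfld, fields[i]);               // store into the field
        }
        il.Emit(OpCodes.Ret);

        return (SetValuesInvoker) method.CreateDelegate(typeof(SetValuesInvoker));
    }
}
```

Building the delegate for SamplePoint with its X and Label fields and invoking it with new object[] { 42, "origin" } assigns both fields in one call. Unlike the NHibernate code, a null entry for a non-nullable value-type field would throw here rather than be replaced by a default value.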
This delegate will be used by the AccessOptimizer to set the property values (see the Cloner.Clone method at the start of this article).
The AccessOptimizer uses a combination of a delegate and a callback for the best possible performance.
public void SetPropertyValues(object target, object[] values)
{
    // The setDelegate is the delegate for the light-weight
    // method.
    setDelegate(target, values, setterCallback);
}

private void OnSetterCallback(object target, int i, object value)
{
    setters[i].Set(target, value);
}
That's it. A lot of IL voodoo magic, but that's what makes it fast. So just how much faster is it than naïve schoolbook reflection?
Assigning the values of the Customer class shown to the left to the same instance one million times takes about a minute on my laptop using pure reflection. Using bytecode-enhanced reflection it takes a mere ~175 milliseconds to do the same thing. So maybe reflection isn't that slow after all?
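If you want to try a comparison like this yourself, a small harness along the following lines is one way to do it. This is not the benchmark behind the numbers above: it times a single int field on a made-up BenchTarget class, and ReflectionBenchmark, BuildSetter and IntSetter are hypothetical names. Expect the absolute timings to differ by machine and runtime.

```csharp
using System;
using System.Diagnostics;
using System.Reflection;
using System.Reflection.Emit;

// Hypothetical target type and delegate used only for this measurement.
public class BenchTarget { public int Value; }

public delegate void IntSetter(BenchTarget target, int value);

public static class ReflectionBenchmark
{
    // Emits "target.Value = value;" as a dynamic method.
    public static IntSetter BuildSetter(FieldInfo field)
    {
        DynamicMethod method = new DynamicMethod(
            "SetValue", typeof(void),
            new Type[] { typeof(BenchTarget), typeof(int) },
            typeof(BenchTarget), true);
        ILGenerator il = method.GetILGenerator();
        il.Emit(OpCodes.Ldarg_0);
        il.Emit(OpCodes.Ldarg_1);
        il.Emit(OpCodes.Stfld, field);
        il.Emit(OpCodes.Ret);
        return (IntSetter) method.CreateDelegate(typeof(IntSetter));
    }

    // Times 'iterations' assignments through FieldInfo.SetValue and through
    // the emitted delegate, reporting elapsed milliseconds for each.
    public static void Run(int iterations, out long reflectionMs, out long emittedMs)
    {
        FieldInfo field = typeof(BenchTarget).GetField("Value");
        BenchTarget target = new BenchTarget();
        IntSetter setter = BuildSetter(field);

        // Warm up both paths so JIT compilation is not part of the timing.
        field.SetValue(target, 0);
        setter(target, 0);

        Stopwatch watch = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
            field.SetValue(target, i);
        watch.Stop();
        reflectionMs = watch.ElapsedMilliseconds;

        watch = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
            setter(target, i);
        watch.Stop();
        emittedMs = watch.ElapsedMilliseconds;
    }
}
```

Calling ReflectionBenchmark.Run(1000000, out reflectionMs, out emittedMs) and printing the two values gives a rough side-by-side figure. Note that this typed setter avoids the boxing and object[] handling the NHibernate optimizer has to do, so it measures the best case for emitted code.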