September 19, 2009

Static field access performance: the answer

So many views, no answer... Guys, I don't believe this!

OK, the last case was the slowest one because the JIT compiler substitutes the __Canon type for any type parameter that is a reference type during generic type instantiation. This allows the code generated for one generic type instance to be reused by other generic instances. So, for example, List<int> and List<long> won't share the same generated code, but List<string> and List<Array> will, because both of them are implicitly transformed to List<__Canon>.
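Note that sharing code does not mean sharing state: every closed generic type still gets its own copy of each static field. A minimal sketch of that, where Holder<T> is a hypothetical type made up for this illustration:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical holder type, used only for this illustration.
static class Holder<T>
{
    public static int Value;
}

static class Program
{
    static void Main()
    {
        Holder<string>.Value = 1;
        Holder<Array>.Value = 2;   // does not overwrite Holder<string>.Value
        Holder<int>.Value = 3;

        Console.WriteLine(Holder<string>.Value); // 1
        Console.WriteLine(Holder<Array>.Value);  // 2
        Console.WriteLine(Holder<int>.Value);    // 3

        // The closed types stay distinct at run time, shared code or not.
        Console.WriteLine(typeof(List<string>) == typeof(List<Array>)); // False
    }
}
```

So the same machine code may have to reach a different static variable depending on which instantiation it is running for.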

But why does this affect the case described? Think about how the address of such a static variable is resolved in the generated code.
  • In the first two cases (no T, or when T is a value type) the JIT compiler emits code that is fully specific to this type. So it knows the exact address of the static variable.
  • In the last case (T is a reference type) the JIT compiler emits code that must work with any similar T substitution. So it can't put an exact static variable address there. Instead, it resolves the address via a dictionary. The code it emits does the same as this: __GetStaticFieldGetter(this.GetType()).Invoke(), where __GetStaticFieldGetter is an internal method that resolves this delegate via an internal dictionary, creating it if necessary. Of course, the actual code is much more efficient - e.g. it returns the static field's address instead of a delegate - but the idea behind it is the same.
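To make the idea concrete, here is a rough model of such a lookup in plain C#. Everything in it is an assumption for illustration: StaticSlots, GetSlot and Shared<T> are invented names, and the CLR's real structures are internal and much faster than a Dictionary<Type, int[]>.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical model of a per-type static slot table.
// Not thread-safe; the real runtime structures are internal.
static class StaticSlots
{
    private static readonly Dictionary<Type, int[]> slots =
        new Dictionary<Type, int[]>();

    // Returns the one-element storage cell for the given exact type,
    // creating it on first use.
    public static int[] GetSlot(Type exactType)
    {
        int[] slot;
        if (!slots.TryGetValue(exactType, out slot))
        {
            slot = new int[1];
            slots.Add(exactType, slot);
        }
        return slot;
    }
}

// Stands in for code shared by all reference-type instantiations: it cannot
// hard-code an address, so it finds the slot from the exact runtime type.
class Shared<T>
{
    public int Increment()
    {
        int[] slot = StaticSlots.GetSlot(GetType());
        return ++slot[0];
    }
}

static class Program
{
    static void Main()
    {
        var a = new Shared<string>();
        var b = new Shared<Array>();
        Console.WriteLine(a.Increment()); // 1
        Console.WriteLine(a.Increment()); // 2
        Console.WriteLine(b.Increment()); // 1: Shared<Array> has its own slot
    }
}
```

The point is only the shape of the work: every access pays for a lookup keyed by the exact type before it can touch the variable.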
Compare the cost of a static variable lookup in a generic type parameterized by a reference type to, e.g., the cost of [ThreadStatic] field access or of a virtual generic method call - they are very similar. And it's clear why: the underlying logic in all these cases is almost identical. There is a dictionary lookup.
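If you want to see the difference yourself, something like the following should do. It is a sketch, not a rigorous benchmark: Plain, Generic<T> and Time are names invented here, the iteration count is arbitrary, and the numbers depend heavily on the runtime version (a newer JIT may optimize the lookup differently). Run it in a Release build without a debugger attached.

```csharp
using System;
using System.Diagnostics;

// Non-generic baseline: the JIT knows the field's exact address.
static class Plain
{
    static int counter;

    public static int Run(int n)
    {
        for (int i = 0; i < n; i++) counter++;
        return counter;
    }
}

// Generic version: for a reference-type T the method body is shared code,
// so the static field has to be located at run time.
static class Generic<T>
{
    static int counter;

    public static int Run(int n)
    {
        for (int i = 0; i < n; i++) counter++;
        return counter;
    }
}

static class Program
{
    static void Time(string label, Func<int, int> run, int n)
    {
        run(1000); // warm up so JIT compilation is not measured
        var sw = Stopwatch.StartNew();
        run(n);
        sw.Stop();
        Console.WriteLine("{0,-12} {1} ms", label, sw.ElapsedMilliseconds);
    }

    static void Main()
    {
        const int n = 100000000;
        Time("no T", Plain.Run, n);
        Time("T = int", Generic<int>.Run, n);
        Time("T = string", Generic<string>.Run, n);
    }
}
```

Note that the field access sits inside the generic type itself. Reading Generic<string>'s field from non-generic code would not show the effect, because there the exact type is known at JIT time.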

Btw, this is likely the case where the __Canon optimization has the most severe impact on performance. At least, I don't know of any other case with a similar impact. So yes, there is always a trade-off between memory consumption and performance :)