My co-worker (Jamus) and I were talking about memoization in Ruby on Rails and the implementation is so clean. Basically, it replaces the original method with a call to the memoizer which checks for a cached value which it either returns or replaces. I wanted something like this in C# (seeing as how I am a C# developer).
The blog I was reading that introduced me to the memoizer had a generic memoization method that didn't work, but with a little tweaking, it did just fine. You basically passed your method call through this generic method and it would handle first checking for your cached value before passing the call through to your original method.
The problem was, what if you had a recursive method? I was playing around with a factorial method (of course) that looked like this:
static int factorial(int n)
{
return n < 2 ? 1 : n * factorial(n - 1);
}
// test cases
//factorial(7);
//factorial(9);
//factorial(9);
The first test made 7 iterations through the factorial method, the second 9, and the third 0. I thought about it and it seemed like I'd be better off if I made each call to the memoizer, rather than the original method. That way, my first test would make 7 iterations, my second test 2, and the third test 0. The problem was, with the generic memoizer method, my factorial method would thus have to be aware of the memoizer and I didn't want that.
That's when Jamus introduced me to PostSharp. PostSharp is a library that modifies your code at compile time providing Aspect Oriented Programming capabilities to .NET. This was almost exactly what I was looking for . . . a way to cleanly separate the concerns of my application from the memoization concerns. PostSharp allowed me to take the memoization aspect out of the implementation of my methods and put them nicely into a method attribute. This is just a first version proof of concept so I haven't optimized it much, but I think the idea has a great deal of potential, so here it is:
public static class Memoizer
{
// private field to store memos
private static Dictionary<string, object> memos = new Dictionary<string, object>();
// PostSharp needs this to be serializable
[Serializable]
public class Memoized : OnMethodInvocationAspect
{
// intercept the method invocation
public override void OnInvocation(MethodInvocationEventArgs eventArgs)
{
// get the arguments that were passed to the method
object[] args = eventArgs.GetArgumentArray();
// start building a key based on the method name
// because it wouldn't help to return the same value
// every time "lulu" was passed to any method
StringBuilder keyBuilder = new StringBuilder(eventArgs.Delegate.Method.Name);
// append the hashcode of each arg to the key
// this limits us to value types (and strings)
// i need a better way to do this (and preferably
// a faster one)
for (int i = 0; i < args.Length; i++)
keyBuilder.Append(args[i].GetHashCode());
string key = keyBuilder.ToString();
// if the key doesn't exist, invoke the original method
// passing the original arguments and store the result
if (!memos.ContainsKey(key))
memos[key] = eventArgs.Delegate.DynamicInvoke(args);
// return the memo
eventArgs.ReturnValue = memos[key];
}
}
}
That's all there is to it, I modified my factorial method to look like this:
[Memoizer.Memoized]
static int factorial(int n)
{
return n < 2 ? 1 : n * factorial(n - 1);
}
Using this technique, I achieved the results I was looking for. The first time I called factorial(9), it checked the cache, did't find an entry, created one, recursed until it got to factorial(7) where it found a memo, returned it, and stopped the recursion.
This method is pretty expensive and could get pretty memory intensive, but I'm on the lookout for a long process (especially a long recursive process) where I can test this out to see if it produces savings over the long run.
Very nice idea! :)
ReplyDeleteAwesome! But shouldn't you be requiring that all the parameters are IEquatable and / or are providing their own GetHashCode (and not default to Object.GetHashCode)? This could prevent some pretty hard to detect runtime errors.
ReplyDeleteYou should'n use GetHashCode() because it is not guaranteed to be unique.
ReplyDelete