Monday, July 25, 2011

C# Yield Keyword, IEnumerable<T>, and Infinite Enumerations

Visual Studio 2005 came with a slew of .NET and .NET compiler features. One of those features that I particularly enjoy is the yield keyword. It was Microsoft's way of having the compiler build the IEnumerable or IEnumerator classes (and their generic counterparts) around your iterator code block. I'm not going to discuss the yield keyword much in this post because the feature has been around a long time now and the internet is replete with discussions on the topic.

Despite the long life of the yield keyword, I still find it conspicuous when I see it in projects I work on. I suppose it's just rare that I find myself writing my own enumerable. As a result, when I see yield return, it tends to stand out. I started looking around to see how the rest of the programming world uses the yield keyword, wondering whether I was underutilizing the flexibility it provides.

Specifically, I wondered if I was missing out on the lazy nature of the iterator and the numerous LINQ extension methods that take advantage of it. An iterator doesn't need to hold every value of the sequence in memory the way a collection does; it produces each value only when the consumer asks for it, so you can use each value without the memory overhead growing with the length of the sequence. Further, you can take advantage of calculations that are already sequential in nature (like the Fibonacci sequence, for example).
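
To make the laziness concrete, here is a minimal sketch. The Squares iterator and the Produced counter are hypothetical names made up for this illustration; the point is only that the iterator body does not run until something pulls values from it, and that Take() stops pulling once it has what it needs.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class LazyDemo
{
    // Counts how many values the iterator body has actually produced.
    public static int Produced;

    // Hypothetical iterator: yields 1, 4, 9, ... up to count * count.
    public static IEnumerable<int> Squares(int count)
    {
        for (var i = 1; i <= count; i++)
        {
            Produced++;
            yield return i * i;
        }
    }

    static void Main()
    {
        // Calling the iterator only creates the enumerable; no values yet.
        var squares = Squares(1000);
        Console.WriteLine(Produced);                      // 0

        // Take(3) pulls three values and then stops asking for more.
        var firstThree = squares.Take(3).ToList();
        Console.WriteLine(Produced);                      // 3
        Console.WriteLine(string.Join(" ", firstThree));  // 1 4 9
    }
}
```

Only three of the thousand possible values were ever computed, and none of them were stored anywhere except the three-element list built at the end.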

The second thing I thought about was an infinite (well, sort of infinite) enumerable. I'm not sure whether that's a bad idea, so I wrote these examples to be unending. I may eventually find a use for such an iterator and then get burned when someone calls Fibbonacci.Min() and the application enumerates until the checked addition throws an OverflowException, but I suppose at that point I'll make it a method that takes a sanity-check limit.
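
For what it's worth, the sanity-check idea might look something like the sketch below. The method name Fibonacci and the maxCount parameter are assumptions made for this illustration, not code from the examples that follow; the checked addition stays in place as a backstop for very large limits.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class BoundedSequences
{
    // Hypothetical bounded variant: yields at most maxCount values, so
    // aggregate calls like Min() or Max() are able to finish.
    public static IEnumerable<ulong> Fibonacci(int maxCount)
    {
        ulong previous = 0, current = 1;
        for (var i = 0; i < maxCount; i++)
        {
            yield return previous;

            // Still checked, so an oversized limit fails loudly rather
            // than silently wrapping around.
            var next = checked(previous + current);
            previous = current;
            current = next;
        }
    }

    static void Main()
    {
        Console.WriteLine(Fibonacci(20).Min());  // 0
        Console.WriteLine(Fibonacci(20).Max());  // 4181
    }
}
```

With a limit in place, aggregates that must see every element terminate normally instead of running until the arithmetic overflows.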

In the meantime, here are a few examples of some iterators I thought were interesting and fun challenges:
static IEnumerable<ulong> Fibbonacci
{
    get
    {
        yield return 0;
        yield return 1;

        ulong previous = 0, current = 1;
        while (true)
        {
            ulong swap = checked(previous + current);
            previous = current;
            current = swap;
            yield return current;
        }
    }
}

static IEnumerable<long> EnumerateGeometricSeries(long @base)
{
    yield return 1;

    long accumulator = 1;
    while (true)
        yield return accumulator = checked(accumulator * @base);
}

static IEnumerable<ulong> PrimeNumbers
{
    get
    {
        var prime = 0UL;
        while (true)
            yield return prime = prime.GetNextPrime();
    }
}

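static IEnumerable<List<uint>> PascalsTriangle
{
    get
    {
        var row = new List<uint> { 1 };
        // Yield a copy so callers who keep a row (e.g. via ToList())
        // don't see it change as later rows are generated.
        yield return new List<uint>(row);

        while (true)
        {
            // Update the working row in place: each interior element
            // becomes the sum of itself and its left neighbor.
            var last = row[0];
            for (var i = 1; i < row.Count; i++)
            {
                var current = row[i];
                row[i] = checked(current + last);
                last = current;
            }

            row.Add(1);
            yield return new List<uint>(row);
        }
    }
}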

Some of these iterators use a few extension methods I adapted from places in the .NET Framework:
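public static ulong GetNextPrime(this ulong from)
{
    // 2 is the only even prime, so handle it before the odd-only search.
    if (from < 2)
        return 2;

    // (from + 1) | 1 is the first odd number greater than from.
    for (var j = (from + 1) | 1UL; j < ulong.MaxValue; j += 2)
        if (j.IsPrime())
            return j;

    // No larger prime was found below ulong.MaxValue.
    return from;
}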

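public static bool IsPrime(this ulong value)
{
    // 0 and 1 are not prime.
    if (value < 2)
        return false;

    if ((value & 1) != 0)
    {
        // Trial division by odd numbers up to the square root.
        var squareRoot = (ulong)Math.Sqrt((double)value);
        for (ulong i = 3; i <= squareRoot; i += 2)
            if (value % i == 0)
                return false;

        return true;
    }

    // 2 is the only even prime.
    return value == 2;
}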

To keep the test console app clean, of course, I used my favorite IEnumerable.Each() extension method:
public static void Each<T>(this IEnumerable<T> enumerable, Action<T> action)
{
    foreach (var element in enumerable)
        action(element);
}

Here's the sample code:
static void Main()
{
    Fibbonacci.Skip(10).Take(10).Each(Console.WriteLine);
    Console.WriteLine();

    EnumerateGeometricSeries(2).Take(10).Each(Console.WriteLine);
    Console.WriteLine();

    PrimeNumbers.Where(p => p > 600).Take(10).Each(Console.WriteLine);
    Console.WriteLine();

    foreach (var row in PascalsTriangle.Take(10))
    {
        row.Each(element => Console.Write("{0} ", element)); 
        Console.WriteLine();
    }

    Console.ReadLine();
}

// The second set of 10 elements in the Fibonacci sequence
// 55 89 144 233 377 610 987 1597 2584 4181

// The first 10 powers of 2 (2^0 through 2^9)
// 1 2 4 8 16 32 64 128 256 512

// The first 10 prime numbers greater than 600
// 601 607 613 617 619 631 641 643 647 653

// The first 10 rows of Pascal's Triangle
// 1
// 1 1
// 1 2 1
// 1 3 3 1
// 1 4 6 4 1
// 1 5 10 10 5 1
// 1 6 15 20 15 6 1
// 1 7 21 35 35 21 7 1
// 1 8 28 56 70 56 28 8 1
// 1 9 36 84 126 126 84 36 9 1

Wednesday, July 20, 2011

.Net ObjectFormatter - Using Tokens in a Format String

If you've already read this article and you don't feel like scrolling through my sample formats, you can jump directly to ObjectFormatter on GitHub to get the source.

At some point in almost every business application, it seems you eventually run into the ubiquitous email notifications requirement. Suddenly, in the middle of what was once a pleasant and enjoyable project come the dozens of email templates with «guillemets» marking the myriad fields which will need replacing with data values.

You concoct some handy way of storing these templates in a database, on the file system, or in resource files. You compose your many String.Format() statements with the dozens of variables required to format the email templates and you move on to the greener pastures of application development.

Now, you've got a dozen email templates like this one:
Dear {0},

{1} has created a {2} task for your approval. This task must be reviewed between {3} and {4} to be considered for final approval.

This is an automagically generated email sent from an unmonitored email address. Please do not reply to this message. No, seriously . . . stop that. Nobody is going to read what you are typing right now. Don't you dare touch that send button. Stop right now. I hate you. I wish I could hate you to death.

Thank you,
The Task Approval Team

No big deal, everything is going swimmingly, and the application goes into beta. Then, it turns out, the stakeholders don't want an email template that looks like that. That was more of a draft really. Besides, you should've already known what they wanted in the template to begin with. After all, it's like you have ESPN or something.

It's important to add information about the user for whom this action is taking place, so this is your new template:
Dear {0},

{1} has created a {2} task for your approval regarding {5}({6}). This task must be reviewed between {3} and {4} to be considered for final approval.

If you have questions, please contact your approvals management supervisor {7}.

This is an automagically generated email sent from an unmonitored email address. Please do not reply to this message. No, seriously . . . stop that. Nobody is going to read what you are typing right now. Don't you dare touch that send button. Stop right now. I hate you. I wish I could hate you to death.

Thank you,
The Task Approval Team

So far so good. You've updated the template, updated your String.Format() parameters, passed QA and gone into production. But, now that users are actually hitting the system, it turns out that you need a few more changes. Specifically, you need to add contact information for the supervisor, remove the originator of the task, and by the way, what kind of sense does it make to put a low end limit on a deadline? Here's your new template:
Dear {0},

A {2} task for {5}({6}) is awaiting your approval. This task must be reviewed by {4} to be considered for final approval.

If you have questions, please contact your approvals management supervisor {7} at {1}.

This is an automagically generated email sent from an unmonitored email address. Please do not reply to this message. No, seriously . . . stop that. Nobody is going to read what you are typing right now. Don't you dare touch that send button. Stop right now. I hate you. I wish I could hate you to death.

Thank you,
The Task Approval Team

Now you have an email template format with various numbers all over the place, a String.Format() call with more parameters than there are tokens, and you have to go through the QA and deployment cycle again.
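
Here's a small sketch of why positional templates are so brittle. The ThrowsFormatException helper is a hypothetical name written for this illustration. String.Format() quietly ignores arguments that no token references, but it throws a FormatException as soon as a token's index has no matching argument, and you only find out at run time.

```csharp
using System;

static class PositionalFormatDemo
{
    // Returns true when String.Format rejects the template because a
    // token refers to an index with no matching argument.
    public static bool ThrowsFormatException(string template, params object[] args)
    {
        try
        {
            String.Format(template, args);
            return false;
        }
        catch (FormatException)
        {
            return true;
        }
    }

    static void Main()
    {
        // Extra arguments are silently ignored.
        Console.WriteLine(String.Format("Dear {0},", "Pat", "unused"));  // Dear Pat,

        // Token {2} with only two arguments fails at run time.
        Console.WriteLine(ThrowsFormatException("{0} {2}", "a", "b"));   // True
    }
}
```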

I've gone through this process on almost every application throughout my career as a software engineer. Hence the ObjectFormatter. Now, my email template looks like this:
Dear {Employee.FullName},

A {Task.Description} task for {TargetUser.FullName}({TargetUser.UserId}) is awaiting your approval. This task must be reviewed by {DueDate} to be considered for final approval.

If you have questions, please contact your approvals management supervisor {Supervisor.FullName} at {Supervisor.PhoneNumber}.

This is an automagically generated email sent from an unmonitored email address. Please do not reply to this message. No, seriously . . . stop that. Nobody is going to read what you are typing right now. Don't you dare touch that send button. Stop right now. I hate you. I wish I could hate you to death.

Thank you,
The Task Approval Team

I find that the ObjectFormatter makes my templating much easier to maintain and much more flexible. It also usually makes my calling code a lot cleaner. Here's an example of the approaches you could take to populate the sample templates:
// plain string formatting
String.Format(template, Employee.FullName, Supervisor.PhoneNumber, Task.Description, String.Empty, DueDate, TargetUser.FullName, TargetUser.UserId, Supervisor.FullName);

// if you have a dto already built
ObjectFormatter.Format(template, myDto);

// if you don't have a dto built
ObjectFormatter.Format(template, new { Employee, Supervisor, Task, DueDate, TargetUser });
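
To give a feel for the general technique, here is a rough sketch of token replacement using a regular expression and reflection. This is not the ObjectFormatter source; the TokenFormatSketch class, its Format method, and the choice to leave unknown tokens untouched are all assumptions made for this illustration.

```csharp
using System;
using System.Text.RegularExpressions;

static class TokenFormatSketch
{
    // Matches tokens such as {DueDate} or {Employee.FullName}.
    static readonly Regex Token = new Regex(
        @"\{([A-Za-z_][A-Za-z0-9_]*(?:\.[A-Za-z_][A-Za-z0-9_]*)*)\}");

    // Hypothetical formatter: resolves each dotted token by walking
    // public properties on the source object with reflection.
    public static string Format(string template, object source)
    {
        return Token.Replace(template, match =>
        {
            object current = source;
            foreach (var name in match.Groups[1].Value.Split('.'))
            {
                if (current == null)
                    break;

                var property = current.GetType().GetProperty(name);
                if (property == null)
                    return match.Value;  // unknown token: leave it as written

                current = property.GetValue(current, null);
            }

            // A null value anywhere along the path becomes an empty string.
            return Convert.ToString(current);
        });
    }

    static void Main()
    {
        var data = new
        {
            Employee = new { FullName = "Pat Smith" },
            Task = new { Description = "budget" }
        };

        Console.WriteLine(Format(
            "Dear {Employee.FullName}, a {Task.Description} task awaits. {Unknown}", data));
        // Dear Pat Smith, a budget task awaits. {Unknown}
    }
}
```

A real implementation would likely also want format specifiers, caching of the reflected properties, and a policy for missing tokens; this sketch skips all of that.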

I've found that most of the time stakeholders ask for template changes, they want me to add some value that is already a property on an object that's already in my object graph because of the current email template. That way, when they come tell me they want the target user's name formatted differently, I don't even need to recompile (well, sometimes I do . . . I mean, I can't predict everything). I can implement a lot of changes by using objects I already know I'm passing into the ObjectFormatter.Format() method. Here's the new template with the changes, and I didn't have to change a line of code to make it work:
Dear {Employee.FullName},

A {Task.Description} task for {TargetUser.LastName}, {TargetUser.FirstName}({TargetUser.UserId}) is awaiting your approval. This task must be reviewed by {DueDate} to be considered for final approval.

If you have questions, please contact your approvals management supervisor {Supervisor.FullName} at {Supervisor.PhoneNumber}.

This is an automagically generated email sent from an unmonitored email address. Please do not reply to this message. No, seriously . . . stop that. Nobody is going to read what you are typing right now. Don't you dare touch that send button. Stop right now. I hate you. I wish I could hate you to death.

Thank you,
The Task Approval Team

If you'd like to check out the source or use the ObjectFormatter in your own projects, look for ObjectFormatter on GitHub. If you make any cool changes, please let me know and I'll try to figure out how to merge them into the repository.

Tuesday, July 19, 2011

Extension Method to Replace foreach With Lambda Expression

It's pretty often I find myself looping through some enumerable and performing an action on the elements. Sometimes it's just displaying results with Console.WriteLine. Other times I need to do something a little more complicated. In any case, every once in a while, I feel like the foreach statement and the for statement aren't really quite expressive enough.

That's why I have this little guy:
public static void Each<T>(this IEnumerable<T> enumerable, Action<T> action)
{
    foreach (var element in enumerable)
        action(element);
}

It's pretty basic but I like the way it looks and feels. I used it in my blog post about a C# UpTo Extension Method a la Ruby's int.upto method.

Here's a simple demonstration I wrote in LINQPad:
void Main()
{
    Enumerable.Range(1, 5).Each(Console.WriteLine);
}

static class Extensions
{
    public static void Each<T>(this IEnumerable<T> source, Action<T> action)
    {
        foreach (var element in source)
            action(element);
    }
}

In writing this post (and perhaps because I've been spending way too much time with jQuery lately), it occurred to me that I may want to be able to chain my actions with another Each() or perhaps with other extension methods from LINQ:
void Main()
{
    Enumerable.Range(1, 5).Each(Console.WriteLine).Each(Console.WriteLine);
    // 1 2 3 4 5 1 2 3 4 5
}

static class Extensions
{
    public static IEnumerable<T> Each<T>
        (this IEnumerable<T> source, Action<T> action)
    {
        foreach (var element in source)
            action(element);
   
        return source;
    }
}

The problem, though, is that this version enumerates the whole source as soon as Each() is called. Generally you don't want your action to enumerate your enumerable until something is actually done with the results. Instead, you often want the action to run on each element as the enumerable is being enumerated, so you'd write it like this:
void Main()
{
    Enumerable.Range(1, 10)
        .Each(Console.WriteLine)
        .Where(i => i <= 5)
        .ToList();
    // 1 2 3 4 5 6 7 8 9 10
    
    Console.WriteLine();
    
    Enumerable.Range(1, 10)
        .Each(Console.WriteLine)
        .Take(5)
        .ToList();
    // 1 2 3 4 5

    Console.WriteLine();

    Enumerable.Range(1, 10)
        .Each(Console.WriteLine)
        .Skip(5)
        .Take(5)
        .ToList();
    // 1 2 3 4 5 6 7 8 9 10
}

static class Extensions
{
    public static IEnumerable<T> Each<T>
        (this IEnumerable<T> source, Action<T> action)
    {
        foreach (var element in source)
        {
            action(element);
            yield return element;
        }
    }
}
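
One consequence of the lazy version is worth spelling out with a small sketch. The DeferredEachDemo class and the seen list are hypothetical names for this illustration; Each() has the same shape as the iterator version above. Because the action now runs during enumeration, nothing happens until something enumerates the result, and the action runs again on every enumeration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class DeferredEachDemo
{
    // Same shape as the lazy Each() above.
    public static IEnumerable<T> Each<T>(this IEnumerable<T> source, Action<T> action)
    {
        foreach (var element in source)
        {
            action(element);
            yield return element;
        }
    }

    static void Main()
    {
        var seen = new List<int>();

        // Building the query runs nothing.
        var query = Enumerable.Range(1, 5).Each(seen.Add);
        Console.WriteLine(seen.Count);  // 0

        // Enumerating it runs the action once per element.
        query.ToList();
        Console.WriteLine(seen.Count);  // 5

        // Enumerating it again runs the action again.
        query.ToList();
        Console.WriteLine(seen.Count);  // 10
    }
}
```

So a bare call like Enumerable.Range(1, 5).Each(Console.WriteLine) with no ToList() or foreach at the end prints nothing with the lazy version, which is the trade-off for being able to chain it with Take() and friends.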