Saturday, June 6, 2009

Andand.NET via monads

In my previous post, I discussed a direct implementation of andand in C#. This time, I'll show a monadic implementation.

As Justin Wick pointed out on Reg Braithwaite's blog announcing Object#andand, andand can be viewed as a special case of the Maybe Monad. I've been hearing about how useful monads are for a few years, but until recently I hadn't occasion to use them directly. Would a monad formulation of andand be simpler and clearer to read? Would it be easier to understand, if not for everyone then at least for readers who are already familiar with the monad formalism? Here was an opportunity to find out.

For those who are totally unfamiliar with monads: they're a means of accommodating side-effects within the functional paradigm. By structuring input, output, exceptions, etc. with monads, you can restore referential transparency to functions that make use of these kinds of features. Monads are surprisingly versatile. You can build lists and arrays from monads, and in the case of arrays the use of monads makes it easy to use the type system to guarantee that client code fulfills a contract on which an efficient implementation of arrays depends. The famous applications of all the functional goodies in LINQ, such as data access and data-parallel computation, are organized around monads.

So, I refreshed my memory with Monads for Functional Programming and Confessions of a Used Programming Language Salesman followed some of their references: Comprehending Monads, Guarded Recursive Datatype Constructors, and Generalized Algebraic Data Types and Object-Oriented Programming. Thus fortified, I was ready to write some tests:


public class TestMonad<M> where M : MonadImpl<M>, new() {
[Test]
public void leftUnit() {
var a = new Object();
Assert.AreEqual(a, Monad.Star(Monad.Unit<Object, M>(() => a), b => Monad.Unit<Object, M>(() => b)).Just());
}

[Test]
public void rightUnit() {
var a = new Object();
var m = Monad.Unit<Object, M>(() => a);
Assert.AreEqual(
Monad.Just(m),
Monad.Just(
Monad.Star(m, b => Monad.Unit<Object, M>(() => b))));
}

public Monad<String, M> an(Object o) {
return Monad.Unit<String, M>(() => o.ToString());
}

public Monad<Int32, M> bo(Object o) {
return Monad.Unit<Int32, M>(() => o.GetHashCode());
}

[Test]
public void associative() {
var m = Monad.Unit<Object, M>(() => new Object());
Assert.AreEqual(
Monad.Just(
Monad.Star(
m,
a => Monad.Star(an(a), b => bo((String)b)))),
Monad.Just(
Monad.Star(
Monad.Star(m, a => an(a)),
b => bo(b))));
}
}


I'm following the definitions from Monads for Functional Programming here: a Monad is a triple <M, Unit, ★> where M is a type constructor (generic type), Unit is a function from computations of type a to M a, and ★ applies a function to a monad.

My use of the name "Just" here is regrettable. The Maybe monad's type constructor is conventionally defined as a simple union:
Maybe a = Nothing | Just a
. So, by convention,
Just a
takes a computation of arbitrary type
a
to a monad representing that computation. My function "Just" is the inverse, taking a Monad to the computation it represents. I gather that the primitive monad operators are normally encapsulated so that control of side-effects is well localized/modularized. Usually, function composition on monad values is accomplished by ★, so it's not often necessary to explicitly extract raw computations from monads. Finally, monads seem to be most popular within the Haskell community, and pattern-matching is a pretty ubiquitous in Haskell programs. I think these reasons explain why (as far as I know) there's not much use and no standard name for the operation I call "Just" in my code.

Having translated the mathematical and Haskell definitions of monad into C#, I was ready to check that my translation could extend to an example application shown elsewhere:


[TestFixture]
public class TestMaybeMonad : TestMonad<MaybeMonadImpl> {
private Monad<String, MaybeMonadImpl> lookup(string name, Dictionary<String, Monad<String, MaybeMonadImpl>> db) {
Monad<String, MaybeMonadImpl> result;
db.TryGetValue(name, out result);
return result;
}

[Test]
public void MailSystem() {
//based very loosely on haskell.org's example
// (I don't understand why their mplus is relevant)
var fullnamedb = new Dictionary<String, Monad<String, MaybeMonadImpl>>();
var nicknamedb = new Dictionary<String, Monad<String, MaybeMonadImpl>>();
fullnamedb.Add("nicholas", Monad.Unit<String, MaybeMonadImpl>("n@m.com"));
nicknamedb.Add("nick", Monad.Unit<String, MaybeMonadImpl>("nicholas"));
var mnick = Monad.Unit<String, MaybeMonadImpl>("nick");
Assert.AreEqual(
"n@m.com",
Monad.Star(
Monad.Star(mnick, nick => lookup(nick, nicknamedb)),
name => lookup(name, fullnamedb)).Just());
}


And finally, I can restate andand in terms of monads:

private class D {}
private class C {
public D d;
}
private class B {
public C c;
}
private class A {
public B b;
}
[Test]
public void andand() {
A nil = null;
A a = new A();
A ab = new A() { b = new B() };
Assert.IsNull(MaybeMonad<D>.Andand<A>(() => null));
Assert.IsNull(MaybeMonad<D>.Andand(() => (A)null));
Assert.AreEqual(a, MaybeMonad<D>.Andand(() => a));
Assert.IsNull(MaybeMonad<D>.Andand(() => a.b));
Assert.AreEqual(ab.b, MaybeMonad<D>.Andand(() => ab.b));
Assert.IsNull(MaybeMonad<D>.Andand(() => ab.b.c));
Func<Object> x = null;
Func<Func<Object>> y = null;
y = () => null;
Assert.IsNull(MaybeMonad<D>.Andand(() => y()()));
Assert.IsNull(MaybeMonad<D>.Andand(() => a.b.c));
Assert.IsNull(MaybeMonad<D>.Andand(() => a.b.c.d));
}


OK then, let's see how this is implemented:

public class MonadImpl<M> where M : MonadImpl<M>, new() {
public virtual Monad<B, M> Star<A, B>(Monad<A, M> ma, Func<A, Monad<B, M>> f) {
return f(ma.Just());
}
}
public class Monad<T, M> where M : MonadImpl<M>, new() {
public M impl = new M();
public T Just() {
return a();
}
internal protected Func<T> a;
public Monad<B, M> Star<B>(Func<T, Monad<B, M>> f) {
return impl.Star<T, B>(this, f);
}
public static Monad<A, M> Unit<A>(A a) {
return new Monad<A, M>() { a = () => a };
}
}


A monad represents a computation of arbitrary type and some sort of side-effect whose control motivates the use of monads. Wrapping and unwrapping the computation associated with the monad is a straightforward example of parametric polymorphism. Slightly more interesting is recursive nature of the Monad type, whose method signatures involve the type on which they are defined. Simple parametric polymorphism permits us to define methods that behave uniformly on values of any type. If we want to make use of non-universal structure on those values we can restrict the values to a range of types that share the structure we need. Often, in a recursively-defined type, we want this range of types to itself involve a type parameter. For example, in this case we want Star to take a Monad and to return a Monad of a similar type. This requires f-bounded polymorphism, which is supported in .NET's generics; c#'s where clause not only binds type parameters, but can (re)use those parameters in the bounds.

Managing the side-effects is less straightforward. Here we want uniform packing/unpacking behavior across all computations, but side-effect management behavior that is peculiar to the kind of side-effect captured by the monad. We've had some support for behavioral refinement that follows subtyping relationships since the beginning of .NET, with virtual methods. But the Star function takes a monad (with behavior b and computation c) and a function from c to d and returns a monad with behavior b and computation d. It builds a value whose type is merely related and not identical to any of its type arguments. This is characteristic of Guarded Recursive Datatype Constructors. These g.r. datatypes, also known as Generalized Algebraic Data Types (GADTs), are not yet well-supported in .NET. Fortunately, the work-around shown above is sufficient for seeing the connection between monads and andand.

The Maybe Monad represents "just" a computation, or "nothing." The monad structure makes it easy and convenient to treat Maybes in a uniform way, regardless of whether a particular Maybe represents "nothing" or a non-nothing computation. If you're familiar with the Null Object pattern, you'll recognize the benefit of this uniformity and you'll see that Maybe monad is a more general alternative to Null Object in many contexts.

So finally we're ready to see andand, implemented in terms of the MaybeMonad:

public class MaybeMonadImpl : MaybeMonadImpl<MaybeMonadImpl> {}
public class MaybeMonad<T> : Monad<T, MaybeMonadImpl> {
private static Expression andandifyme(MemberExpression me) {
var objectParameter = Expression.Parameter(me.Expression.Type, "target");
var parameters = new ParameterExpression [] {objectParameter};
//TODO: simplify by using the parameter f rather than by duplicating
// the subexpression: andandify(me.Expression) in slotValue and
// in the Just of Star of Unit below (the subexpression could occur
// solely in slotValue, and then its value could be used by
// dereferencing f in the Just of Star of Unit.
var slotValue =
andandifyinr(
Expression.Invoke(
Expression.Lambda(
Expression.MakeMemberAccess(
objectParameter,
me.Member),
parameters),
andandify(me.Expression)));
var f = Expression.Parameter(me.Expression.Type, "f");
return
Expression.Call(
typeof(Monad),
"Just",
new Type [] { me.Type, typeof(MaybeMonadImpl) },
Expression.Call(
typeof(Monad),
"Star",
new Type [] { me.Expression.Type, me.Type, typeof(MaybeMonadImpl) },
Expression.Call(
typeof(Monad),
"Unit",
new Type [] { me.Expression.Type, typeof(MaybeMonadImpl) },
andandify(me.Expression)),
Expression.Lambda(
Expression.Call(
typeof(Monad),
"Unit",
new Type [] { me.Type, typeof(MaybeMonadImpl) },
slotValue),
f)));
}
private static Expression andandifyi(InvocationExpression i) {
return
andandifyinr(
Expression.Invoke(
andandify(i.Expression),
i.Arguments));
}
private static Expression andandifyinr(InvocationExpression i) {
var l = i.Expression;
var f = Expression.Parameter(l.Type, "f");
return
Expression.Call(
typeof(Monad),
"Just",
new Type [] { i.Type, typeof(MaybeMonadImpl) },
Expression.Call(
typeof(Monad),
"Star",
new Type [] { l.Type, i.Type, typeof(MaybeMonadImpl) },
Expression.Call(
typeof(Monad),
"Unit",
new Type [] { l.Type, typeof(MaybeMonadImpl) },
l),
Expression.Lambda(
Expression.Call(
typeof(Monad),
"Unit",
new Type [] { i.Type, typeof(MaybeMonadImpl) },
i),
f)));
}
private static LambdaExpression andandifyl(LambdaExpression l) {
return
Expression.Lambda(
andandify(l.Body),
Enumerable.Select(
l.Parameters,
p => (ParameterExpression)andandify(p)).ToArray());
}
private static Expression andandify(Expression expr) {
var me = expr as MemberExpression;
if (me != null) {
return andandifyme(me);
}
var i = expr as InvocationExpression;
if (i != null) {
return andandifyi(i);
}
var l = expr as LambdaExpression;
if (l != null) {
return andandifyl(l);
}
return expr;
}
public static U Andand<U>(Expression<Func<U>> expr) {
var aexpr = (Expression<Func<U>>)andandify(expr);
try {
return (U)aexpr.Compile().DynamicInvoke();
}
catch (Exception e) {
throw new Exception(
String.Format(
"trouble evaluating {0}",
aexpr),
e);
}
}
}


By conflating null and Maybe's Nothing, we preserve the simple approach Identity monad can use for Star: just unwrap the computation from the input monad and wrap it in the output monad. This approach doesn't work in general, because there might be state associated with the monads that we'd need to migrate.

Now, comparing the direct implementation to the monadic implementation, you might be ready to decide that monads might be theoretically interesting but are impractical. I think that on one hand, the more stateful monads benefit more fully from the paradigm than the Maybe monad; on the other hand, C# exaggerates the ceremony of monad implementation relative to Haskell. Microsoft hasn't highlighted their use of monads in LINQ or EF or PLINQ, and they haven't exposed any analog to my Monad<T, M> type, so I couldn't share that implementation for andand. In summary, I don't think the monadic implementation of andand carries its weight, but if I saw an opportunity to benefit from other kinds of monads in my program, the benefits might easily outweigh the amortized costs of monads.





Thursday, June 4, 2009

Andand.NET sans monads

I motivated this material in my previous post.

One way to view andand is as a syntactic transformation of the code: the effect we want is to write

a.b.c.d

and have that automatically translated to

(a != null && b != null && c != null) ? a.b.c.d : null


C# doesn't support syntactic abstraction in the style of Lisp macros, but later in this post we'll see that recent versions of C# give us much more flexibility than we had in 2.0 and earlier.

Before we look at the techniques in detail, I want to acknowledge that syntactic regularity is a key motivation for Ruby Object#andand. Ruby Object#andand uses infix "andand" in imitation of the built-in guarded assignment operator &&=. We have no such operator in C#, and I am a prefix sympathizer anyway, so neither of my andand implementations share this emphasis with the Ruby inspiration.

So, my first implementation of andand for C# used LINQ expression tree transformation. The first step is to gain the ability to manipulate the program defined by the source code whose meaning I want to reinterpret (e.g., the compound member access expression a.b.c.d shown above):

public static T Andand<T>(Expression<Func<T>> f) {
try {
f = (Expression<Func<T>>)aar(f);
return f.Compile()();
}
catch (Exception e) {
throw new Exception(String.Format("trouble with {0}", f), e);
}
}

One of the main ideas in LINQ is type-directed expression quoting. Here, I've written a function that takes an Expression tree as an argument. If I apply this function to a lambda expression, the compiler will translate the lambda to a value representing the abstract syntax tree of the lambda rather than generating an anonymous delegate. So, a typical call site for Andand looks like this:


private class D {}
private class C {
public D d;
}
private class B {
public C c;
}
private class A {
public B b;
}

A ab = new A();
ab.b = new B();
Assert.AreEqual(ab.b, Andand(() => ab.b));
Assert.IsNull(Andand(() => ab.b.c.d));


Once I have my expression tree, I'm ready for a transformation. Clearly from my starting example, member access expressions constitute an important case. I'd like to be able to handle some other cases though, like method chaining. I'll give you a taste of tree inspection and construction here; for complete details see andand-sans-monads at googlecode.


private static Expression aar(Expression e) {
var me = e as MemberExpression;
if (me != null) {
return
guard(me.Expression, me);
}
var l = e as LambdaExpression;
if (l != null) {
return
Expression.Lambda(
guard(
l.Body,
l.Body),
Enumerable.Select(
l.Parameters,
p => p).ToArray());
}
var ie = e as InvocationExpression;
if (ie != null) {
var args =
Enumerable.Select(
ie.Arguments,
(arg, i) => Expression.Parameter(arg.Type, String.Format("arg{0}", i))).ToArray();
return
Expression.Invoke(
guard(
ie.Expression,
Expression.Lambda(
nullExpr(ie.Expression.Type),
args),
Expression.Lambda(
ie.Expression,
args)),
Enumerable.Select(
ie.Arguments,
arg => aar(arg)));
}
return e;
}


It sure is a lot of work to construct expression trees "by hand" as opposed to using the type-directed quoting facility. I could be a lot more concise with Lisp quasiquote. Maybe later we'll experiment with using type-directed quotation of expression tree transformations, but for now I was more interested in exploring the connection between andand and monads.

But I have to work tomorrow morning and I'd like you to be able to read this now, so I'll save the monad and GADT discussion for my next post in this series.