
February 27, 2012

Discriminated unions and interfaces

I wrote earlier about F# discriminated unions, and Chris in comments rightly reminded me that discriminated unions, like C# classes, can have member methods.  While I generally reckon separate functions are more idiomatic than members, there is one case where member methods are invaluable: when you want the discriminated union to implement an interface.

This isn’t as common as C# programmers might expect.  Because discriminated unions generally represent simple containers with minimal behavioural logic, the main interfaces you might think to implement are to do with things like equality and comparison – and F# implements those for you.  (Of course, sometimes you don’t want the default equality or ordering implementations, and that’s another scenario where you need to implement an interface – but in those cases there’s a bit more to it.)  Still, it does come up now and again, particularly with system interfaces or if you’re interoperating with C# code that might be more interface-oriented.

The way you implement an interface on an F# discriminated union is exactly the same as the way you implement it on an F# class: use the interface IWhatever with member… construct.  Let’s take a look.

Suppose we want a discriminated union that we can format for display in different ways, similar to the system DateTime or numeric types.  This capability is represented by the system IFormattable interface.  IFormattable has just one method, ToString, which makes it ideal for short demos.  Here’s a discriminated union that implements IFormattable:

open System

type Booze =
   | SauvignonBlanc of string
   | MethylatedSpirit
   interface IFormattable with
     member this.ToString(format, formatProvider) =
       match this with
       | SauvignonBlanc rgn -> if format = "G" then
                                 "Sauvignon Blanc"
                               elif format = "R" then
                                 sprintf "%s Sauvignon Blanc" rgn
                               else
                                 failwith "unknown format"
       | MethylatedSpirit   -> "Amber Nectar"

There are two parts to this definition: first, the definition of the discriminated union data type, and second, the interface implementation.  The data type definition – the three lines starting at type Booze – doesn’t contain any surprises.  What’s new is that after defining the various ‘cases’ of the union, we keep going with an interface implementation.  The syntax may look a bit weird if you’re a C# programmer, but all it says is that the type implements IFormattable and here is the implementation of the IFormattable.ToString member.  The implementation itself is standard F# pattern matching, and hopefully you can see roughly what it’s doing even if you find pipe characters daunting.

Notice that even though this is a member method we can’t define it separately on each ‘case’ the way we would with a virtual member in C#.  The individual cases of discriminated unions are dumb: everything has to be defined on the union type itself.  (A corollary of this is that interfaces can only be implemented on the union type, not on cases.  You couldn’t implement IDrinkable on the SauvignonBlanc case; you’d have to implement it on the Booze type, and live with the deadly consequences.)

When we put the interface implementation into action, there’s a surprise in store:

> open System.Globalization;;
> let antifreeze = SauvignonBlanc "Austria";;
val antifreeze : Booze = SauvignonBlanc "Austria"
> antifreeze.ToString("R", CultureInfo.InvariantCulture);;
error FS0501: The member or object constructor 'ToString' takes 0 argument(s) but is here given 2. 
The required signature is 'Object.ToString() : string'. 

In C# terms, F# has gone for an explicit interface implementation instead of C#’s usual implicit interface implementation.  That means the interface member can be accessed only through a reference of the interface type:

> (antifreeze :> IFormattable).ToString("R", CultureInfo.InvariantCulture);;
val it : string = "Austria Sauvignon Blanc" 

You can get around this by defining the ToString() member directly on Booze as a member method, and having IFormattable.ToString() delegate to that member method – this is a common pattern for explicit interface implementation in C#, and it’s idiomatic in F# as well.
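The delegation pattern just described might look something like the following.  This is a minimal sketch rather than the definitive way to do it: it moves the original formatting logic into an ordinary two-argument ToString member (rewritten as a tuple match, which is my own restructuring) and has the interface implementation forward to it.

```fsharp
open System
open System.Globalization

type Booze =
    | SauvignonBlanc of string
    | MethylatedSpirit

    // Ordinary member: visible directly on Booze values.
    // formatProvider is accepted for signature compatibility but unused here.
    member this.ToString(format: string, formatProvider: IFormatProvider) =
        match this, format with
        | SauvignonBlanc _, "G"   -> "Sauvignon Blanc"
        | SauvignonBlanc rgn, "R" -> sprintf "%s Sauvignon Blanc" rgn
        | SauvignonBlanc _, _     -> failwith "unknown format"
        | MethylatedSpirit, _     -> "Amber Nectar"

    // Explicit interface implementation: just forwards to the member above.
    interface IFormattable with
        member this.ToString(format, formatProvider) =
            this.ToString(format, formatProvider)

let antifreeze = SauvignonBlanc "Austria"

// No cast needed any more.
printfn "%s" (antifreeze.ToString("R", CultureInfo.InvariantCulture))

// The interface is still what framework code such as String.Format goes through.
printfn "%s" (String.Format("{0:R}", antifreeze))
```

One thing to watch: framework code may pass a null format (for example a plain "{0}" placeholder), which this sketch, like the original, treats as an unknown format.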

Implementing interfaces on discriminated unions isn’t all that common; but if you need to do it, now you know how!


February 25, 2012

Discriminated unions in F#

F# has lots of different ways of expressing composite data: tuples, records, classes, etc.  One way that doesn’t have a direct equivalent in C# or Visual Basic is the discriminated union.

A discriminated union is used when there are different ‘cases’ of the data type, potentially with different kinds of data depending on the case.  That probably sounds rather abstract, so let’s see a simple example:

type Variant =
       | Numeric of int
       | Text of string
       | Empty

There are three ‘cases’ of the Variant type: the Numeric case, which carries an integer value; the Text case, which carries a string value; and the Empty case, which doesn’t carry any additional data.

We can create instances of the Variant type using the ‘constructors’ corresponding to the cases:

> let one = Numeric 1;;
val one : Variant = Numeric 1
> let hello = Text "hello";;
val hello : Variant = Text "hello"
> let empty = Empty;;
val empty : Variant = Empty

If a case carries additional data, then we pass that additional data to the constructor.  If it doesn’t, then the case is effectively a singleton instance, so we’re really just doing a variable assignment.

To process the Variant type, we can use pattern matching:

let print v =
    match v with
    | Numeric n -> printfn "Num %d" n
    | Text s    -> printfn "Txt %s" s
    | Empty     -> printfn "Empty"

Now we’ve seen discriminated unions at work, let’s see how they compare to that ubiquitous C# data structure, the class.

In C#, Variant would be an abstract base class, with three subclasses, Numeric, Text and Empty.  Numeric would have an integer field, Text would have a string field, and Empty would have no fields.

This already points up one important difference between discriminated unions and class hierarchies: discriminated unions are closed.  Users of the F# Variant type can’t create a new ‘case,’ whereas users of the C# Variant type can create a new derived class.  So discriminated unions are good for scenarios where you want to control the set of permitted cases; if you want the type to be extensible, you need to go to a class.

Ignoring this detail, here’s a naive stab at implementing the Variant type in C#:

public abstract class Variant
{
}

public class Numeric : Variant
{
   public int Value { get; set; }
}

// similarly for Text and Empty subclasses

However, this implementation has quite different behaviour from the F# discriminated union.  Discriminated unions have value semantics: that is, an instance of Variant is a value just like an instance of Int32 is a value.  Two Variants that are both Numeric 99 have the same value, and so should be considered equal; furthermore, Numeric 100 is a different value from Numeric 99, and must therefore be a different instance.  The first observation means that the Variant types need to implement custom equality; the second means they have to be immutable.  (The immutability issue may seem a bit subtle, but the reasons for wanting immutability go deep; maybe a subject for a future post.)  So our naive implementation needs quite a bit of fixing:

public abstract class Variant : IEquatable<Variant>
{
   public override bool Equals(object other) { … }
   public bool Equals(Variant other) { /* don’t ask */ }
   public override int GetHashCode() { … }
}

public class Numeric : Variant
{
   private readonly int _value;

   public Numeric(int value)
   {
     _value = value;
   }

   public int Value
   {
     get { return _value; }
   }
}

// similarly for Text subclass

This points up a second important difference between discriminated unions and classes: discriminated unions are much easier to write!  The F# Variant type was entirely defined and implemented in four lines.  In the C# Variant class, it takes four lines just to implement a single property of a single subclass (one for the field, one for the constructor assignment, one for the property declaration and one for the property getter implementation)!  And good luck writing a bug-free Equals method on your first go (and maintaining it as you add new classes).  Discriminated unions give you value semantics for free; class hierarchies make you work for it.
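The ‘value semantics for free’ claim is easy to check.  Here is a small sketch that repeats the Variant definition and compares a few values; the names a and b are just for illustration.

```fsharp
type Variant =
    | Numeric of int
    | Text of string
    | Empty

let a = Numeric 99
let b = Numeric 99

// Equality is structural: same case and same data means equal.
printfn "%b" (a = b)                     // true
printfn "%b" (a = Numeric 100)           // false

// Hash codes are consistent with equality, so unions work as dictionary keys.
printfn "%b" (hash a = hash b)           // true

// Comparison is generated as well.
printfn "%b" (Numeric 1 < Numeric 2)     // true
```

None of this needed an Equals, GetHashCode or IComparable implementation to be written by hand.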

On the other hand, the fact that C# makes you write all that code out by hand does give you more control over the behaviour of your type.  If you want to ensure that numeric variants are always positive, you can add a check in the Numeric constructor.  You can’t do that in a discriminated union’s case constructor: Numeric -1 always succeeds.  (The nearest equivalent is to make the cases private and expose a checking function, which gives up some of the convenience.)  A discriminated union is a dumb container of values; a class is a smart container of business logic.
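A union case constructor can’t run a check itself, but a check can be layered on top by hiding the cases.  The following is only a sketch of that workaround, not something from the original discussion: the Variants module and the numeric function are hypothetical names, and the cost is that code outside the module can no longer construct or pattern match on the cases directly.

```fsharp
module Variants =
    // 'private' hides the cases from code outside this module.
    type Variant =
        private
        | Numeric of int
        | Text of string
        | Empty

    // Hypothetical checked creation function standing in for the constructor.
    let numeric n =
        if n > 0 then Numeric n
        else invalidArg "n" "numeric variants must be positive"

let five = Variants.numeric 5

// Rejected at run time by the check in the creation function.
let rejected =
    try
        Variants.numeric (-1) |> ignore
        false
    with :? System.ArgumentException -> true

printfn "%b" rejected
```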

Anyway, now we’ve got the Variant type implemented, it’s time to do some work with it.  But thinking about implementing the ‘print variant’ function above, there are two very different ways we could tackle this.  The idiomatic approach in C# is to make Print a virtual (here, abstract) method on the Variant base class and override it in the derived classes:

public abstract class Variant
{
   public abstract void Print();
}

public class Numeric : Variant
{
   // _value field, constructor and Value property as before

   public override void Print()
   {
     Console.WriteLine("Num " + Value);
   }
}

// Similarly for Text and Empty subclasses

This works fine as long as you’re the one defining the class hierarchy.  (It’s a bit verbose, because it repeats the ‘public override void Print()’ signature in every derived class.  But it does work.)  Unfortunately, if you have to add a function to a class hierarchy that somebody else has defined, then things get very dreary:

public static void Print(Variant v)
{
   Numeric n = v as Numeric;
   if (n != null)
   {
     Console.WriteLine("Num " + n.Value);
     return;
   }

   // similarly for Text and Empty subclasses
}

With a discriminated union, you don’t get the choice: there’s no way to define a virtual method and override it on individual cases, so you always have to take the separate function approach.  But thanks to pattern matching, implementing a function across the cases of a discriminated union in F# doesn’t incur all the intrusive logic of the C# version, and it’s actually more concise than even the C# virtual method version.
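To illustrate the ‘separate function’ approach, here is a sketch that adds two more functions over Variant without touching the type.  The describe and isEmpty names are made up for this example.

```fsharp
type Variant =
    | Numeric of int
    | Text of string
    | Empty

// Same shape as the print function, but returning a string.
let describe v =
    match v with
    | Numeric n -> sprintf "Num %d" n
    | Text s    -> sprintf "Txt %s" s
    | Empty     -> "Empty"

// Cases that are handled the same way can share a branch.
let isEmpty v =
    match v with
    | Empty              -> true
    | Numeric _ | Text _ -> false

printfn "%s" (describe (Numeric 1))
printfn "%b" (isEmpty Empty)
```

If a case is left out of a match, the compiler reports an incomplete pattern match warning, which is some compensation for not having the compiler enforce an abstract method override.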

To sum up, discriminated unions provide a more compact and efficient way than full-fat classes of representing simple heterogeneous data.  You can implement a discriminated union in far less code than a class, and the code to work with a discriminated union is also more concise than the class equivalent.  However, discriminated unions are suitable only when the set of possible cases is known in advance – they aren’t openly extensible the way a class hierarchy is – and they don’t provide a way to encapsulate behaviour within the type.  For example, you would never make the Control hierarchy a discriminated union, because users need to be able to define their own range of controls, and the new control types need to carry their own behaviour (e.g. rendering) within them.  On the other hand, for types like the F# Option type (used for ‘either a value of x, or no value’), the Haskell Either type (traditionally used to express ‘either a result of x, or a failure reason of y’), or even much wider unions like LINQ expression types or node types in an Abstract Syntax Tree, discriminated unions are an ideal choice.
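As a small illustration of those last examples, here is a sketch using the built-in Option type and a hand-rolled Either.  The Either type defined here mirrors the Haskell convention (Left for the failure reason, Right for the result); it and the tryHead and safeDivide functions are written for this example and are not part of the F# library.

```fsharp
// Option is built in: Some value or None.
let tryHead xs =
    match xs with
    | x :: _ -> Some x
    | []     -> None

// A hypothetical Either, following the Haskell convention.
type Either<'a, 'b> =
    | Left of 'a
    | Right of 'b

let safeDivide x y =
    if y = 0 then Left "division by zero"
    else Right (x / y)

let show result =
    match result with
    | Left reason -> sprintf "failed: %s" reason
    | Right value -> sprintf "ok: %d" value

printfn "%A" (tryHead [1; 2; 3])
printfn "%s" (show (safeDivide 10 2))
printfn "%s" (show (safeDivide 1 0))
```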
