
January 22, 2009

Wellington .NET user group: C# 4 and CLR 4.0

Thanks to everyone who came along to the C# 4 talk last night! For those who are interested, here are the slides.

C# and .NET 4.0 slides (PPTX format, 156K) (you may need to do a right-click and Save As)

January 22, 2009 in Software

January 10, 2009

Scala for C# programmers, part 3: pass by name

"You're only on part 3 and you're already reduced to writing about calling conventions?  You suck!  Do another post about chimney sweeps being hunted by jars of marmalade!"

Silence, cur.  Pass by name is not as other calling conventions are.  Pass by name, especially in conjunction with some other rather theoretical-sounding Scala features, is your gateway to the wonderful world of language extensibility.

What is Passing By Name?

First, let's talk about what we mean by calling convention.  A calling convention describes how stuff gets passed to a method by its caller.  In the good old days, this used to mean exciting things like which arguments got passed in registers and who was responsible for resetting the stack pointer.  Sadly, the days of being able to refer to "naked fun calls" and make the Beavis and Butthead noise are consigned to history: in modern managed environments, the runtime takes care of all this guff and the main distinction is whether data is passed by value or by reference.  (The situation on the CLR is slightly complicated by the need to differentiate passing values by value, values by reference, references by value and references by reference, but I'm not going to go into that because (a) it's irrelevant to the subject at hand and (b) that's Jon Skeet's turf and I don't want him to shank me.  Again.)

In pass by value, the called method gets a copy of whatever the caller passed in.  Arguments passed by value therefore work like local variables that are initialised before the method runs: when you do anything to them, you're doing it to your own copy.

In pass by reference, the called method gets a reference to the caller's value.  When you do anything to a pass-by-reference argument, you're doing it to the caller's data. 

In pass by name, the called method gets... well, it's a bit messy to explain what the called method gets.  But when the called method does anything to the argument, the argument gets evaluated and the "anything" is done to that.  Crucially, evaluation happens every time the argument gets mentioned, and only when the argument gets mentioned.

Not Just Another Calling Convention

Why does this matter?  It matters because there are functions you can't implement using pass by value or pass by reference, but you can implement using pass by name.

Suppose, for example, that C# didn't have the while keyword.  You'd probably want to write a method that did the job instead:

public static void While(bool condition, Action body)
{
  if (condition)
  {
    body();
    While(condition, body);
  }
}

What happens when we call this?

long x = 0;
While(x < 10, () => x = x + 1);

C# evaluates the arguments to While and invokes the While method with the arguments true and () => x = x + 1.  After watching the CPU sit on 100% for a while you might check on the value of x and find it's somewhere north of a trillion.  Why?  Because the condition argument was passed by value, so whenever the While method tests the value of condition, it's always true.  The While method doesn't know that condition originally came from the expression x < 10; all While knows is that condition is true.

For the While method to work, we need it to re-evaluate x < 10 every time it hits the condition argument.  While needs not the value of the argument at the call site, nor a reference to the argument at the call site, but the actual expression that the caller wants it to use to generate a value.

Same goes for short-circuit evaluation.  If you want short-circuit evaluation in C#, your only hope is to get on the blower to Anders Hejlsberg and persuade him to bake it into the language:

bool result = (a > 0 && Math.Sqrt(a) < 10);
double result = (a < 0 ? Math.Sqrt(-a) : Math.Sqrt(a));

You can't write a function like && or ?: yourself, because C# will always try to evaluate all the arguments before calling your function.  Consider a VB exile who wants to reproduce his favourite keywords in C#:

bool AndAlso(bool condition1, bool condition2)
{
  return condition1 && condition2;
}

T IIf<T>(bool condition, T ifTrue, T ifFalse)
{
  if (condition)
    return ifTrue;
  else
    return ifFalse;
}

But when C# hits one of these:

bool result = AndAlso(a > 0, Math.Sqrt(a) < 10);
double result = IIf(a < 0, Math.Sqrt(-a), Math.Sqrt(a));

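it would try to evaluate all the arguments at the call site, and pass the results of those evaluations to AndAlso or IIf.  There's no short-circuiting.  So the AndAlso call would take the square root of a negative number if a were negative, and the IIf call would do the same if a were anything other than 0.  (Math.Sqrt is forgiving enough to hand back NaN rather than throw, but swap in something that does throw, such as an integer division by zero or a null dereference, and you have a crash.)  Again, what you want is for the condition1, condition2, ifTrue and ifFalse arguments to be evaluated by the callee if it needs them, not for the caller to evaluate them before making the call.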

And that's what pass by name does.  A parameter passed by name is not evaluated when it is passed to a method.  It is evaluated -- and re-evaluated -- when the called method evaluates the parameter; specifically when the called method requests the value of the parameter by mentioning its name.  This might sound weird and academic, but it's the key to being able to define your own control constructs.

Using Pass By Name in Scala

Let's see the custom while implementation again, this time with Scala "pass by name" parameters:

def mywhile(condition: => Boolean)(body: => Unit): Unit =
  if (condition) {
    body
    mywhile(condition)(body)
  }

We can call this as follows:

var i = 0
mywhile (i < 10) {
  println(i)
  i += 1
}

Unlike the C# attempt, this prints out the numbers from 0 to 9 and then terminates as you'd wish.

Pass by name also works for short-circuiting:

import System.Math._

def andAlso(condition1: => Boolean, condition2: => Boolean): Boolean =
  condition1 && condition2

val d = -1.234
val result = andAlso(d > 0, Sqrt(d) < 10)

The andAlso call returns false without ever taking the square root, because Sqrt(d) < 10 never gets evaluated.
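To make the "every time it gets mentioned, and only when it gets mentioned" rule concrete, here is a small sketch that counts how often the argument expression actually runs.  The names ByNameDemo, evaluations, tick, twice, never and twiceByValue are all made up for illustration; nothing here comes from a library.

```scala
object ByNameDemo {
  // Hypothetical counter: bumped each time the argument expression runs.
  var evaluations = 0

  def tick(): Int = {
    evaluations += 1
    evaluations
  }

  // Mentions its by-name argument twice, so the expression runs twice.
  def twice(arg: => Int): Int = arg + arg

  // Never mentions its argument, so the expression never runs.
  def never(arg: => Int): String = "did not look"

  // By-value version for comparison: the expression runs once, before the call.
  def twiceByValue(arg: Int): Int = arg + arg
}
```

Starting from a fresh counter, never(tick()) leaves evaluations untouched, twice(tick()) bumps it by two, and twiceByValue(tick()) bumps it by one.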

There Are No Miracles in Our Profession, Trurl!

What's going on here?  What's the weird colon-and-pointy-sticks syntax?  What is actually getting passed to mywhile and andAlso to make this work?

The answer is a bit surprising.  Nothing is going on here.  This is the normal Scala function parameter syntax.  There is no "pass by name" in Scala.

Here's a bog-standard pass by value Scala function declaration:

def myFunc1(i: Int) : Unit = ...

Takes an integer, returns void: easy enough.  Here's another:

def myFunc2(f: Int => Boolean) : Unit = ...

Even if you've not seen this kind of expression before, it's probably not too hard to guess what this means.  This function takes a function from Int to Boolean as its argument.  In C# terms, void MyFunc2(Func<int, bool> f).  We could call this as follows:

myFunc2 { (i: Int) => i > 0 }

So now you can guess what this means:

def myFunc3(f: => Boolean) : Unit = ...

Well, if myFunc2 took an Int-to-Boolean function, myFunc3 must be taking a blank-to-Boolean function -- a function that takes no arguments and returns a Boolean.  In short, a conditional expression.  So we can call myFunc3 as follows:

val j = 123
myFunc3 { j > 0 }

The squirly brackets are what we'd expect from an anonymous function, and because the function has no arguments Scala doesn't make us write { () => j > 0 }, even though that's what it means really.  The anonymous function has no arguments because j is a captured local variable, not an argument to the function.  But there's more.  Scala also lets us call myFunc3 like this:

val j = 123
myFunc3(j > 0)

This is normal function call syntax, but the Scala compiler realises that myFunc3 expects a nullary function (a function with no arguments) rather than a Boolean, and therefore treats myFunc3(j > 0) as shorthand for myFunc3(() => j > 0).  This is the same kind of logic that the C# compiler uses when it decides whether to compile a lambda expression to a delegate or an expression tree.

You can probably figure out where it goes from here:

def myFunc4(f1: => Boolean)(f2: => Unit) : Unit = ...

This takes two functions: a conditional expression, and a function that takes no arguments and returns no value (in .NET terms, an Action).  Using our powers of anticipation, we can imagine how this might be called using some unholy combination of the two syntaxes we saw for calling myFunc3:

var j = 123;
myFunc4(j > 0) { println(j); j -= 1; }

We can mix and match the () and {} bracketing at whim, except that we have to use {} bracketing if we want to batch up multiple expressions.  For example, you could equally well write any of the following:

myFunc4 { j > 0 } { println(j); j -= 1; }
myFunc4 { println(j); j > 0 } (j -= 1)
myFunc4 { println(j); j > 0 } { j -= 1 }

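And we'll bow to the inevitable by supplying a body for this function.  (Treat this first version as pseudocode: it spells the calls out explicitly to show what is happening, but the Scala compiler won't actually accept f1() on a => Boolean parameter.)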

def myFunc5(f1: => Boolean)(f2: => Unit) : Unit =
  if (f1()) {
    f2()
    myFunc5(f1)(f2)
  }

Written like this, it's clear that f1 is getting evaluated each time we execute the if statement, but is getting passed (as a function) when myFunc5 recurses.  But Scala allows us to leave the parentheses off function calls with no arguments (and for => parameters it insists on it), so what we really write is:

def myFunc5(f1: => Boolean)(f2: => Unit) : Unit =
  if (f1) {
    f2
    myFunc5(f1)(f2)
  }

Again, type inference allows Scala to distinguish the evaluation of f1 in the if statement from the passing of f1 in the myFunc5 recursion.

And with a bit of renaming, that's mywhile.  There's no separate pass by name convention: just the usual closure behaviour of capturing local variables in an anonymous method or lambda, a bit of syntactic sugar for nullary functions (functions with no arguments), just like C#'s syntactic sugar for property getters, and the Scala compiler's ability to recognise when a closure is required instead of a value.
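If you'd rather see the closures with the sugar scraped off, here is a sketch of the same loop written against explicit () => Boolean and () => Unit parameters, which is roughly what the => version boils down to.  The names ExplicitWhile, explicitWhile and countTo are mine, and the @tailrec annotation is my addition so that a long loop doesn't eat the stack.

```scala
import scala.annotation.tailrec

object ExplicitWhile {
  // Same shape as mywhile, but the caller has to build the closures by hand.
  @tailrec
  def explicitWhile(condition: () => Boolean)(body: () => Unit): Unit =
    if (condition()) {
      body()
      explicitWhile(condition)(body)
    }

  // Hypothetical helper: collects 0 up to (but not including) limit.
  def countTo(limit: Int): List[Int] = {
    var i = 0
    var seen = List.empty[Int]
    explicitWhile(() => i < limit) { () =>
      seen = i :: seen
      i += 1
    }
    seen.reverse
  }
}
```

Note how the call inside countTo has picked up the () => clutter that the by-name version lets you leave out.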

In fact, armed with this understanding of the Scala "syntax," we can easily map it back to C#:

void While(Func<bool> condition, Action body)
{
  if (condition())
  {
    body();
    While(condition, body);
  }
}

int i = 0;
While(() => i < 10, () =>
{
  Console.WriteLine(i);
  ++i;
});

The implementation of the While method in C# is, to my eyes, a bit clearer than the Scala version.  However, the syntax for calling the While method in C# is clearly way more complicated and less natural than the syntax for calling mywhile in Scala.  Calling mywhile in Scala was like using a native language construct.  Calling While in C# required a great deal of clutter at the call site to prevent C# from trying to treat i < 10 as a once-and-for-all value, and to express the body at all.

So that's so-called "pass by name" demystified: the Scala Web site, with crushing mundanity, demotes it to "automatic type-dependent closure construction," which is indeed exactly how it works.  As we've seen, however, this technical-sounding feature is actually essential to creating nice syntax for your own control constructs.  We'll shortly see how this works together with other Scala features to give you even more flexibility in defining your construct's syntax.

January 10, 2009 in Software

January 08, 2009

Scala for C# programmers, part 2: singletons

In C#, if you want to create a singleton object, you have to create a class, then stop evildoers creating their own instances of that class, then create and provide an instance of that class yourself.  While this is hardly a Burma Railway of the programming craft, it does feel like pushing against the grain of the language.  Nor is it great for maintainers, who have to be able to recognise a singleton by its spoor ("Private constructor... public static readonly field... hand me the elephant gun, Carruthers."), or for clients, who have to use a slightly clumsy multipart syntax to refer to the singleton (e.g. Universe.Instance).

What would be easier for all concerned would be if you could just declare objects as singletons.  That is, instead of writing class Universe and public static readonly Universe Instance, you could just write object Universe.

And that's exactly what Scala allows you to do:

object Universe {
  def contains(obj: Any): Boolean = true  // duh
}

val v = Universe.contains(42)

What's going on behind the scenes here?  It pretty much goes without saying that the Scala compiler is creating a new type for the singleton object.  In fact it creates two types, one for the implementation and one for the interface.  The interface looks like a .NET static class (actually, the .NET 1.x equivalent, a sealed class with only static members).  Thus, a C# program would call the example above as Universe.contains(42).

Singleton objects are first-class citizens in Scala, so they can for example derive from classes.  This is a nice way of creating special values with custom behaviour: you don't need to create a whole new type, you just define an instance and override methods in it:

abstract class Cat {
  def humiliateSelf()
}

object Slats extends Cat {
  def humiliateSelf() { savage(this.tail) }
}

Obviously this is a frivolous example, but "special singletons" turn out to be an important part of the functional idiom, for example for bottoming out recursion.  Scala by Example (PDF) describes an implementation of a Set class which is implemented as a tree-like structure ("left subset - member - right subset"), and methods such as contains() work by recursing down to the child sets.  For this to work, you need an EmptySet whose implementation (state) and behaviour are quite different from non-empty sets -- e.g. contains() just returns false instead of trying to delegate to non-existent child sets.  Since EmptySet is logically unique it is both simpler and more efficient to represent it as a singleton: i.e. to declare object EmptySet instead of class EmptySet.
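Here is a minimal sketch of that tree-shaped set, restricted to Ints to keep it short.  It is my reconstruction of the idea rather than a quote from the PDF: the names SetDemo, IntSet, NonEmptySet and incl are assumptions, and the only point is to show recursion bottoming out in an object.

```scala
object SetDemo {
  // A binary-tree set of Ints: "left subset - member - right subset".
  abstract class IntSet {
    def contains(x: Int): Boolean
    def incl(x: Int): IntSet
  }

  // The one and only empty set: recursion bottoms out here.
  object EmptySet extends IntSet {
    def contains(x: Int): Boolean = false
    def incl(x: Int): IntSet = new NonEmptySet(x, EmptySet, EmptySet)
  }

  class NonEmptySet(elem: Int, left: IntSet, right: IntSet) extends IntSet {
    // Delegate to a child set, which may turn out to be EmptySet.
    def contains(x: Int): Boolean =
      if (x < elem) left.contains(x)
      else if (x > elem) right.contains(x)
      else true

    // Sets are immutable: adding an element builds a new path down the tree.
    def incl(x: Int): IntSet =
      if (x < elem) new NonEmptySet(elem, left.incl(x), right)
      else if (x > elem) new NonEmptySet(elem, left, right.incl(x))
      else this
  }
}
```

Every leaf of every tree shares the same EmptySet, so there is no per-leaf allocation and no null check anywhere in contains.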

In fact the whole thing can become alarmingly deep: Scala by Example also includes a description of Boolean as an abstract class, and True and False as singleton objects which extend Boolean and provide appropriate implementations of the ifThenElse method.  And fans of Giuseppe Peano should definitely check out the hypothetical implementation of Int...
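For the curious, the Boolean trick can be sketched in a few lines.  This is an illustration of the idea, not the text from Scala by Example: I've used the made-up names Bool, Yes and No so as not to collide with the real Boolean, and the and and not methods are my additions.  Note that ifThenElse only works because its branches are passed by name.

```scala
object BoolDemo {
  abstract class Bool {
    // Branches are by-name so that only the chosen one gets evaluated.
    def ifThenElse[T](thenPart: => T, elsePart: => T): T

    // Short-circuiting "and": other is only evaluated when this is Yes.
    def and(other: => Bool): Bool = ifThenElse(other, No)
    def not: Bool = ifThenElse(No, Yes)
  }

  // The singleton objects differ only in which branch they pick.
  object Yes extends Bool {
    def ifThenElse[T](thenPart: => T, elsePart: => T): T = thenPart
  }

  object No extends Bool {
    def ifThenElse[T](thenPart: => T, elsePart: => T): T = elsePart
  }
}
```

There is no if statement anywhere in the sketch: the choice between branches is made entirely by virtual dispatch on the two singletons.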

January 8, 2009 in Software

January 03, 2009

Scala for C# programmers, part 1a: mixins and traits, behind the scenes

I wrote yesterday about how Scala's equivalent to interfaces, traits, could include implementation.  Given that you can use DLLs built in Scala from C#, this naturally prompts the idea of using Scala traits to get maximal interfaces with minimal implementation in C#:

1. Define trait in Scala and compile into DLL.
2. Reference DLL from C# project.
3. Create C# class which implements the trait.
4. Implement only the abstract members.
5. Get the implemented members for free.
6. Take over the world!

Don't take out the home loan on that extinct volcano just yet, though.  Like all plans to take over the world, this one is just a little optimistic.

In Scala, a trait is a type, but at the CLR level, it compiles to two types: an interface representing the trait, and a class containing any implementation.  The class is required because a CLR interface can't contain implementation.  (Scala could emit the trait as an abstract class, but then a class wouldn't be able to extend more than one trait because of the CLR's single inheritance limitation.)  The Enumerable example, therefore, if it were written in C#, would actually look like this (slightly simplified):

public interface Enumerable
{
  Enumerator getEnumerator();
  int count();
}

public class Enumerable$class
{
  public static int count(Enumerable $this)
  {
    /* implementation */
  }
}

So if you were to create a C# class that implemented Enumerable, your class would have to implement the count() method itself, even though the trait included a default count() implementation when it was defined in Scala.

The Scala compiler, on the other hand, knows to look for the trait implementation class, and sees that it provides an implementation for count().  It therefore automatically adds a count() method to any class that extends Enumerable.  The compiler-supplied method body just calls the count() method of Enumerable$class.  That is, the mixing in of the trait implementation is a Scala-specific language feature, performed at compile time by Scala and not visible to the .NET type system.
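You can mimic the compiler's trick by hand, which makes it easier to see why a C# implementer gets nothing for free.  The sketch below is written in Scala but deliberately avoids putting any implementation in the trait; the names MixinEncoding, EnumerableLike, EnumerableImpl and ThreeItems are hypothetical, and the real generated code will differ in detail.

```scala
object MixinEncoding {
  trait Enumerator { def moveNext(): Boolean }

  // Plays the role of the CLR interface: signatures only.
  trait EnumerableLike {
    def getEnumerator(): Enumerator
    def count(): Int
  }

  // Plays the role of Enumerable$class: a static method taking "this" explicitly.
  object EnumerableImpl {
    def count(self: EnumerableLike): Int = {
      var c = 0
      val e = self.getEnumerator()
      while (e.moveNext()) c += 1
      c
    }
  }

  // What an implementing class looks like once the forwarder has been added.
  class ThreeItems extends EnumerableLike {
    def getEnumerator(): Enumerator = new Enumerator {
      private var remaining = 3
      def moveNext(): Boolean = { remaining -= 1; remaining >= 0 }
    }
    // The forwarder: Scala would write this line for you, C# will not.
    def count(): Int = EnumerableImpl.count(this)
  }
}
```

A C# class implementing the interface could of course write the same one-line forwarder itself, but it has to know to do so.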

This shouldn't come as a surprise to anyone, but it is a nice illustration of the way that languages can add features over and above what is built into the CLR and specified in the CTS, and of the fact that such features won't necessarily map into other CLR languages.

January 3, 2009 in Software

January 02, 2009

Scala for C# programmers, part 1: mixins and traits

Okay, so despite the pain of getting it working on .NET, there's still a lot of buzz around Scala.  Why is that?

Scala is a hybrid of functional and object-oriented languages.  Its functional aspects make it very expressive when writing algorithmic code, and play nicely with the brave new world of concurrency; its object-oriented aspects keep it familiar and convenient when creating business objects or other stateful models.  Moreover, because it embraces object-orientation as a first-class idiom, Scala is able to use platform (Java or .NET) libraries very naturally, and conversely programs written in C# or VB.NET can easily use assemblies written in Scala.

That said, remember that Scala's primary platform is the Java virtual machine, and some of the interest in Scala comes from Java programmers' interest in features such as type inference, comprehensions and lambdas, with which C# 3 programmers are already familiar.  So what's left that might be of interest specifically to C# programmers?

I was going to answer this with a big long list, but it started turning into an epic, so instead I'm going to take it in smaller doses, starting with mixins and traits.

Motivation

Interfaces in C# and Java play a dual role.  First, they are a set of capabilities that an implementer has to, well, implement.  Second, they are a feature set that a client can use.  These two roles are at loggerheads.  The first means that interfaces want to be minimal, so that implementers don't have to implement a whole lot of superfluous and redundant guff.  The second means that interfaces want to be maximal, so that clients don't have to clog themselves up with boilerplate utility methods.

Consider, for example, IEnumerable (and its sister interface IEnumerator).  This is a very minimal interface: implementers just need to be able to produce values in sequence.  But this minimalism means that clients of IEnumerable need to write the same old boilerplate again and again and again: foreach loops to filter, foreach loops to call a method on each element of the sequence, foreach loops to aggregate, foreach loops to check whether all elements meet a criterion, or to find the first member that meets a criterion, or...  This is frustrating because the implementations of "filter," "apply," "aggregate," and so on are always the same.  Of course, we could put these methods into concrete types (List<T> includes several), but then those concrete types will contain duplicate code, and users who only have an IEnumerable will still miss out.  And yet we can't put these methods into the interface because then every implementer of IEnumerable would have to implement them -- and they'd end up writing the same boilerplate, just now in all the zillions of IEnumerable classes instead of their clients.

The C# and Scala Solutions

We could resolve this tension if we had a way for interfaces to contain implementation: for example, if IEnumerable required the implementer to provide the class-specific iteration functionality, but then provided the standard implementations of "filter," "apply", "aggregate" and so on automatically:

public pseudo_interface IEnumerable
{
  IEnumerator GetEnumerator();  // must be implemented
  IEnumerable Filter(Predicate predicate)  // comes for free
  {
    foreach (object o in this)
      if (predicate(o))
        yield return o;
  }
}

C# 3 addresses this using extension methods: the methods mentioned above are all in fact included as extension methods on IEnumerable<T> as part of LINQ.  This has some advantages over the approach described above: specifically, the "standard methods" aren't bound up in the interface, so you can add your own methods instead of being limited to the ones that the interface author has included.  On the other hand, it means that method implementations have to be packaged in a different class from the interface, which feels less than modular.

Scala takes a different approach.  A Scala trait can contain a mix of abstract interface and concrete implementation.  (It can also be a pure interface.)  Here's a Scala trait that represents objects that can be compared and ordered:

trait Ord {
  def < (that: Any): Boolean
  def <=(that: Any): Boolean = (this < that) || (this == that)
  def > (that: Any): Boolean = !(this <= that)
  def >=(that: Any): Boolean = !(this < that)
}

Orderable objects need to extend Ord, but only need to implement <.  They then get the other operators for free, implemented automatically by Ord in terms of <.

class Date extends Ord {
  def < (that: Any): Boolean = /* implementation */
}

// can now write myDate >= yourDate
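To flesh out the placeholder, here is a sketch of a Date that can actually be compiled and compared.  The fields (year, month, day), the equals and hashCode overrides, and the decision to throw on a non-Date argument are all my assumptions; the trait itself is unchanged.

```scala
object OrdDemo {
  trait Ord {
    def < (that: Any): Boolean
    def <=(that: Any): Boolean = (this < that) || (this == that)
    def > (that: Any): Boolean = !(this <= that)
    def >=(that: Any): Boolean = !(this < that)
  }

  // Hypothetical Date: three Int fields, compared field by field.
  class Date(val year: Int, val month: Int, val day: Int) extends Ord {
    // <= relies on ==, so equality has to be structural.
    override def equals(that: Any): Boolean = that match {
      case d: Date => year == d.year && month == d.month && day == d.day
      case _       => false
    }
    override def hashCode(): Int = (year, month, day).hashCode()

    // The only operator we have to write ourselves.
    def < (that: Any): Boolean = that match {
      case d: Date =>
        year < d.year ||
          (year == d.year && (month < d.month ||
            (month == d.month && day < d.day)))
      case _ => sys.error("cannot compare " + that + " with a Date")
    }
  }
}
```

With that in place, new Date(2009, 1, 2) >= new Date(2009, 1, 2) holds even though Date never mentions >=.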

The terms trait and mixin (because you are "mixing in" the functionality of Ord into Date) are both used in Scala: I think trait refers to the interface, and mixin to when a class extends a trait containing concrete implementation, but I may be missing the nuances of the exact usage.

Scala Traits vs. C# Extension Methods

Okay, so Scala has a different way of packaging standard implementations from C#'s extension methods.  It's different, but why is it interesting?  Well, there are a couple of things that you can do with Scala traits that don't fall nicely out of the extension methods approach.

First, you can override the default implementations of trait members, to take advantage of additional information or capabilities available in the implementing type.  Let's look at another IEnumerable example, recast as a Scala trait:

trait Enumerable {
  def getEnumerator(): Enumerator
  def count: Int = {
    // Shockingly bad style; for illustrative purposes only
    var c = 0
    val e = getEnumerator()
    while (e.moveNext()) c = c + 1
    c
  }
}

This (ignoring style issues for now) is the only fully general implementation we can provide for count: it will work with any Enumerable.  But for collections that know their sizes, such as arrays or List<T>, it's gruesomely inefficient.  It iterates over the entire collection, counting elements one by one, when it could just consult the size member and return that.  Let's fix that:

class MyList extends Enumerable {
  private var _size: Int = 0
  def getEnumerator(): Enumerator = /* ... */
  override def count: Int = _size
}

The count member of the Enumerable trait works like a virtual method: it can be overridden in classes which implement/derive from Enumerable.

Compare this to the Count() extension method on IEnumerable<T> in LINQ.  This achieves the same effect by trying to cast to ICollection<T>, which is fine as far as it goes but isn't extensible.  Suppose you create an enumerable class that can count itself quickly but isn't a collection -- for example a natural numbers range object.  With a Scala trait, the NumberRange type could provide an efficient override of count, just like any other virtual method; with C# extension methods, Enumerable.Count() would have to somehow know about the NumberRange type in advance, or fall back on counting elements one by one.
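Here is what that NumberRange might look like as a sketch.  The Enumerator trait, the from and to constructor parameters, and the enumeratorsCreated counter (there only so we can check that the fast path never iterates) are assumptions of mine rather than anything from a real library.

```scala
object CountDemo {
  trait Enumerator { def moveNext(): Boolean }

  trait Enumerable {
    def getEnumerator(): Enumerator
    // General-purpose fallback: walk the whole sequence.
    def count: Int = {
      var c = 0
      val e = getEnumerator()
      while (e.moveNext()) c += 1
      c
    }
  }

  // Hypothetical range of integers from..to inclusive; not a collection.
  class NumberRange(from: Int, to: Int) extends Enumerable {
    var enumeratorsCreated = 0 // lets us see whether anyone iterated

    def getEnumerator(): Enumerator = {
      enumeratorsCreated += 1
      new Enumerator {
        private var next = from
        def moveNext(): Boolean = { next += 1; next <= to + 1 }
      }
    }

    // The range knows its own size, so no iteration is needed.
    override def count: Int = if (to < from) 0 else to - from + 1
  }
}
```

Asking a million-element NumberRange for its count never creates an enumerator, and nothing in the Enumerable trait had to know that NumberRange exists.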

Second, with Scala you can choose a trait implementation when you instantiate an object, rather than having it baked in at the class level once and for all.  Suppose you're creating a MyList instance, but you want it to puff itself up to look bigger so as to frighten other MyList instances off its territory.  (This example would probably work better with fish, but we're stuck with Enumerables now.  Work with me here.)  In C#, you'd need to create a PuffedUpMyList class and override the Count property.  In Scala, you can just mix in a PuffedUp version of the trait:

trait PuffedUp extends Enumerable {
  override def count: Int = super.count + 100
}

val normal = new MyList
Console.WriteLine(normal.count)  // meh
val puffedUp = new MyList with PuffedUp
Console.WriteLine(puffedUp.count)  // eek!

As you can imagine this gives us much better granularity and composability of traits and implementations than we get from the extension methods approach, or indeed from single implementation inheritance type systems in general.

So Scala traits have some distinct advantages over extension methods.  The only downside appears to be the inability for clients to add their own methods to a trait after the fact.  Fortunately, you can work around this in Scala using its implicit conversions feature.  The delightfully named Pimp My Library pattern effectively gives us back extension methods, albeit less elegantly, giving Scala programmers the best of both worlds.
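In case the pattern is new to you, here is a minimal sketch of it.  The Enumerable here is a cut-down stand-in with a made-up items member, and RichEnumerable, enrich, sum and demo are names I've invented; the language import only exists to keep newer compilers from grumbling about implicit conversions.

```scala
import scala.language.implicitConversions

object PimpDemo {
  // Stand-in for a library trait we are not allowed to edit.
  trait Enumerable { def items: List[Int] }

  // Wrapper carrying the method we wish the trait had.
  class RichEnumerable(self: Enumerable) {
    def sum: Int = self.items.foldLeft(0)(_ + _)
  }

  // The implicit conversion that makes e.sum compile for any Enumerable.
  implicit def enrich(e: Enumerable): RichEnumerable = new RichEnumerable(e)

  def demo(): Int = {
    val e = new Enumerable { def items = List(1, 2, 3) }
    e.sum // the compiler rewrites this to enrich(e).sum
  }
}
```

The catch, compared with a trait member, is that sum is resolved statically on the wrapper, so an implementing class can't override it.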

January 2, 2009 in Software

Fear and loathing in Scala 2.7.2 for .NET

Spoke too soon.  Apparently the Scala compilation process for .NET is actually meant to be this horrible.

I'm not sure why the stated reason ("separate compilation") is considered desirable (at least for users, as opposed to for compiler writers), but I'm guessing it's something to do with being able to target different versions of the CLR, and/or to control options such as strong-naming, PDB generation, etc. without having to bog scalac-net down with truckloads of options.

Still makes for a sucky user experience, though.

January 2, 2009 in Software

January 01, 2009

Getting started with Scala on .NET

The Scala language has been getting a lot of buzz lately, but I hadn't realised until recently that Scala can compile to the .NET CLR as well as to the Java virtual machine.

Unfortunately, CLR support lags behind JVM support and the CLR support documentation on the Scala Web site leaves out a few steps.  So here is a summary of what you'll need to do to compile a Scala program to target .NET.  These notes are written with reference to Scala 2.7.2; see also the remarks at the end.

1. Ensure your machine's copy of Java is up to date.

2. Download Scala and unzip it to your folder of choice.

3. Install the .NET support.  To do this, open a command prompt in the Scala bin directory and type sbaz install scala-msil.  (sbaz, Scala Bazaars, is a Scala tool for downloading, installing and managing packages, similar to Ruby gems.)  This adds scala-net (the Scala interpreter) and scalac-net (the Scala compiler) to the bin directory, and adds a couple of required .NET assemblies to the lib directory.

4. Write your program.  I'm going to assume it's in a code directory under the Scala directory, in a file called test.scala.

5. Compile your program to MSIL.  To do this, open a command prompt in the Scala directory and run bin\scalac-net code\test.scala.  The output from this will be one or more IL files.  (Even if you have only a single source file you may see multiple IL files.  The compiler seems to emit one IL file per class, plus one for the pot.)  By the way, your firewall may pitch a hissy fit during this stage: this is because scalac-net wants to use the "fast Scala compiler," which runs as a separate process and stays resident so as to avoid having to spin up a JVM every time you compile, and this requires a loopback network connection for the inter-process communication.

6. Compile the MSIL assembly language into a .NET assembly using ilasm.  (Don't look slack-jawed at me.  You heard.)  To do this, open a Windows SDK command prompt, go to the directory containing the MSIL files, and run ilasm code\test.msil code\MyClass.msil (or whatever files Scala has generated).  (This will generate an EXE; provide the /DLL switch to generate a library.)

7. Copy predef.dll and scalaruntime.dll from the Scala lib directory to the directory where you compiled your .NET assembly (in this example, the code directory: copy lib\*.dll code).

8. Run your EXE and experience the Hello Worldy goodness.

It's not actually meant to be quite as horrible as this.  (Update: it is meant to be as horrible as this.)  Steve Gilham reports that, for him, running scalac-net alone sufficed to generate the EXE without all the mucking around with ilasm.  This also works for me using Scala 2.7.1 but not with 2.7.2 or 2.7.3 RC1.  2.7.1 is available from the downloads page via the Previous Versions link.

Note that if you are using 2.7.1 or a fixed future version, scalac-net needs ilasm to be on the path in order to create the EXE directly.  Depending on your setup the Visual Studio command prompt may do the job, or you may need to use the SDK command shell or manually add the .NET Framework directory to the path.  Also note you will still need to copy the Scala DLLs to the target directory: unlike Visual Studio, scalac-net/ilasm does not do this for you.

Okay, so why would you go through all this grief when Visual Studio and C# are sitting there on the nice graphical Start menu promising to love you long time?  Stay tuned.

January 1, 2009 in Software