December 31, 2008

Somebody set us up the brain

Mash asks that I post more about brains.  All I have offhand, however, is an illustration of the dangers of being at the back of the queue when they were handed out.

[Image: Slats]


December 27, 2008

C# covariance and contravariance by example

One of the new features of C# 4.0 is generic covariance and contravariance.

Admit it: your heart rate just went up.  This is about the most intimidating-sounding, beardy-academic feature to come into C# since they stopped calling it "C Octothorpe."  (Which Windows Live Writer wants me to change to "C Clodhopper."  Unusual dictionary you have there, chaps.)  You feel that at any moment Philip Wadler is going to spring out of a bush and explain to you that it refers to contravariant functors on the poset category of types.  And then he'll make you sit an exam on it.  And then you'll realise that you came out without any trousers on and you're being hunted for sport by a jar of marmalade.

In fact, covariance and contravariance are scary terms for a very simple and familiar concept.

Imagine you have the following function:

void Apply(Transform t) { ... }

When you call this function, you don't have to pass a Transform.  You can also pass a reference of any type derived from Transform -- say, RotateTransform or ScaleTransform:

Transform t = new MatrixTransform();  // any Transform-typed reference
Apply(t);  // okay
RotateTransform rt = new RotateTransform();
Apply(rt);  // also okay

Why is this okay?  To the C# compiler, it's okay because RotateTransform derives from Transform.  But conceptually, the reason it's okay is that if Apply can deal with any arbitrary Transform, it can certainly deal with a RotateTransform.

But now suppose Apply looks like this:

void Apply(IEnumerable<Transform> ts) { ... }

By the same logic, when you call this function, you don't have to pass something typed as IEnumerable<Transform>.  The C# compiler knows it's okay to pass anything that derives from (implements) IEnumerable<Transform> -- say, List<Transform> or ReadOnlyCollection<Transform>.

But after this point, what you should conceptually be able to do diverges from what the C# compiler will let you do.  Conceptually, you should also be able to pass an IEnumerable<RotateTransform>:

IEnumerable<RotateTransform> rts = new List<RotateTransform>();
Apply(rts);

After all, if Apply can deal with a sequence of arbitrary Transforms, it can surely deal with a sequence of RotateTransforms, right?

Right.

But the C# compiler, in C# 3 and earlier, doesn't see it that way.  Although you and I can work out that IEnumerable<RotateTransform> is compatible with IEnumerable<Transform>, it doesn't derive from IEnumerable<Transform>.  So the C# compiler goes into fits of CS1503 errors and flounces off in a huff.

In summary, the C# compiler understands that it's okay to vary the type of an argument, but doesn't understand that it's okay to vary the type of a generic type parameter.  And generic variance just means fixing that: teaching the compiler that it is okay to pass an IEnumerable<RotateTransform> to a function that expects an IEnumerable<Transform>.

Hang on, though.  The feature is called covariance and contravariance.  Why do we need two names for this stuff?  Isn't it just inheritance?  Let's look in more detail.

When you vary the type of an argument, you can only vary it in the direction of more derived.  Passing a RotateTransform to a function that expects a Transform is okay.  Passing an Object to a function that expects a Transform is not okay.  This is probably so ingrained you don't even have to think about it.

When you vary the type of the generic type parameter to IEnumerable<Transform>, the same rule applies.  A function that can deal with a sequence of Transforms can deal with a sequence of RotateTransforms, but not with a sequence of arbitrary objects.  Same rule, right?

Right.  But only because of a specific characteristic of IEnumerable<T>.  In other cases, it turns out that the rule has to be the other way round: you can only vary the type parameter in the direction of less derived.

For example, suppose we have a function that takes an IComparer<T>:

Compare(IComparer<Transform> c) { ... }

If we use the IEnumerable rule and call this function with an IComparer<RotateTransform>, we have a problem.  The function expects to be able to use the IComparer to compare arbitrary Transform objects.  If it decides to compare a ScaleTransform and a TranslateTransform, our IComparer<RotateTransform> will be dreadfully embarrassed.  We can't use derived types after all.

On the other hand, suppose we call this function with an IComparer<object>.  How will it cope?  Very nicely, thank you.  IComparer<object> can compare arbitrary objects, so it can easily cope with the specific requirement of comparing Transforms.  So we can pass a base type in place of the expected type.
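Here it is in code.  (A sketch: Comparer<T>.Default stands in for whatever real comparer implementations you have to hand.)

IComparer<RotateTransform> rtc = Comparer<RotateTransform>.Default;
Compare(rtc);  // not allowed: Compare might feed rtc a ScaleTransform

IComparer<object> oc = Comparer<object>.Default;
Compare(oc);   // allowed in C# 4: oc can compare anything, Transforms included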

So sometimes the rule is that we can only vary in the direction of more derived, and sometimes the rule is that we can only vary in the direction of less derived (more base).  How do we -- and the C# compiler -- know which rule applies in any given case?

The answer -- simplifying somewhat -- is that it depends on whether the generic type parameter appears in output or input positions.

In IEnumerable<Transform>, Transform appears in an output position.  (It appears in the return value of GetEnumerator.)  Now if Transform appears in an output position, then any user of the generic type -- such as the Apply method -- expects to be receiving Transforms, and knows how to deal with them.  So it can certainly deal with derived types such as RotateTransform.  Moreover, Transform appears only in an output position.  So the Apply method can't bust out a ScaleTransform and try to get our RotateTransform-specific implementation to accept it: there's no in-parameter through which Apply can try to feed us the ScaleTransform.

In IComparer<Transform>, Transform appears in input positions.  (It appears as the inputs to the Compare method.)  Now if Transform appears in an input position, then a user of the generic type -- such as the Compare method -- is going to give us Transforms, and expect us to deal with them.  So we need to be able to deal with Transforms at least, but if we can deal with more things -- i.e. a base type -- then that's not going to do any harm.  Moreover, Transform appears only in an input position.  So the IComparer implementation can't return a base type instance to its user: there's no out-parameter or return value through which we could sneak out an Object which the Compare method wouldn't be expecting.

So the rule is: if a type parameter appears only in an output position, you can vary it in the more derived direction, and if a type parameter appears only in an input position, you can vary it in the less derived (base type) direction.
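You can see the positions directly in the shapes of the two interfaces (simplified here to their essential members):

interface IEnumerable<T>
{
  IEnumerator<T> GetEnumerator();  // T occurs only in the return type: output
}

interface IComparer<T>
{
  int Compare(T x, T y);  // T occurs only in the parameters: input
}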

And in fact this is the terminology that gets used in the C# 4 language.  The interfaces we've been discussing would now have the following signatures:

interface IEnumerable<out T> { ... }
interface IComparer<in T> { ... }

The annotations tell the compiler that the annotated parameter appears only in output or input positions, as appropriate.  When defining the generic type, the compiler verifies that this is indeed the case.  When performing type checking, the compiler allows variance of the type parameter up or down the class hierarchy according to how the parameter is annotated.
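For example, if we declare variant interfaces of our own (made-up names, purely for illustration), the compiler holds us to the annotations:

interface ISource<out T>     // legal: T appears only in output positions
{
  T Next();
}

interface IBroken<out T>     // illegal: T appears in an input position
{
  void Accept(T item);       // error CS1961
}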

Note, incidentally, that these annotations are per type parameter, and different parameters can have different annotations.  For example, in Converter<TInput, TOutput>, TInput appears only in input positions, and TOutput -- well, you can guess where TOutput appears.  So suppose we have a method like this:

void Parse(Converter<string, Transform> c) { ... }

We could give it a Converter<object, Transform> because a converter that can cope with arbitrary objects will eat strings for breakfast.  We could give it a Converter<string, RotateTransform> because if the Parse method is braced to get an arbitrary Transform back then it will be perfectly happy to get a RotateTransform.  And of course we can vary both type parameters and give it a Converter<object, RotateTransform>.
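In code, with the C# 4 versions of these types (in .NET 4, Converter is declared as Converter<in TInput, out TOutput>):

Converter<object, Transform> anyToTransform = o => new MatrixTransform();
Parse(anyToTransform);   // okay: TInput varied in the less derived direction

Converter<string, RotateTransform> parseRotation = s => new RotateTransform();
Parse(parseRotation);    // okay: TOutput varied in the more derived direction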

Note also that if a type parameter appears in both input and output positions, it can't be varied in either direction.  Consider a method that takes an IList<Transform>:

void MysteryFunc(IList<Transform> ts) { ... }

Can we safely pass it an IList<object>?  No, because its implementation might look like this:

void MysteryFunc(IList<Transform> ts)
{
  Transform t = ts[0];
}

If ts is a List<object> and ts[0] happens to be a Llama, then MysteryFunc will rapidly get a lot more mysterious.  So how about the other direction?  Can we safely pass it an IList<RotateTransform>?  Once again the answer is no:

void MysteryFunc(IList<Transform> ts)
{
  ts.Add(new ScaleTransform());
}

For what it's worth, if you use arrays, C# does allow you to vary the array type in the "more derived" direction, for example passing a string[] where an object[] is expected.  As you now know, this would be okay if the array element type appeared only in output positions, but this is not the case with arrays, which leads to trouble:

void PimpMyArray(object[] objs)
{
  objs[0] = new Llama();
}

PimpMyArray(new string[1]);

The compiler is happy.  The program, and the Llama, experience the ignominious fate of an ArrayTypeMismatchException.

Very well, one last detail.  I remarked that Transform appeared only in an output position of IEnumerable<Transform>.  Conceptually, that's true: if you have an IEnumerable<Transform>, you can only get Transforms out, you can't put them in.  Syntactically, it's a complete lie.  What appears in an output position is actually IEnumerator<Transform>.  The reason I was able to get away with this cheat is that, in IEnumerator<Transform>, Transform also appears only in output positions.  (It's the type of the Current property, and Current is read-only.)  An output of an output is going to be an output.  If we had an IComparerFactory<T> interface where an IComparer<T> appeared as an output, bearing in mind that in IComparer T appears as an input, things would get messier.  At this point my head for one starts spinning and I enter a bizarre world where I briefly think I understand Haskell monads, and then that damn jar of marmalade comes after me again.  See Eric Lippert's helpfully named article Higher Order Functions Hurt My Brain if you want to know what happens in this kind of situation.
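For the record, though, the rule composes the way you might fear: an input of an output is an input.  Since T is an input to IComparer<T>, and the IComparer<T> is an output of the factory, T ends up being an input of the factory overall, so the factory is contravariant.  (IComparerFactory is made up, purely for illustration.)

interface IComparerFactory<in T>   // 'in', even though IComparer<T> is an output
{
  IComparer<T> CreateComparer();
}

An IComparerFactory<Transform> can stand in for an IComparerFactory<RotateTransform>, but not the other way round.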

The scary terms?  If you care, covariance refers to the output case, and contravariance to the input case.  My mnemonic for these is that covariance goes in the same direction as normal instance argument type variance, and contravariance goes in the opposite ("contra") direction.  Eric Lippert explains the formal terminology and the type-theoretical underpinnings in part one of his eleven-part (so far) series.

But from the strictly pragmatic point of view, all you need to know is that "covariance and contravariance" means that you can now pass inexact generic types when it's safe to do so, just as you can pass inexact argument types when it's safe to do so.  And that's not too scary at all.


December 21, 2008

ASP.NET Dynamic Data and custom sources

ASP.NET Dynamic Data comes with two data sources: LINQ to SQL and the Entity Framework.  But what if you want to create a Dynamic Data site that uses a different data access technology?  It turns out that there are two things you need to implement, and one change you need to make to the default template.

One prerequisite is that the data access technology must support LINQ, or at least a subset thereof.  Specifically, Dynamic Data creates LINQ queries to select individual records or to filter tables by foreign key or value.  So your platform's LINQ implementation will need basic "where" support but should not need joins or grouping.
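To give a flavour, the queries Dynamic Data composes over your IQueryable are roughly of this shape (illustrative only -- the names here are invented, and the real queries are built as expression trees over the non-generic IQueryable):

// Selecting an individual record by key:
Product product = products.Where(p => p.ProductId == selectedId).SingleOrDefault();

// Filtering a table by foreign key:
IQueryable<Track> albumTracks = tracks.Where(t => t.AlbumId == albumId);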

With that prerequisite in place, what do you need to do over and above having a LINQ-capable data source?

1. Implement a model provider stack.

Dynamic Data depends on a set of providers to tell it about the model, the tables, the columns and the associations between tables.  The provider stack is also partially responsible for creating query objects.  So you'll need to implement the following four classes:

DataModelProvider.  You need to do two things here: create a list of TableProviders, and create "context objects" on demand.  The context object is the "unit of work" for the Dynamic Data site, analogous to a LINQ to SQL DataContext.

TableProvider.  Again, two main things to do here: create a list of ColumnProviders, and create a query object.  (You'll also need to set up some basic properties like name and data type.)  The query object must implement IQueryable, because this is what Dynamic Data is going to feed its LINQ queries to.

A typical pattern here is for the context object type to expose a number of IQueryable properties; the DataModelProvider returns a TableProvider for each such property, and each TableProvider's query object is the value of the corresponding property on the context object instance.  (This is how LINQ to SQL works, with each Table being an IQueryable property of the DataContext.  Strongly typed units of work in LINQ to LightSpeed also fit this pattern.)  For example, the context type contains a Products property of type IQueryable<Product>, giving rise to a Product TableProvider, whose query object for a given context is context.Products.  This is probably easier to show than to describe:

public class MyDataModelProvider : DataModelProvider
{
  private Type _contextType;  // the unit-of-work type, set by whoever constructs the provider
  private readonly List<TableProvider> _tableProviders = new List<TableProvider>();

  private void InitialiseTables()
  {
    // One TableProvider per IQueryable property on the context type
    foreach (PropertyInfo property in _contextType.GetProperties()
      .Where(p => typeof(IQueryable).IsAssignableFrom(p.PropertyType)))
    {
      _tableProviders.Add(new MyTableProvider(this, property));
    }
  }
}

public class MyTableProvider : TableProvider
{
  private readonly PropertyInfo _property;

  public MyTableProvider(MyDataModelProvider model, PropertyInfo property)
    : base(model)
  {
    _property = property;
  }

  public override IQueryable GetQuery(object context)
  {
    // The query object is the value of the IQueryable property on this unit of work
    return (IQueryable)_property.GetValue(context, null);
  }
}
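The other half of the DataModelProvider contract, creating context objects on demand, can live alongside InitialiseTables (assuming the context type has a parameterless constructor):

public override object CreateContext()
{
  // A fresh unit of work, analogous to newing up a DataContext
  return Activator.CreateInstance(_contextType);
}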

ColumnProvider.  Mostly you just need to set up the basic properties like name and data type.  The only complexity comes when you have a column or property that represents an association.  In this case you also need to create an AssociationProvider.
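A minimal ColumnProvider can be sketched like this (the base class properties have protected setters, so a derived provider can assign them directly):

public class MyColumnProvider : ColumnProvider
{
  public MyColumnProvider(MyTableProvider table, PropertyInfo property)
    : base(table)
  {
    Name = property.Name;
    ColumnType = property.PropertyType;
    Nullable = !property.PropertyType.IsValueType;
    // For association properties, also create an AssociationProvider
    // and assign it to the Association property
  }
}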

AssociationProvider.  These can be a bit confusing and tricky to implement: firstly, it's not always obvious from the documentation what properties you need to set and to what values, and secondly, you need some way of identifying reverse associations.

Here are the values you need to set for the AssociationProvider base class properties:

FromColumn: the ColumnProvider for which the AssociationProvider is being created.

Direction: as appropriate.

ToTable: the TableProvider corresponding to the target type of the association (the singular type in the case of a one-to-many association).

ToColumn: the ColumnProvider in the ToTable corresponding to the reverse association.

ForeignKeyNames: in a many-to-one association, the name of the foreign key column underlying the relationship.  You don't need to set this property for a one-to-many association.

Let's look at an example.  Consider an association between Albums and Tracks.  Album has a Tracks property which is a collection of Tracks, and Track has an Album property of type Album.  Then the ColumnProvider for Album.Tracks should have an Association whose provider looks like this:

  • Direction: OneToMany
  • FromColumn: the ColumnProvider for Album.Tracks
  • ToTable: the TableProvider for Track
  • ToColumn: the ColumnProvider for Track.Album

And the ColumnProvider for Track.Album should have an Association whose provider looks like this:

  • Direction: ManyToOne
  • FromColumn: the ColumnProvider for Track.Album
  • ToTable: the TableProvider for Album
  • ToColumn: the ColumnProvider for Album.Tracks
  • ForeignKeyNames: { "AlbumId" }
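In code, the many-to-one side might be wired up like this (a sketch: FindTable and FindColumn are hypothetical helpers for looking up providers built earlier in the model):

public class MyAssociationProvider : AssociationProvider
{
  public MyAssociationProvider(ColumnProvider fromColumn)  // e.g. Track.Album
  {
    FromColumn = fromColumn;
    Direction = AssociationDirection.ManyToOne;
    ToTable = FindTable("Album");
    ToColumn = FindColumn(ToTable, "Tracks");  // the reverse association
    ForeignKeyNames = new ReadOnlyCollection<string>(new[] { "AlbumId" });
  }
}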

Determining the reverse associations programmatically can be tricky depending on how much metadata the platform exposes.  For example, LightSpeed's metamodel is internal, and the LightSpeed ReverseAssociationAttribute goes on the backing field of an association, not on the property that the ColumnProvider would typically have access to.  (The solution in LightSpeed's case is to locate the backing field by taking advantage of LightSpeed's predictable naming conventions.  But other platforms will demand other solutions.)

2. Implement a custom data source control.

Actually, all you really want to do is implement a custom data source view, but the only way to get Dynamic Data to use one is via a custom data source control.

Your custom data source control can derive from LinqDataSource and just override CreateView to return an instance of your custom data source view.
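That part is pleasingly small.  (A sketch -- assuming the standard LinqDataSourceView constructor and the usual "DefaultView" view name.)

public class MyDataSource : LinqDataSource
{
  protected override LinqDataSourceView CreateView()
  {
    return new MyDataSourceView(this, "DefaultView", Context);
  }
}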

Your custom data source view requires a little bit more effort.  Again, start by deriving from LinqDataSourceView.  This offers some tempting virtual methods below the level of ExecuteInsert, ExecuteUpdate and ExecuteDelete, such as UpdateDataObject, so you would hope you could just override those and piggyback off the base class Execute methods for tedious things like event flow.  Unfortunately, unless your data access technology is very like LINQ to SQL, there's a good chance this won't work, and you'll have to re-implement the Execute methods from the ground up.  The good news is that these are still pretty routine: the main trick is that you'll need the context object from the model provider (because this is your unit of work), and you get this by calling OnContextCreating().
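For example, fetching the unit of work inside the view looks something like this (a sketch: the event args carry the created context back in their ObjectInstance property):

public class MyDataSourceView : LinqDataSourceView
{
  public MyDataSourceView(LinqDataSource owner, string name, HttpContext context)
    : base(owner, name, context) { }

  private object GetContext()
  {
    // Raises ContextCreating, which ultimately asks the model
    // provider's CreateContext for a unit of work
    LinqDataSourceContextEventArgs e = new LinqDataSourceContextEventArgs();
    OnContextCreating(e);
    return e.ObjectInstance;
  }
}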

3. Change the site to use the custom data source.

The Visual Studio template for Dynamic Data Web sites specifies an <asp:LinqDataSource> on each of its template pages.  You'll need to change each of these to refer to your custom data source instead.  You don't need to change the content or attributes, just the control type.

You shouldn't need to make any other changes to the pages, but when specifying the data source in Global.asax.cs, you will of course need to pass an instance of your DataModelProvider instead of a LINQ to SQL DataContext.
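For example (a sketch: MyContext is your unit-of-work type, and this assumes MyDataModelProvider has grown a constructor that takes the context type):

// In Global.asax.cs:
MetaModel model = new MetaModel();
model.RegisterContext(
    new MyDataModelProvider(typeof(MyContext)),   // instead of a DataContext type
    new ContextConfiguration { ScaffoldAllTables = true });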
