
August 19, 2008

In case of emergency, use SDK

I think it was New Jersey.  I'm pretty sure it was ASP.NET 1.1.  I know it was a stupidly early hour of the morning, and I know I was jetlagged.  The phone was already ringing, but it wasn't my alarm call.  It was the sales guy.  He'd fired up his laptop to practise the demo we'd come umpty thousands of miles to do, and he was getting nothing but the Yellow Screen of Death.  I woke up very, very quickly.

The error message told me what the problem was.  It was an environmental problem (Active Directory credentials caching, not that it matters), and I knew there was no way of getting the failing component working again without physically crossing the Atlantic.  Fortunately, the component was loaded through configuration, so we could replace the problem component with any other component that implemented the right interface, such as a null object-type component that always returned a happy value.  Just one problem: all we have is a sales laptop.  That means no API documentation, no source code, and no Visual Studio.

What would you do in this situation?  Calling back to head office is an option, but time is short.  Rescheduling the demo might be possible, but then again it might not: and it's taken months to get this far with the customer, so we don't want to blow that because of a half-hour hold-up.  Cursing development's lack of foresight in not having a null component and/or sales' lack of foresight in not practising away from the home network is legitimate and therapeutic, but not actually helpful.

If we can think of the problem as a coding task, of course, it's trivial: write a null object that always returns happy values.  One line of code.  The only difficulty comes from the lack of tools to plumb that one line into the larger system.  So let's restate the problem.  How can we solve the problem with the tools we have, and still have time for breakfast?

We need to do four things: figure out the signature of the interface we have to implement, write the trivial implementation, build the trivial implementation, and replace the real but broken implementation with our trivial one.

How can we figure out the signature of the interface?  web.config tells us the name of the real component, which includes its assembly name.  We can load that assembly into ildasm, which ships with the .NET Framework SDK and was present even on the sales laptop, and inspect the real class and the interface we want to stub out.

How can we write the trivial implementation?  Even sales laptops have Notepad.  It would be a horrible editor to write real code in, but for a null object, all we need is a few usings, a class and method declaration, and "return true;".  We'll probably get a few compiler errors through silly jetlagged typing mistakes, but that's fine, compiler errors are cheap.

How can we build the trivial implementation?  The framework includes csc.exe, a command-line compiler for C#.  The csc help option tells us how to reference the DLL where the interface is defined, specify our source file and send the output to a DLL file.  We may need a bit of trial and error to figure out if we need to reference any other DLLs, but that's fine too, compiler errors are still cheap.

How can we replace the real implementation with our trivial one?  This is SOP: copy our new DLL file to the Web site's bin directory and edit web.config.
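For the curious, the whole job looks roughly like the sketch below.  I no longer have the original, so every name here (ICredentialChecker, IsAuthorised, NullCredentialChecker, the file names) is made up for illustration.  The real interface lived in the product's own assembly; it is declared inline here only so that the sketch compiles by itself.

```csharp
// Hypothetical stand-in for the real interface.  In the story, the real one
// was defined in an existing assembly and its signature was read off with
// ildasm; it is redeclared here only to keep the sketch self-contained.
public interface ICredentialChecker
{
    bool IsAuthorised(string userName);
}

// Null object: implements the interface and always returns a happy value.
public class NullCredentialChecker : ICredentialChecker
{
    public bool IsAuthorised(string userName)
    {
        return true;
    }
}

// Building from a command prompt (file names are assumptions):
//   csc /target:library /out:NullChecker.dll NullChecker.cs
// When the interface comes from an existing assembly, delete the interface
// declaration above and add /reference:<that assembly>.dll to the command.
```

The resulting DLL then goes into the site's bin directory, and the web.config entry that named the real component is changed to name the stub class and its assembly instead.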

Ten minutes of furrowed-browed incantations later, the demo is working like a charm, I get to have breakfast and a much-needed cup of tea after all, Techie Mystique Preferred is up 20 points on the day, and the sales guy owes me vast quantities of beer.

The point of relating all this is not to impress you with my ability to write null objects while in an advanced state of tea and bacon deficiency.  The point is that, as developers, we're not helpless when away from our usual development environments.  Using ildasm, Notepad and csc isn't a pleasant or productive way to develop, but it's better than having a vital sales opportunity go down in flames because you don't have a copy of Visual Studio to hand.

The reason I'm reminded of this is a debugging session that happened recently on the local .NET user group mailing list.  A poster was getting an unhandled exception on a customer machine that she couldn't reproduce on her test rig, and that was bypassing all the diagnostic and logging mechanisms built into her program.  The customer insisted that the fault must be in the software, and demanded that the developer's company resolve it, but was understandably protective of a production machine, and would certainly not allow the developer to install Visual Studio on it.  After many hours of effort, the developer was still stuck in a nasty double bind: the customer was hurling blame and howling for a fix, but the developer couldn't get any information on which to establish the problem or build that fix.

Again, the difficulty was not that the developer had to do anything enormously complicated.  The exception was occurring reliably and during startup; the developer had wrapped every bit of user code in diagnostics, so any problems had to be happening before the program entered user code, and there couldn't be a whole lot happening there.  So just catching the exception would almost certainly be enough to track down the fault.  The difficulty was that the customer would not let the developer use her normal tools to do so.

When you look at the situation in this light, it suggests an obvious alternative plan: is there another tool that the customer will let you use?  The .NET Framework SDK includes two debuggers, cordbg and mdbg.  A little experimentation determines that cordbg requires only a couple of DLLs to work, and can be xcopy deployed.  No need for admin rights, and no permanent footprint on the target machine.

It turned out the customer was willing to sanction this.  Armed with cordbg, it took the developer a matter of minutes -- most of which was having to stop and read the cordbg help screen -- to track the fault down to a missing XML end tag where the customer had edited the configuration file, and her boss a matter of seconds to make out a colossal and gleeful invoice.  (Great quote from James Hippolite: "Charge like a wounded bull.")

Again, the reason for relating this is not to celebrate anyone's cleverness in tracking down a fault or to blame the customer for making a mistake.  Nor is it to suggest that anybody should be using cordbg as their debugger of choice.  It's to illustrate again the value of being able to survive away from your usual tools.

Once in a while, you'll have a real business problem to solve, and for whatever reason you won't be able to use your usual tools to solve it.  This won't happen often, but when it does, knowing that the SDK tools exist, and having the confidence to try them out, will make the difference between success and failure.  It's the classic "rich vs. reach" argument: IDEs like Visual Studio are powerful but hard to deploy; SDK tools are relatively weak, but you can use them almost anywhere.

Is it worth knowing what the SDK tools are or how to use them?  Not in any great detail.  The stories above were more than three years apart, and involved different tools.  If I'd put serious time into learning cordbg, csc and/or ildasm, I wouldn't have got much of a return on my investment.  But I didn't have to.  In the unhandled exception example, I knew there must be a debugger in the SDK, but I couldn't even remember what it was called, let alone how to use it.  All it took to correct that was a quick look at the SDK directory and a read through a couple of help screens.  In the null object example, I couldn't be sure that my Notepad-written code would compile, or that I'd remembered all the DLLs I had to reference.  (Au contraire.)  But I was pretty sure that csc would give me useful clues as to where I'd gone wrong.  Nobody uses the SDK tools day in and day out; consequently, they're designed for occasional usage, even improvisational usage.

You don't want to be having to improvise.  In the first example, we should have done more testing away from the network, or had a null object ready.  In the second example, maybe they should have had commercial arrangements in place for debugging access in case of production errors.  Easy to say with hindsight.  But the reality is that sometimes you're in a situation where improvisation is forced upon you.  And when that happens, having the confidence to crack open the SDK can make all the difference.





Luxury, when I were a lad twere a matter of faxing people what to type into edlin to update configuration files.

Posted by: Harvey Pengwyn at Aug 19, 2008 11:35:57 PM