Why iPhone is a gigantic piece of crap

Yes, this post will be a rant.  Sorry.

It’s become clear that no one at Apple actually bothers to test the iPhone with realistic scenarios, nor are they in the least bit interested in developing a platform on which solutions can be built.  Their opinion that every app needs to live in its own little world, isolated completely from the operating system or other applications, has basically ruined any chance they have at maintaining their market share.  I’m not in the least bit surprised that Android is up 13% in the last year, and iOS is down 7%.

The final straw for me is their complete and utter lack of any sort of turn-by-turn driving directions that actually work.  I don’t mean an app that does this, I mean providing, as a phone, a solution that will get you to the place you want to go – a solution that integrates in with the phone as a whole.

I first bought the iPhone 4S the day it came out.  I had never before owned an iPhone, and was excited about what it could do.  After all, it was the best selling smart phone ever, and pioneered the whole application platform thing, right?  First, to my surprise, it came with absolutely no GPS software at all.  I was forced to spend $60 bucks or so on the TomTom app, which, by itself worked great.  However, the built in map software was the Apple version, with no turn-by-turn guidance.  Since iOS is a completely closed platform, TomTom cannot say “Hey, I should be the default maps app so load me whenever you need to get somewhere!”  This would have been nice.  A real platform would have exactly this.

Instead, if you click on an address in an email, or on a location on a Facebook event, it would load the crap Apple Maps app which forced you to click “next” every time you completed a turn.  You couldn’t even copy and paste, since the TomTom app made you type in a street, then number, then city then state.  It could not parse a pasted whole address.  I would usually have to write down the address on a PostIt note, then re-type it in to the TomTom app.  What a complete joke.

Along comes the new Apple Maps!  Yay!  Enter all sorts of media buzz here.

This version of Maps finally had turn-by-turn directions.  Despite what you read in the news, in general it works fairly well.  I won’t blame Apple for not immediately having the flawless data it took Google years to build.  So, now we have a fairly good maps app that will launch by default from other apps and links.

Of course, this brings me to my next major gripe about the iPhone.  You can’t charge the phone at the same time you use GPS.  No, seriously, you can’t.  You don’t believe me, but it’s true.

My last car, a Jetta TDI, had a little iPhone connector built in to the center armrest to charge iPhones and iPods.  You could also use it to listen to music on the car stereo.  However, when you plug it in, the phone’s built-in speaker is automatically muted.  So, if you’re using GPS, don’t expect to actually hear anything while your phone is being charged.  And, of course, if you don’t charge your phone, you’ll have about an hour of battery life while the GPS is running.  This appears to be a behavior of the phone that applies to every app, as none of TomTom, Google Maps, or Apple Maps will make a sound while the phone is charging.  Even if you turn the stereo to “Aux In”, though you’ll be able to hear music on the phone, you won’t get any GPS turn-by-turn directions.

Recently, I got a new car – a 2013 Subaru Outback.  This car doesn’t have an iPhone connector, but it does have Bluetooth.  The dealer even helped me pair my iPhone to the car, so I could talk on the phone while driving.  However, if you have Bluetooth turned on, you won’t hear a peep out of the phone while using GPS; even if you have it turned to Aux-Input.  I’ve given up and decided to keep Bluetooth off, since I use GPS so often in my car.  I’ve tried looking at my phone while driving, but I miss turns every time.  Plus, it’s dangerous.

This blinding limitation irritated me to no end while in San Francisco last month.  I rented a Chevy Cruze, which had a USB connector for a phone.  I was staying with friends and family, most over an hour from the conference, which forced me to drive all over the Bay Area.  As I’m not familiar with the Bay Area, I was using GPS the entire time.  And, as with every other car, I of course could not charge the phone and hear the GPS at the same time.  This caused me to show up to the conference with a phone with about a 50% charge, which usually didn’t even last the day.  By dinner time it was totally dead, and I had no choice but to charge it on long stretches of freeway, then unplug it when I got near the restaurant.

Luckily, I was meeting up with a friend who works at Apple.  He claimed there was a solution for this, and that you can configure the audio output when using Bluetooth.

Turns out, on Apple Maps, when the phone is paired you’ll see a little “Bluetooth” logo on the bottom right corner of the map.  If you press it, you can select whether the audio will be sent over Bluetooth or to the phone’s built-in speaker.  This works okay in theory, however, get this – this is an app feature, not a system setting!  That’s right, this same setting does not exist on Google Maps or on TomTom.  Since Apple insists that every app needs to be an island, they expect every app to implement this “Do you want to hear your phone?” feature themselves.  To make this even more ridiculous, this feature only applies to Bluetooth pairing.  If I want to plug the phone in to a USB connection, which would allow me to charge the phone, there’s of course no setting for that.

Now I’m pretty much forced to use Apple Maps because it’s the only program that seems to be written correctly.  So, the other day I was headed to an event.  I had the event details stored as a Facebook event.  I went to the Facebook app, clicked the event, and clicked on the directions.  The Facebook app now actually opens the Google Maps website!  That’s right, the web site – not even the app.  I guess Facebook got so annoyed with this, they decided to bypass the API completely and just launch the browser.  The website asks if I want to use my current location, to which I said yes.  It also has a button that says “Always Open in the Google Maps App”.  First, this always button seems to mean always for that exact URL, meaning the location I happen to be headed to right then.  Second, when it loads the Google Maps app, it only remembers the destination.  I have to then type in my current location, at which point it just prints out the entire list of directions; no turn-by-turn.  If I exit the app and go back in, then I can see that destination in my history, select it, and use turn-by-turn.  None of this is even slightly usable while you’re driving, so I have to get this all set up while I’m still parked.  This is a hassle if I just want to use GPS for the last few minutes of the drive, and don’t want GPS yapping at me for the first 30 minutes.  Lately, I’ve just gone back to writing down the address on a PostIt note, opening up Google Maps, and typing it in by hand.  That’s been the only thing that consistently works on this platform.

The fact that absolutely nothing in the iOS world works together, apps cannot play off each other to deliver real solutions for realistic scenarios, Apple doesn’t allow an app to be a “default app” for web browsing, mapping, making calls, texting, etc, means that iOS will never deliver anything more powerful than its most powerful single app.

My next phone will be an Android.

 

Using ENUM types with PostgreSQL and Castle ActiveRecord

One thing I really love about PostgreSQL is ENUM types, which basically allow you to create enumerations as datatypes and use them just as you would any intrinsic type.  For example, I have an ENUM called UnitsEnum that allows me to represent a unit of measurement (cup, teaspoon, gallon, etc).  I then use this data type in various database columns, functions, views, whatever.

These data types are simple to create with the following command:

CREATE TYPE UnitsEnum AS ENUM ('Unit', 'Teaspoon', 'Tablespoon', 'FluidOunce', 'Cup', 'Pint', 'Quart', 'Gallon', 'Gram', 'Ounce', 'Pound');

From that point on, you can use UnitsEnum just as you would a varchar or an int or a Boolean datatype, using it to define column definitions, using it as a function parameter or return value, etc. For all practical purposes, it’s now built into Postgres:

CREATE TABLE RecipeIngredients
(
  -- Stuff here
  Unit UnitsEnum NOT NULL --Unit column of type UnitsEnum
);

select count(1) from RecipeIngredients where Unit = 'Teaspoon';
--Yay!

select count(1) from RecipeIngredients where Unit = 'Crapload';
--Error is thrown

As far as the user is concerned, the data type appears to be a string that simply errors out when the value is not within a certain allowed set, and in fact strings can implicitly convert to an enum (but not numeric types which I’ve found odd).

Using an ENUM as a column type would be somewhat analogous to creating a text column, but creating a foreign key constraint on a table of measurements which contained the strings in the enumeration above. However, on the heap table the ENUM is actually stored in its numeric representation which could possibly make binary searches a bit faster. One could argue that you could also do the same by using a numeric data type to hold enumerations, and then set CHECK constraints to limit the values. This would be equally nice, however I really like being able to see the string representations of these types when I’m performing ad-hoc queries on my database. This is the best of both worlds.

While this is awesome from a database design perspective, what would be even more awesome is bridging that data type with the business logic code. In other words, I want the models in my ORM to use C# enums that map perfectly to the defined Postgres type.

Luckily, that’s possible to do with Castle ActiveRecord, and actually quite easy.

First, we’d need a C# enum. It would need to match the types in the Postgres ENUM (the names, that is, as the numeric values don’t actually matter here):

public enum Units
{
   //Individual items
   Unit = 0,

   //Volume
   Teaspoon = 1,
   Tablespoon = 2,
   FluidOunce = 3,
   Cup = 4,
   Pint = 5,
   Quart = 6,
   Gallon = 7,

   //Weight
   Gram = 8,
   Ounce = 9,
   Pound = 10,
}

Now, let’s use this in a model called RecipeIngredient, which represents a usage of an ingredient within a certain recipe:

[ActiveRecord("RecipeIngredients")]
public class RecipeIngredient
   : ActiveRecordLinqBase<RecipeIngredient>
{
   // Various column mappings go here
}

First, we’d add a new column mapping called Unit:

private Units unit;

[Property(SqlType = "UnitsEnum", ColumnType = "KPCServer.DB.UnitsMapper, Core")]
public Units Unit
{
   get { return unit; }
   set { unit = value; }
}

We now have a Unit property of type Units, our C# enumeration of valid units of measurement. So, now the model is restricted to only valid units of measurement, using all the type safety of C#.

There are a few things to look at on the [Property] attribute for this mapping. First is SqlType. This property defines what type in SQL will be used for this column. This will cause ActiveRecord to use this type when it’s writing database creation scripts and whatnot.

Next, the ColumnType attribute specifies a managed type that’s responsible for type mapping information used by NHibernate. I’m using KPCServer.DB.UnitsMapper, in the Core.dll assembly. Unfortunately, a mapper class can only represent a single ENUM type. Thus, if you have a bunch of enums, you’ll need to repeat a lot of this logic. To DRY this out a bit, I’ve created a helper base class called PgEnumMapper:

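using System.Data; // for DbType

// Maps a C# enum to its string name and reports a generic SQL type,
// so the value can be sent to a Postgres ENUM column.
public class PgEnumMapper<T> : NHibernate.Type.EnumStringType<T>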
{
   public override NHibernate.SqlTypes.SqlType SqlType
   {
      get
      {
         return new NHibernate.SqlTypes.SqlType(DbType.Object);
      }
   }
}

The reason we need to do this is that, by default, NHibernate will convert C# enums to their numeric representations (which would be much more database agnostic), attempting to store their value as a number in your database. Postgres will not cast numeric values to an ENUM type, so you’ll get an error trying to do this. The mapper above says, “No, I need this to be a string in the SQL you generate.”
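To make the numeric-versus-string distinction concrete, here’s a minimal sketch in plain C# with no NHibernate involved. It only shows the two representations an enum value can take; my understanding is that EnumStringType works with the name form, but treat that as an assumption rather than a description of its internals. The Units enum here is a trimmed copy of the one above.

```csharp
using System;

// Trimmed copy of the Units enum from earlier in this post.
public enum Units
{
    Unit = 0,
    Teaspoon = 1,
    Tablespoon = 2,
}

public static class EnumRoundTrip
{
    public static void Main()
    {
        Units unit = Units.Teaspoon;

        // The numeric form, which Postgres will not cast to an ENUM type.
        Console.WriteLine((int)unit);

        // The name form, which matches the labels in the Postgres ENUM.
        Console.WriteLine(unit.ToString());

        // Going the other way: parse a name back into the C# enum.
        Units parsed = (Units)Enum.Parse(typeof(Units), "Tablespoon");
        Console.WriteLine(parsed == Units.Tablespoon);
    }
}
```

The catch, of course, is that the C# names and the Postgres labels have to match exactly, including case.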

Now, we can use this PgEnumMapper for each of our enums:

public class UnitsMapper : PgEnumMapper<Units>
{
}

No need to add anything to UnitsMapper, it’s perfectly fine the way it is. This is the class the ColumnType in the [Property] mapping above points to. You’d want to create a PgEnumMapper derived class for each C# enum you want to support in your ORM of course.

That’s all there is to it! You can now use ENUM types in Postgres and magically have them map to C# enums. Wow I love Postgres.

WCF and ActiveRecord, let’s be friends!

For the coders out there that follow this blog, you’ll probably know the KitchenPC back-end is completely built on Castle ActiveRecord, an easy to use ORM for .NET built on NHibernate.  I’m a huge fan, even though the project is no longer active and most people are probably using Fluent NHibernate or Microsoft’s own ORM, the Entity Framework (blech).  I also happen to be a huge fan of the ActiveRecord pattern in general, so I’m sticking with it and you can’t make me change.

Getting WCF to play nicely with ActiveRecord turned out to be somewhat of a time vampire; not because the solution is hugely complicated, but because it required me to dig in to the inner workings of both WCF and ActiveRecord.  I also found almost zero information out there covering integration between these two technologies, as most blog posts and articles focus on NHibernate itself and may or may not apply to ActiveRecord specific implementation details.  Posting questions on StackOverflow also yielded nothing but cricket noises.

For that reason, I decided to write a technical blog post illustrating exactly how to get these guys to be friends.  Turns out, it’s not all that scary!

Assumptions:

First, I will assume the reader is already familiar with Castle ActiveRecord, and knows how to define mappings, initialize the framework in Application_Start, and has the basics down of using the framework.  If not, I’d suggest reading up on Castle ActiveRecord, following a few tutorials, and trying it out first on a simple ASP.NET web site.

I’ll also assume you have some basic knowledge of WCF.  However, if you want to be really well versed in the extensibility story for WCF, I highly suggest reading this article.

ActiveRecord Scopes

NHibernate, for reasons of efficiency, works based on sessions.  To avoid running tons of SQL statements after every line of code you write, the underlying engine will batch things up and run a bunch of statements at once when it’s wise to do so.  It’s also smart enough to know when it needs to re-query data, commit pending transactions first, or invalidate cached data.  SessionScopes can be nested, which means internally they’re represented in a stack of SessionScope objects.

A SessionScope object tracks a collection of pending queries, and commits them to a database (provided everything is kosher) when the object is disposed.  A very simple example of this might be:

   using (new SessionScope())
   {
      Recipe r = Recipe.Find(123);
      r.PrepTime = 60;
      r.Ingredients.Add(new Ingredient(5));

      r.Update();
   }

Here, we find a Recipe object in the database with the ID of 123, set a new prep time for that recipe, then add a new ingredient to the recipe’s ingredient collection. NHibernate, under the covers, is tracking what objects have changed and issuing UPDATE commands through the database provider when SessionScope is disposed.  If we throw an exception halfway through, things we already updated won’t actually get changed, as those UPDATE commands will never be sent.

One thing you’ll notice is that I don’t actually refer to my SessionScope object anywhere. You might be asking yourself how r.Update() knows which session we’re currently in. Well, this is done through an implementation of IThreadScopeInfo. When you call SessionScope.Current, the Castle framework calls upon the configured IThreadScopeInfo which is responsible for finding the current session.

Castle ActiveRecord ships with a few of these. One is called WebThreadScopeInfo and stores the current session stack in the HttpContext.Current.Items[] collection. This means that the session is scoped to each individual HTTP request. Another implementation is HybridWebThreadScopeInfo, which allows you to run sessions in any random ol’ thread, where HttpContext.Current would be null. The HybridWebThreadScopeInfo class will first check HttpContext.Current, and if it’s null, return a session stack keyed to the current thread, creating a new stack if necessary.

This design allows you to call functions which call other functions which call other functions without having to worry about passing sessions and scopes all over the place.

Castle ActiveRecord actually makes this even easier. There exists an HTTP module called SessionScopeWebModule which runs before each HTTP request, creates a new SessionScope for that request, and disposes of it when the HTTP request ends. Using this module, you can write the above code as just:

   Recipe r = Recipe.Find(123);
   r.PrepTime = 60;
   r.Ingredients.Add(new Ingredient(5));

   r.Update();

There’s no need to new up a SessionScope, as one has been created for you by SessionScopeWebModule. Pretty cool, eh?

So why doesn’t this solution work in WCF?

As I pointed out in my previous post, the WCF stack is completely independent of ASP.NET. HTTP modules won’t be run and there’s no HttpContext.Current.Items collection to store the active session stack. For this reason, you’ll need to re-create a few of these solutions in a way that’s compatible with WCF’s architecture.  You can also run in aspNetCompatibilityEnabled mode, but that has its own set of limitations.

Unfortunately, Castle ActiveRecord doesn’t have support for this out of the box, but it offers the extensibility to design a solution similar to the ASP.NET handlers. In fact, I based my code completely off the built in SessionScopeWebModule HTTP module and HybridWebThreadScopeInfo scope info class.

Step 1: Creating a scope for each WCF request

In ASP.NET, this would be done using an httpModule.  In WCF, this is done by implementing a message inspector.  A message inspector, which is an implementation of IDispatchMessageInspector, is a class that can look at a message traveling through the WCF pipeline and do things before and after the message is processed.  It’s dead simple, and has an AfterReceiveRequest method which is called after the request is received and deserialized, and a BeforeSendReply which is called before the reply is assembled.  We can use this to create a new session scope for our WCF operations.

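using Castle.ActiveRecord;             // SessionScope
using System.ServiceModel.Dispatcher;  // IDispatchMessageInspector

public class MyInspector : IDispatchMessageInspector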
{
   public MyInspector()
   {
   }

   public object AfterReceiveRequest(ref System.ServiceModel.Channels.Message request, System.ServiceModel.IClientChannel channel, System.ServiceModel.InstanceContext instanceContext)
   {
      return new SessionScope();
   }

   public void BeforeSendReply(ref System.ServiceModel.Channels.Message reply, object correlationState)
   {
      SessionScope scope = correlationState as SessionScope;

      if (scope != null)
      {
         scope.Dispose();
      }
   }
}

Ok so what’s going on here? When we receive a request, we new up a SessionScope object. The constructor actually adds itself to the current session stack through the configured IThreadScopeInfo, so all that is taken care of for us. This is actually quite similar to what the web module does, only the web module adds that SessionScope object to the HttpContext.Current.Items collection so the scope can be disposed of later.  We don’t need to do that because we take advantage of correlation state.  Any object AfterReceiveRequest returns will be passed in to BeforeSendReply, which is useful for tracking objects in a message inspector through the lifespan of a WCF request.

You’ll notice that we use this concept in BeforeSendReply to grab the reference to the SessionScope created above and dispose of it, thus committing any pending transactions to the database.

This inspector now has to be attached to a WCF service. This can be done in a variety of ways (mostly mucking with your web.config), but perhaps the easiest is by implementing a custom IServiceBehavior. An IServiceBehavior defines a custom behavior for a service, such as adding custom message inspectors. A very cool feature of this is your IServiceBehavior can derive from the .NET Attribute class which allows you to tag a WCF service directly with a behavior, no need to mess around with configuration files. This class is quite simple to implement:

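// Namespaces needed by the unqualified type names below
using System;
using System.ServiceModel;
using System.ServiceModel.Description;
using System.ServiceModel.Dispatcher;

public class MyServiceBehaviorAttribute : Attribute, IServiceBehavior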
{
   public void AddBindingParameters(ServiceDescription serviceDescription, System.ServiceModel.ServiceHostBase serviceHostBase, System.Collections.ObjectModel.Collection<ServiceEndpoint> endpoints, System.ServiceModel.Channels.BindingParameterCollection bindingParameters)
   {
   }

   public void ApplyDispatchBehavior(ServiceDescription serviceDescription, System.ServiceModel.ServiceHostBase serviceHostBase)
   {
      foreach (ChannelDispatcher cDispatcher in serviceHostBase.ChannelDispatchers)
         foreach (EndpointDispatcher eDispatcher in cDispatcher.Endpoints)
            eDispatcher.DispatchRuntime.MessageInspectors.Add(new MyInspector());

   }

   public void Validate(ServiceDescription serviceDescription, ServiceHostBase serviceHostBase)
   {
   }
}

Nothing to really explain here. When the behavior is applied, it goes through all the endpoints that connect to that service and applies our message inspector. You can then tag your WCF service with that behavior:

[MyServiceBehavior]
public class Service1 : IService1
{
   // Your operation contracts here
}

When we have this hooked up, every request to your service will be guaranteed to have its own session scope, so no need to new up a SessionScope object yourself. It’ll work just like ASP.NET using the SessionScopeWebModule module.

Well crap, that won’t actually work…

If you’re smart and/or know a lot about IIS, you’ll see a major problem with this. WCF exhibits thread agility, which means the message inspector (where our SessionScope was created) and the service operation (where we’ll be looking for SessionScope.Current) could be running on different threads!

As I mentioned before, the current SessionScope is resolved through an IThreadScopeInfo implementation. The WebThreadScopeInfo implementation uses HttpContext.Current to track the session stack, providing an immunity against ASP.NET thread agility. The HybridWebThreadScopeInfo implementation attempts to use HttpContext.Current, but if it’s not found it basically creates a stack keyed by the local thread. This kinda works for simple cases, like loading some data into memory within an initialization thread, but probably isn’t very smart for us. If we need to lazy load some property and a session scope had never been created on that thread, we’re going to crash with the exception: “failed to lazily initialize a collection of role: xxx, no session or session was closed”.  This would of course happen intermittently, and only in the middle of the night or on your vacation.  You’ll never be able to reproduce it on your dev box, and it would make you hate the world and yearn for a simpler time before all this technology.
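If the thread-keyed part sounds abstract, here’s a tiny stand-alone sketch of the problem. It uses no ActiveRecord or WCF code at all; ThreadKeyedStack is a made-up stand-in for a scope stack stored in a [ThreadStatic] field, and the string pushed onto it is just a placeholder for a SessionScope.

```csharp
using System;
using System.Collections;
using System.Threading;

// Hypothetical stand-in for a thread-keyed scope stack; not ActiveRecord code.
public static class ThreadKeyedStack
{
    // Each thread gets its own copy of this field.
    [ThreadStatic]
    private static Stack stack;

    public static Stack Current
    {
        get
        {
            if (stack == null)
            {
                stack = new Stack();
            }

            return stack;
        }
    }
}

public static class ThreadAgilityDemo
{
    public static void Main()
    {
        // Pretend this is the message inspector creating a scope.
        ThreadKeyedStack.Current.Push("scope created by inspector");

        // Pretend this is the service operation, running on another thread.
        int countSeenByOtherThread = -1;
        Thread worker = new Thread(() =>
        {
            countSeenByOtherThread = ThreadKeyedStack.Current.Count;
        });
        worker.Start();
        worker.Join();

        Console.WriteLine("Inspector thread sees {0} scope(s)", ThreadKeyedStack.Current.Count);
        Console.WriteLine("Operation thread sees {0} scope(s)", countSeenByOtherThread);
    }
}
```

The first thread sees the scope it pushed, while the second thread gets a brand new, empty stack. That empty stack is exactly the situation in which a lazy load would blow up.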

The solution? Write a super-duper-hybrid session scope info class. One that does it all. This thing of beauty will check the HttpContext.Current if we’re running on an ASP.NET page, check the WCF OperationContext.Current if we’re in a WCF operation, and then fall back to a stack keyed by the thread if all else fails. It will slice, it will dice, you’ll be seeing infomercials for this IThreadScopeInfo. And I won’t charge you three payments of $19.95 to use it. In fact, I’ll give you the code for free.

This implementation is heavily based on the HybridWebThreadScopeInfo implementation, which I recommend skimming over first. I just added a bit of extra logic to handle the WCF case.

public class WcfThreadScopeInfo : AbstractThreadScopeInfo, IWebThreadScopeInfo
{
   const string ActiveRecordCurrentStack = "activerecord.currentstack";

   [ThreadStatic]
   static Stack stack;

   public override Stack CurrentStack
   {
      [MethodImpl(MethodImplOptions.Synchronized)]
      get
      {
         Stack contextstack;

         if (HttpContext.Current != null) //We're running in an ASP.NET context
         {
            contextstack = HttpContext.Current.Items[ActiveRecordCurrentStack] as Stack;
            if (contextstack == null)
            {
               contextstack = new Stack();
               HttpContext.Current.Items[ActiveRecordCurrentStack] = contextstack;
            }

            return contextstack;
         }

         if (OperationContext.Current != null) //We're running in a WCF context
         {
            NHibernateContextManager ctxMgr = OperationContext.Current.InstanceContext.Extensions.Find<NHibernateContextManager>();
            if (ctxMgr == null)
            {
               ctxMgr = new NHibernateContextManager();
               OperationContext.Current.InstanceContext.Extensions.Add(ctxMgr);
            }

            return ctxMgr.ContextStack;
         }

         //Working in some random thread
         if (stack == null)
         {
            stack = new Stack();
         }

         return stack;
      }
   }
}

At first glance, it looks similar to the HybridWebThreadScopeInfo code, because, well, it is. It has a ThreadStatic Stack field, it checks the HttpContext.Current first, blah blah. The difference is the part in the middle which checks OperationContext.Current. This property will have a value if we’re running within the WCF pipeline. An OperationContext is analogous to HttpContext, but only for the WCF world. Similar to HttpContext.Current.Items, we can even store random junk we want to have for the lifespan of the operation. However, rather than just being a simple key/value pair like .Items is, they had to make it a bit more complicated and use something called Extensions.

An extension is any class that implements IExtension<T>. Our extension, of course, needs to store the current SessionScope stack. Implementing it is pretty straightforward.

public class NHibernateContextManager : IExtension<InstanceContext>
{
   public Stack ContextStack { get; private set; }

   public NHibernateContextManager()
   {
      this.ContextStack = new Stack();
   }

   public void Attach(InstanceContext owner)
   {
   }

   public void Detach(InstanceContext owner)
   {
   }
}

You’ll notice IExtension<T> has Attach and Detach methods, which WCF calls when the extension is added to or removed from its owner’s Extensions collection (the InstanceContext, in our case).  We don’t need them to do anything, so we implement them as empty methods to satisfy the interface.

We can configure ActiveRecord to use this awesome new IThreadScopeInfo implementation within web.config:

<activerecord
   isWeb="true"
   threadinfotype="WcfThreadScopeInfo, Website">

Or if you’re using an InPlaceConfigurationSource and initializing ActiveRecord in your global.asax, you can use:

InPlaceConfigurationSource source = new InPlaceConfigurationSource();
// Stuff
source.ThreadScopeInfoImplementation = typeof(WcfThreadScopeInfo);

This will tell ActiveRecord to use WcfThreadScopeInfo to resolve the current session scope any time it needs to know.

In Summary…

Ok so what’s going on now? When a WCF request comes in, the IDispatchMessageInspector message inspector is run, which creates a new ActiveRecord scope for that request. The constructor for SessionScope will see if there’s already a SessionScope stack to add itself to, and it will do so by looking at the configured IThreadScopeInfo implementation. Our implementation will be able to use an ASP.NET context, a WCF context, or a thread context to store this stack in. This means that any time we need to look up the current session, which is done all over the place in the ActiveRecord framework, we’ll be able to find the current session scope or create one if necessary. When the request is ended, the session scope is disposed of and pending updates are committed to the database. If you had new’ed up any child session scopes on the stack, you’d of course be responsible for disposing of those properly as well.

Well, there you have it. Making WCF and ActiveRecord best friends is fairly straightforward, and lets you learn about the inner workings of NHibernate, ActiveRecord, and the WCF pipeline all at the same time. Have fun!

One does not simply… switch to WCF

In my last blog post, I talked about the nature of new code, and how it sometimes spawned a spontaneous redesign of existing architecture to avoid future technical debt.  In my case, laying the initial foundation for mobile and tablet apps caused me to rethink my web service layer.  Since then, I’ve been spending a lot of time going through the KitchenPC API and making sure everything is solid, well designed, consistent and flexible.  After all, once mobile and tablet apps have been built, it will be difficult to change the interface contracts.

Unfortunately, I seem to have fallen down a rabbit hole and signed up for a pretty extensive re-design of the KitchenPC back-end.  I ran into conflicts between my ideal API design and the limitations of ASP.NET Web Services.  In the end, I decided to upgrade the KitchenPC web service layer to use a more modern framework, Windows Communication Foundation (WCF).

Turns out, one does not simply… switch to WCF.

On the outside, ASP.NET Web Services (using either SOAP or JSON) and a WCF service are similar in nature, especially when running under IIS.  An ASP.NET Web Service is an .ASMX file that points to an implementation of a class derived from System.Web.Services.WebService.  Publicly exposed methods of this class are tagged with [WebMethod] and accept and return serializable data types.  ASP.NET is able to generate a JavaScript proxy on the fly, allowing client-side browser code easy access.  In WCF, an .SVC file points to a Service contract, which is any class that’s tagged with the [ServiceContract] attribute.  Often, this is done by implementing an interface tagged with said attribute.

The fact that an ASP.NET Web Service extends a common base class, WebService, and a WCF Service does not, is an important consideration when porting existing code.  Any methods that depend on the ASP.NET request pipeline, such as references to Context, Session, or Application, will need to be reconsidered.  This means you cannot depend on things like cookies,  HTTP headers, IP addresses, identity management, and other things users of the ASP.NET framework take for granted.

To understand why, let’s take a step back and look at the design behind WCF.  WCF was implemented as a protocol agnostic communications framework.  Nowhere does WCF assume that HTTP is being used as a transport mechanism over the wire.  One could be using raw TCP/IP, or perhaps named pipes, shared memory, or a USB controlled carrier pigeon launcher.  Nowhere does WCF assume it’s even running on a web server like IIS.  A WCF endpoint could be running within a console application, or a Windows Service.  Many of these architectures have no concept of things like HTTP headers, session state management or other things associated with the web world.  Though a pigeon could probably carry a cookie if it were only a few bites.

For this reason, within the context of a WCF operation, you’re working with very generalized notions of communication.  I was definitely bitten by this “limitation” while porting my web services over, as I use session cookies to communicate authentication information used for user specific service calls.

Those who want to quickly port their web services over to WCF will, however, be pleased to know that the designers of WCF foresaw this radical paradigm shift and took pity.  There exists a compatibility mode that bridges the ASP.NET pipeline with WCF, and permits information sharing between those two very different worlds.  This compatibility mode can be enabled in your web.config file by using:

<system.serviceModel>
   <serviceHostingEnvironment aspNetCompatibilityEnabled="true" ... />
</system.serviceModel>

When running in this mode, you’ll be able to access things such as HttpContext.Current and session state.  If you’re interested in how this works under the covers, there’s a great blog post here.  This is most definitely an approach for those building services that never need to run outside the context of a web server or support protocols other than HTTP.  However, I decided to really dive into the WCF world and design a fully independent KitchenPC API and not take the easy way out.  Either that, or the fact that I couldn’t get ASP.NET compatibility working under IIS when running in Integrated Pipeline mode and I got frustrated.  But let’s go with the first thing, it sounds more noble.

Luckily, getting rid of cookie dependencies was fairly straightforward.  My web services already took a ticket string which contained a serialized authentication ticket.  However, this parameter was optional if the HTTP header had a cookie value set containing the same information, as I didn’t want to pass in that potentially long ticket twice in each HTTP request.  The ticket was also never returned back with the Logon web method, since it just set the cookie in the HTTP response.  The new design requires a session ticket to be supplied on sensitive web service calls, and it’s up to the client to store that ticket for future calls.  In my mind, this is simpler and more straightforward.  Web, mobile and tablet will all call the API in the same way, and enforcing these parameters becomes less complicated in terms of code.

You might be asking me, at this point, what the actual benefit of switching to WCF is.  After all, the web services worked just fine before, right?  Do I really need a protocol agnostic KitchenPC service, when in reality HTTP will be the primary, if not only, means of access?  Well, the answer is flexibility.  Massive amounts of flexibility.  Every single last component of WCF is fully customizable or replaceable, from message inspection, to fault handling, to serialization.  Don’t like something about how your service works?  Change it.  And that’s where WCF starts to get extremely complicated, as I’ve been learning over the past few days.  That’s also why I’ve decided to write a couple more WCF posts, summarizing how I’ve tweaked WCF to build the KitchenPC web service I’ve always wanted.

Stay tuned!

Without Exception

Part of being an entrepreneur is getting used to the fact that work is infinite in nature.  Actually accomplishing something doesn’t reduce the number of items in your to-do list, it simply creates more work.  The more productive you are, the more work will be created, causing an annoying feedback loop similar to putting a microphone too close to the speaker.

I was reminded of this over the long Thanksgiving weekend while hacking together the first few KLOCs of code that will ultimately become the common mobile library for KitchenPC.  This is a UI-agnostic API that will eventually power things like Windows Phone, Surface, Windows 8 apps, and other devices.

The main function of this library is to abstract client-to-server communications between a mobile or tablet device and the server.  It works similarly to a web service proxy, like one that Visual Studio will generate for you from a WSDL contract.  However, I wanted to write one from scratch because, one, that’s the way I roll, and two, I can write a version that’s faster, more portable, and more lightweight.

It’s written on the CoreCLR (which has been ported to various platforms, including iOS and Android) which provides only a subset of the raw power of the full .NET Framework.  I think developing a core KitchenPC framework on this platform can ease development for KitchenPC apps on various devices in the future, and also some day evolve into a full-fledged open-source KitchenPC SDK.  I wanted to make the code pure, without any outside dependencies, and as portable, easy to manage and distribute as possible.

While basic WCF libraries are available on the CoreCLR, they’re prone to a few problems.  One, they only implement SOAP transports.  While KitchenPC can speak SOAP, the JavaScript libraries use JSON to communicate with the server, which is much less verbose and more efficient.  Two, my initial attempts to get that code running on the Mono framework were far from successful.  Even Xamarin has marked the WCF Stack as Alpha Quality.  If I wanted something truly efficient, truly cross-platform, and truly low-level, I was just going to have to write it myself.  So that’s what I did.

So, now you’re aware of the amount of work I signed up for, here comes the part where it spawned into more work, like crazed bunnies in the spring.

The KitchenPC web services are built with the idea in mind that return values should always be valid and indicate a successful result.  Errors should be handled as exceptions, which are serialized through SOAP faults or JSON exceptions.  This was the design decision I made, simply because it seemed cleaner to me.  I could just throw exceptions at any place in the stack, without having to litter try/catch blocks all over the place.  I’ve used a bunch of web-based APIs that return things like Result, which have a Success property indicating if the call was successful, an Error property with error information, etc.  The Exchange Server API is designed in this manner.

Well, turns out, when the ASP.NET Script Services throw an exception and that exception information is serialized out to the client, the resulting HTTP code is 500 (Internal Server Error).  This probably makes sense, as you’d want to be able to trap that code and handle that condition as an error.  In HTTP terms, the result looks like this:

HTTP/1.1 500 Internal Server Error
Cache-Control: private
Content-Type: application/json; charset=utf-8
Server: Microsoft-IIS/7.5
jsonerror: true
X-AspNet-Version: 4.0.30319
X-Powered-By: ASP.NET
Date: Sat, 24 Nov 2012 22:42:33 GMT
Content-Length: 91

-- JSON serialized version of your Exception object

Now, with the CoreCLR WebClient libraries, to my knowledge (and after many hours of trying), there is simply no way to actually get access to that serialized JSON exception in the response.  Once WebClient sees an HTTP 500, it throws a generic WebException.  Basically, you’d do something like:

WebClient client = new WebClient();
client.Headers["Content-Type"] = "application/json; charset=UTF-8";

client.UploadStringCompleted += delegate(object sender, UploadStringCompletedEventArgs e)
{
   if (e.Error != null)
   {
      // e.Error will be a generic WebException, with nothing helpful of any sort
   }

   // On a successful call, e.Result will contain the HTTP response
};

client.UploadStringAsync(new Uri(url), json);

If the web service call failed, an HTTP 500 would be returned, which would cause e.Error to be set to a generic WebException object. The Message property would say something like “NotFound”, or something unhelpful. If you tried to access e.Result, you’d get another exception. Great.  They made it as hard as possible to do anything low-level with this API.

I tried every way I could think of to intercept the actual HTTP response before an exception is thrown (writing all sorts of really messy code), but the framework was just not having it. It’s no wonder so many applications say things like “An error has occurred. Please try again later.” – It’s bloody impossible to ever get the information you need!

I suppose the only way around it would be to re-implement the HTTP protocol at the socket level, but who has that kind of time?

Without access to this exception information, I could not do things like tell the user their password was wrong, or if a menu could not be created and why.  Sure, I could guess what the error most likely was, but that’s lame.

Even if there were a way around this, I ran into another issue.  Through the course of debugging this mess, I discovered a major bug on KitchenPC that only repros on production.  Even if I had been able to parse the response, there was no exception data in it to read.  The HTTP response looked like:

{"Message":"There was an error processing the request.","StackTrace":"","ExceptionType":""}

That’s right, the production KitchenPC server was eating exception information, which I rely on for various site UI functionality (such as logon errors, duplicate menu names, registration issues, etc)!  All this code was broken, and I hadn’t even noticed since I do very little testing in production.

Turns out, ASP.NET will only serialize exception information if you have <customErrors mode="Off" /> set in web.config, and debug mode enabled. This, of course, exposes all sorts of other debug information with site crashes as well, which you may or may not want your users to have access to. There doesn’t appear to be a way to allow ASP.NET web services to serialize certain exceptions, without running in debug mode, since the creators of ASP.NET saw this as sensitive data, and definitely not something you’d be depending on for your UI to function properly!  Uh oh.

During the KitchenPC beta, I just ran the code in debug mode as I actually wanted detailed errors to be displayed, logged, etc.  It was a beta product, I needed to log things, trap things, and if something happened, I wanted users to be able to report as much information as they could on it.  Thus, I never noticed this design was potentially flawed in production use.  I don’t think it was until I switched to .NET 4.5 and started using web.config transforms that I really had a solid production configuration.  So, many of the error handling features have been broken on my site for the past few weeks.  Sigh.

I knew the only approach was to re-think how exceptions are handled.  First, exceptions should be exceptional.  An exception should be thrown if and only if that condition was completely unexpected.  Things like bad passwords and duplicate user names are perfectly reasonable, and thus should be communicated with a valid return object and an HTTP 200 result.

I then spent all of Sunday night revisiting every single API, ripping out the exception throwing code and implementing a new base return type which has properties such as Error.  Ugh, I’ve now implemented the exact design I said at the beginning of this post that I hated so much, but I guess that’s how things must work.

It’s of course very important to get all these API issues worked out now rather than later.  Once I release mobile and tablet apps, changing APIs can be quite a hassle.  With web code, I can simply change the JavaScript files and invalidate the CDN cache when I deploy changes.  With mobile applets, I really don’t want to have to force users to upgrade to a new version to continue running their app.

These new changes have been rolled out to production, so there should be no more missing error messages when you enter the wrong password.  Now, all web service calls return either a Result object, or an object that inherits from Result.  This makes the KitchenPC API much more consistent, and allows me to handle result errors internally within my proxy so that I can throw exceptions when the result contains an error.
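
As a rough sketch of that design (not the actual KitchenPC code, which isn’t shown here), a base Result type and a client-side helper that turns a returned error back into an exception might look like the following.  The LogonResult, ServiceCallException, and ResultProxy names, and the idea that an empty Error means success, are assumptions made for the example.

```csharp
using System;

// Hypothetical types; the actual KitchenPC Result class is not shown in this post.
public class Result
{
    // Null or empty when the call succeeded.
    public string Error { get; set; }

    public bool Succeeded
    {
        get { return string.IsNullOrEmpty(Error); }
    }
}

public class LogonResult : Result
{
    public string Ticket { get; set; }
}

public class ServiceCallException : Exception
{
    public ServiceCallException(string message) : base(message) { }
}

public static class ResultProxy
{
    // Client-side helper: the server always answers HTTP 200 with a Result,
    // and the proxy re-raises any error it carries as an exception.
    public static T Unwrap<T>(T result) where T : Result
    {
        if (result == null)
            throw new ArgumentNullException("result");

        if (!result.Succeeded)
            throw new ServiceCallException(result.Error);

        return result;
    }
}

public static class ResultDemo
{
    public static void Main()
    {
        LogonResult ok = ResultProxy.Unwrap(new LogonResult { Ticket = "abc123" });
        Console.WriteLine(ok.Ticket);

        try
        {
            ResultProxy.Unwrap(new LogonResult { Error = "Invalid password." });
        }
        catch (ServiceCallException ex)
        {
            Console.WriteLine(ex.Message);
        }
    }
}
```

The server never has to throw for an expected condition, yet code calling the proxy can still use plain try/catch, so the UI gets the real error message either way.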

I think overall, it was a good redesign.  Now, maybe I can actually get some work done.

A Better Way To Deploy (Technical Debt Part 5/3)

During the last month, I’ve been working on paying off some of the technical debt that has been collecting interest over the course of KitchenPC’s development. Most of these have been stability changes, or just things that, when fixed, remove roadblocks to a smoother development process. I’d like to go over these in a three-part blog series that, like the Hitchhiker’s Guide to the Galaxy trilogy, will consist of five parts.

Part 5/3: A Better Way To Deploy

During the beta, I was more focused on code and features than boring process and tools.  One of the things I never got around to was any sort of modern deployment system to push new changes out to production.  This wasn’t a huge deal, since I only had a single web server and didn’t have enough traffic to really care if I had to bring down the site for a couple minutes.

The original technique for pushing out changes is rather embarrassing, but I’ll share it with you anyway.  I’d go out on a limb and say there are plenty of entrepreneurs who do the exact same thing.

First, I ran MSBuild (I had previously used NAnt but switched to MSBuild for a variety of reasons) to build all the individual KitchenPC projects.  The MSBuild script would also crunch the CSS and JavaScript files using the YUI Compressor.  Everything would be copied out to a directory called /Build which had a subdirectory for binaries, a subdirectory for site content, and a subdirectory for the KitchenPC Queue service which also runs on the web server.  Building the entire site would take around 20 seconds.

Next, I’d zip up everything.  If I was making a lot of changes to the site, I’d just zip up the entire Build directory.  However, sometimes I was just making some script changes or binary changes, and would just zip up the files that were changed.  Then, I’d Remote Desktop (we need a better transitive verb for that) into the production web server, and transfer my files over.  Originally, I put them up on my own private web server running off my home network and just used wget to download the file, but later on I started using FTP instead and manually uploaded the files to the web server.

From the desktop, I’d unzip the package and copy out the files, usually running an iisreset when everything was in place.

So, in other words, everything was manual.  It would usually take five or ten minutes to get everything deployed.

After launch, I finally had time to think about this process and devise a better solution for pushing changes out.  Ideally, I wanted to do this without having to Remote Desktop into production web servers, and also be able to push changes out with a single command line tool.  A bonus would be the ability to not have to take the site down, lose any traffic, or reset IIS when there were no binary changes being deployed.

On the Microsoft technology stack, WebDeploy is the cutting edge solution for doing all of the above, and more.

WebDeploy is a protocol that’s able to compare a set of files on a source and destination, and synchronize those files automatically.  It can be used for synchronizing files between two web servers (making the provisioning of more web servers extremely efficient) or synchronizing files between a development or staging environment and a production web server.  It could also be used to create backups of a web server and its configuration by dumping an entire IIS application to a ZIP package, or a raw directory.  It’s far superior to FTP, as it’s smart enough to compare the timestamps on files to figure out exactly what has changed between a source and destination, and only transfer that content over the wire. WebDeploy is used on Windows Azure, as well as the Microsoft WebMatrix platform, and also integrated with recent versions of Visual Studio. Basically, WebDeploy is where it’s at.
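
To illustrate the timestamp idea only, here is a minimal sketch of an incremental copy between two local directories.  This is not how WebDeploy is actually implemented (it works against a remote agent and handles far more than files); the TimestampSync class and its method names are invented for the example.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class TimestampSync
{
    // Returns the relative paths under sourceDir that are missing from destDir
    // or have a newer last-write time than the copy in destDir.
    public static List<string> FindChangedFiles(string sourceDir, string destDir)
    {
        var changed = new List<string>();
        string root = Path.GetFullPath(sourceDir)
            .TrimEnd(Path.DirectorySeparatorChar, Path.AltDirectorySeparatorChar);

        foreach (string sourceFile in Directory.GetFiles(root, "*", SearchOption.AllDirectories))
        {
            string relative = sourceFile.Substring(root.Length + 1);
            string destFile = Path.Combine(destDir, relative);

            if (!File.Exists(destFile) ||
                File.GetLastWriteTimeUtc(sourceFile) > File.GetLastWriteTimeUtc(destFile))
            {
                changed.Add(relative);
            }
        }

        return changed;
    }

    // Copies only the changed files, creating destination folders as needed,
    // and returns how many files were transferred.
    public static int Sync(string sourceDir, string destDir)
    {
        List<string> changed = FindChangedFiles(sourceDir, destDir);
        foreach (string relative in changed)
        {
            string destFile = Path.Combine(destDir, relative);
            Directory.CreateDirectory(Path.GetDirectoryName(destFile));
            File.Copy(Path.Combine(sourceDir, relative), destFile, true);
        }
        return changed.Count;
    }
}
```

Calling TimestampSync.Sync on a build directory and a target directory copies everything the first time; after that, only files that are new or have a newer timestamp get copied, which is the property that makes this style of deployment so much quicker than re-uploading a ZIP.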

WebDeploy runs as an IIS extension, and can be installed easily using the Microsoft Web Platform Installer.  When installed on a machine without IIS (such as a development box), only the client-side tools (such as msdeploy.exe) are installed, so changes can be deployed from the command line to a remote IIS box.  Also, Visual Studio 2010 and above will install msdeploy.exe automatically, since the Publish feature in Visual Studio calls msdeploy.exe under the covers when publishing a site using WebDeploy.

As it would not make a very good blog post to walk you through setting up WebDeploy on your web servers, I’ll refer you to this page which will walk you through the steps.

If you’re running Visual Studio 2010 or 2012, setting up Web Deploy on your web server is really all you need to do.  Once you configure WebDeploy on a site, it will export a settings file which can be used directly in Visual Studio.  However, since I’m a command line freak (plus, I was still running VS2008 at the time,) I decided to integrate WebDeploy into my MSBuild scripts instead, and then be able to build and deploy changes in a single command.

To use MSDeploy from within an MSBuild file, you’ll first want to set up some properties within your project.  These properties should be defined directly under the <Project> tag so they’ll be global to all tasks.  I define a few properties to hold various publishing settings, and support the ability to publish to both a staging and production web server:

  <PropertyGroup>
    <ArchiveDir>$(MSBuildProjectDirectory)\$(BuildDir)</ArchiveDir>
    <AppHost>KitchenPC</AppHost>
    <UserName>publish_acct</UserName>
    <Password>secret</Password>
  </PropertyGroup>

  <Choose>
    <When Condition="'$(Configuration)'=='Staging'">
      <PropertyGroup>
        <PublishServer>https://staging</PublishServer>
      </PropertyGroup>
    </When>
    <When Condition="'$(Configuration)'=='Release'">
      <PropertyGroup>
        <PublishServer>https://www.kitchenpc.com</PublishServer>
      </PropertyGroup>
    </When>
  </Choose>

Let’s go through each of these properties:

ArchiveDir: For me, this is simply where I build my site to.  The directory structure needs to be identical to the way MSDeploy exports your site, so you can run:

msdeploy -verb:sync -source:appHostConfig="Default Web Site",computerName=server -dest:archiveDir=c:\Test

This dumps your site to a temporary directory on your hard drive so you can inspect the layout.  Replace Default Web Site with the name of your site in IIS, and replace server with the computer name or IP address of the web server.

For me, it puts all the files in /Content/c_C/Website but this might be different for you.  It’s also possible MSDeploy has the ability to define exactly how the folder layout should look, but I’m not yet enough of an expert to know.  Also note that Visual Studio will do all this packaging for you.

AppHost: This is the name of your application in IIS.  This is probably Default Web Site if you haven’t mucked with the default IIS settings, but for me this is KitchenPC.

UserName: This is the username MSDeploy will try to connect to your web server with.  This might be Administrator if you haven’t granted anyone else access, but it’s advisable to set up a user account just for publishing.

Password: This is the password for the username above.

It may or may not be necessary to hard code all these publishing settings into your MSBuild script.  Visual Studio is able to store these settings in a settings file, but from what I can tell, those settings are extracted and passed directly into MSDeploy.exe when you publish.

I also have a conditional property called PublishServer which changes depending on the build configuration.  If I’m building the Staging configuration, I’ll publish to my local staging server.  If I’m building in Release mode, I’ll deploy to production.

Next, you’ll need to define a new Target for your deploy task:

  <Target Name="Publish" DependsOnTargets="Build">
    <Message Importance="High" Text="Deploying to $(PublishServer)" />

    <Exec
      WorkingDirectory="C:\Program Files\IIS\Microsoft Web Deploy V3\"
      Command="msdeploy.exe -verb:sync -source:archiveDir=$(ArchiveDir) -dest:appHostConfig=$(AppHost),computerName=$(PublishServer):8172/msdeploy.axd,authType=Basic,userName=$(UserName),password='$(Password)' -allowUntrusted" />
  </Target>

Notice this target depends on the Build target, which compiles the site with the proper configuration, crunches all the CSS and JavaScript files, and moves everything to the correct output directory using the right folder hierarchy.

It then simply executes the msdeploy.exe command with the appropriate command line arguments. Note if you’re using a self-signed SSL certificate on your server (which the Web Platform Installer will create by default), you’ll have to use the -allowUntrusted command line parameter as well. Hopefully, with a bit of tweaking, this will get you started on your own solution!

What About Web.config files?

Glad you asked!  Another very cool thing about Visual Studio 2010 and above is web.config transforms.  In the past, I just had separate files for various web.config files (such as prod.web.config and staging.web.config), and my MSBuild script would copy the correct one to the build output directory.  When I zipped up the site, the production web.config would be included.

The correct way to do this is using a transformation, which is an XML file (written in the XDT, or XML Document Transform, syntax; similar in spirit to XSLT) that contains instructions on transforming your default web.config file to a file suitable for a specific environment.  Visual Studio takes care of all of this for you if you’re using a Web Application Project, allowing you to right click on your web.config file and add a new configuration.

However, the actual code for doing these transforms is neatly packaged up in managed libraries, and can be accessed as an MSBuild task.  Since, again, I’m a command line freak this is the approach I took.

The first step is to reference the Microsoft.Web.Publishing.Tasks.dll assembly in your MSBuild script.  This can be done by adding the following line directly under the <Project> node:

  <UsingTask TaskName="TransformXml"
    AssemblyFile="C:\Program Files (x86)\MSBuild\Microsoft\VisualStudio\v11.0\Web\Microsoft.Web.Publishing.Tasks.dll"/>

I just hard coded in the path, but you could easily make this an environment variable, or copy the DLL to a more appropriate directory.

Once this assembly is referenced, you can start using build tasks defined within that library. This is ridiculously easy:

    <TransformXml
      Source="KPCServer/web.config"
      Transform="KPCServer/web.$(Configuration).config"
      Destination="$(DeployDir)/web.config" />

Basically, we take the default web.config checked into your source tree, run the appropriate transformation against it (such as web.staging.config or web.release.config) and output the file to the correct output directory; in my case, the deployment directory that MSDeploy picks up.

I won’t get into the details about building these transformations, since there’s tons of documentation online.

Once this task is added into your build target, then we’ve nearly got a complete build system in place.  Running the deploy target will first build your site, copy and transform your web.config file over, and run MSDeploy.exe to publish your site (copying only the files that have changed, of course!)

There’s one more thing we want to do: minify the CSS and JavaScript files.

Crunch Time!

Previously, I used YUI Compressor, a Java utility developed by Yahoo! to minimize JavaScript and CSS files.  It’s fantastic and pretty much the standard in the industry.  I would just execute this utility directly from an MSBuild script using an Exec task and call it good.

However, this time I decided to try out the .NET port of YUI Compressor, which is called YUICompressor.net.  The author claims the algorithm is identical, and the results should be nearly byte-for-byte the same.  It can be easily installed with NuGet and it also has MSBuild tasks included.

One drawback is the documentation is wildly out of date!  I had to spend a lot of time in ildasm digging through the assembly to figure out what was going on.  Hopefully, I can provide some better documentation here for people who are attempting to integrate YUI Compressor .NET into their MSBuild scripts.

The first step is to reference the build tasks.  There are two of them, one for JavaScript minification and one for CSS files:

  <UsingTask
    TaskName="JavaScriptCompressorTask"
    AssemblyFile="packages\YUICompressor.NET.MSBuild.2.1.1.0\lib\NET20\Yahoo.Yui.Compressor.Build.MsBuild.dll" />
  <UsingTask
    TaskName="CssCompressorTask"
    AssemblyFile="packages\YUICompressor.NET.MSBuild.2.1.1.0\lib\NET20\Yahoo.Yui.Compressor.Build.MsBuild.dll" />

These directories happen to be where NuGet installed the binaries, but you can probably put them wherever.

I then made two separate MSBuild targets, one for JavaScript and one for CSS.  We’ll go through the JavaScript target first:

  <Target Name="CrunchJs">
    <!-- Select all kpc*.js files -->
    <CreateItem Include="$(DeployDir)/scripts/kpc*.js">
      <Output TaskParameter="Include" ItemName="JsCrunch" />
    </CreateItem>

    <!-- Copy them to .debug files since JavaScriptCompressorTask cannot read and write to the same file -->
    <Copy SourceFiles="@(JsCrunch)" DestinationFiles="@(JsCrunch->'%(Identity).debug')" />

    <!-- Compress the .debug files, and delete them afterwards -->
    <JavaScriptCompressorTask
      SourceFiles="%(JsCrunch.Identity).debug"
      ObfuscateJavaScript="True"
      EncodingType="Default"
      LineBreakPosition="-1"
      LoggingType="None"
      DeleteSourceFiles="Yes"
      OutputFile="$(DeployDir)/scripts/%(Filename)%(Extension)"
      />
  </Target>

First, we select all the files we want to crunch. In my case, this is anything in the output directory that matches kpc*.js (as I give all my script files a kpc prefix). Next, we copy each file to a new file with .debug appended to the end. For example, we copy kpccore.js to kpccore.js.debug and kpchome.js to kpchome.js.debug. We do this because YUI Compressor .NET is not capable of using the same filename for both the input and output files, and I want to simply compress all the files in my build directory.

Lastly, we run the JavaScriptCompressorTask task. Note YUI Compressor .NET can only handle a single output file. While it can handle an enumeration of items for the SourceFiles parameter, it can only output them as a single file. For example, I can pass in kpccore.js, kpchome.js, kpcmenu.js, and all my other files as the SourceFiles parameter. However, it will not output separate crunched files. It will output everything to a combined file, such as kpcall.js. I don’t really like this, since I don’t need to include kpchome.js on any page besides the home page, or kpcqueue.js on any page besides the queue page.

For this reason, I use MSBuild batching to call the JavaScriptCompressorTask task multiple times, once for each item in JsCrunch. This is indicated by the percent sign syntax, %(JsCrunch.Identity), rather than @(JsCrunch), which would send in all the files as a single input parameter.

So, I run the JavaScriptCompressorTask on each file within the JsCrunch item set, appending the .debug suffix to the name since the input file can’t be the same as the output file. I also add the DeleteSourceFiles parameter to delete the temporary .debug files after I’m done.

The CSS task looks very similar:

  <Target Name="CrunchCss">
    <!-- Select all *.css files -->
    <CreateItem Include="$(DeployDir)/styles/*.css">
      <Output TaskParameter="Include" ItemName="CssCrunch" />
    </CreateItem>

    <!-- Copy them to .debug files since CssCompressorTask cannot read and write to the same file -->
    <Copy SourceFiles="@(CssCrunch)" DestinationFiles="@(CssCrunch->'%(Identity).debug')" />

    <!-- Compress the .debug files, and delete them afterwards -->
    <CssCompressorTask
      SourceFiles="%(CssCrunch.Identity).debug"
      EncodingType="Default"
      LineBreakPosition="-1"
      LoggingType="None"
      DeleteSourceFiles="Yes"
      OutputFile="$(DeployDir)/styles/%(Filename)%(Extension)"
      />
  </Target>

I’ll let you figure that one out, since it’s basically a copy of the JavaScript task, only using the CssCompressorTask task instead.

The last step is running this target after I build. I add the following lines to the end of my Build target:

    <!-- Crunch Files -->
    <Message Importance="High" Text="Crunching JavaScript/CSS Files..." />
    <CallTarget Targets="CrunchJs;CrunchCss" />

This basically instructs the Build target to call both the CrunchJs and CrunchCss targets after it finishes, minifying the JavaScript and CSS files that have been copied to the build output directory.

Wrapping Everything Up in a Bow:

Since I use PowerShell for my build environment, I’ve also created a new PowerShell function to build and deploy the site to either staging or production:

function Publish
{
   if ($args.Count -gt 0)
   {
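     switch ($args[0].ToUpper())
     {
        "STAGE"   { msbuild .\build.xml /t:Publish /p:Configuration=Staging /verbosity:minimal }
        "RELEASE" { msbuild .\build.xml /t:Publish /p:Configuration=Release /verbosity:minimal }
        # Anything else would otherwise fall through silently
        default   { Write-Host "Unknown build '$_'. Please specify either STAGE or RELEASE.`n" }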
     }
   }
   else
   {
      Write-Host "Please specify either STAGE or RELEASE build.`n"
   }
}

Now, I can simply type Publish Stage or Publish Release at the command prompt to run the Publish target in build.xml.  This will build everything in the appropriate configuration, transform the correct web.config and copy that to the deployment directory, minify everything using YUI Compressor .NET, and then publish that whole package to the correct server using MSDeploy.exe.  Magic!

Hopefully some of this has inspired you to take a look at your current deployment strategy and think about some improvements.  What I have is far from perfect, but it was a solid weekend of work and far superior to what I had before.  Combined with the fact that my app starts nearly immediately now, and the fact that only modified files are copied out, I can now deploy new versions of the site extremely quickly without losing any traffic.  Requests that are coming in while I deploy would probably just see a couple second lag while the new binaries are loaded and the app pool recycles.

Out with the old, in with the new (Technical Debt Part 4/3)

During the last month, I’ve been working on paying off some of the technical debt that has been collecting interest over the course of KitchenPC’s development. Most of these have been stability changes, or just things that, when fixed, remove roadblocks to a smoother development process. I’d like to go over these in a three-part blog series that, like the Hitchhiker’s Guide to the Galaxy trilogy, will consist of five parts.

Part 4/3: Out with the old, in with the new

I’ve been using Visual Studio 2008 since the very beginning of KitchenPC.  It’s fast, it’s stable, it works.  When Visual Studio 2010 came along, I used it briefly but just couldn’t convince myself to make the switch.

Visual Studio 2010 suffered through a similar development cycle as Project Server 2007, the application I worked on at Microsoft for a solid chunk of my career there.  Previously, Project Server was written as an ASP website (Yes, ASP, no not ASP.NET) and had a very mature code base mostly built in JScript (Yes, JScript, no not VBScript.)  We decided to rewrite the entire server in managed code, while at the same time, cramming as many new bells and whistles into the product as we possibly could.  We decided to do all of this during the Office 2007 time frame, a two-year release cycle where most products weren’t changing anything huge, and certainly not being rewritten from scratch.

The result?  A giant disaster.  The product was unstable, buggy, most innovative features ended up getting cut anyway, and the new features that did go in were poorly thought out and implemented.  On the plus side, we had a brand new, more modern code base.  Albeit, that code base took a lot of shortcuts, ensuring that down the road, backwards compatibility would come at the cost of being stuck with a poorly laid out set of APIs.  The amount of technical debt accrued during that release would make China shudder.

Visual Studio 2010 went through a similar release.  The product was rewritten as a WPF app using managed code.  They brought in two architects, both of whom came up with wildly unrealistic estimates for what could be done.  They spent years on this redesign, getting nowhere.  The result was a release that was unstable, buggy, slow, and really didn’t do anything new besides its support for the .NET 4.0 framework.  When I tried VS2010, I ran into nothing but problems.  Sometimes it would act like the shift button was stuck down, and highlight text when I moved the arrow keys.  Often times I’d be editing an ASPX file, and the entire IDE would just spontaneously vanish.  It was noticeably slower, and some larger files would just grind the entire thing to a halt, forcing me to do edits in Notepad2 instead.  I never quite did figure out how to debug both .NET 4 server code and JavaScript code at the same time; I would have to detach the debugger from the web server and reattach it to Internet Explorer, wasting all sorts of time.  In the end, this product was a huge step backwards from the 2008 release, which simply worked.

However, years passed and we’ve now moved on to the .NET 4.5 framework.  A lot of the open source projects I use now require at least .NET 4.0 to even compile.  I find myself really wanting to take advantage of some of the new language features, and when I start doing Windows Phone or Surface development, I’m going to have to upgrade anyway.  So I figured I would give Visual Studio 2012 a shot.

Wow, am I impressed with this release.

This IDE is beautiful, it works, it’s slick and it’s fast.  You can tell they really took the time to pay off some of their own technical debt.  It no longer seems clunky, it no longer crashes, and the debugger works flawlessly.  It has all sorts of great new features, like the ability to pin a pane to any monitor, better code searching, and improved JavaScript support.  I’ve also been taking advantage of new build features, which I’ll go into in more detail in the next post.

Even more impressive was the fact that I could open my VS2008 Solution file and it was upgraded on the fly to target the .NET 4.5 framework, and everything compiled successfully on the very first try.  What I thought was going to be an evening of work turned out to be a few minutes.

I also took the time to install Windows 8 on my dev box, which I’m equally happy with.  It’s also a fast and stable environment that looks nice and is easy to use.  The new metro-style start menu is a bit weird, but it actually works fairly well when you’re using two monitors.  You’ll only get the start menu on one screen, and you can click on the second monitor to make it go away.  Your shortcuts in the launch bar are duplicated on both monitors, so you can use those to quickly launch applications if that’s what you’re used to.  I’m also getting fairly comfortable with the Win+Q key to find and launch programs, which works almost as well as Spotlight on OS X.  I also really enjoy the panoramic themes, which are collections of beautiful images that span multiple monitors.

I think upgrading to Windows 8, .NET 4.5 and Visual Studio 2012 sets me up for a much better development environment for the KitchenPC web site, as well as mobile and tablet apps in the near future.  Needless to say, if there’s anyone reading from either the Windows 8 team or the Visual Studio 2012 team: fantastic work, and keep it up!

Static files? Not on my server! (Technical Debt Part 3/3)

During the last month, I’ve been working on paying off some of the technical debt that has been collecting interest over the course of KitchenPC’s development. Most of these have been stability changes, or just things that, when fixed, remove roadblocks to a smoother development process. I’d like to go over these in a three-part blog series that, like the Hitchhiker’s Guide to the Galaxy trilogy, will consist of five parts.

Part 3/3: Static files? Not on my server!

After implementing my new session management system, one of the main problems was that I was stuck with really big cookies – somewhere over 700 bites, err bytes.  That’s a lot of data to be sent on each request, including requests to static script files, images, CSS files, fonts, and other web resources that have no need for the session data stored in each cookie.  Now, after I posted the last blog I ended up writing my own custom serialization code for the session data, which got the cookie size down from 700+ bytes to 77 bytes (well, depending on the size of your user name,) so that’s a huge win.  However, I still wanted to implement a system that would not transmit cookies over the wire on each request.  For this reason (and others,) I decided to move all my static files over to a CDN.

A CDN, according to Wikipedia, is a “large distributed system of servers deployed in multiple data centers in the Internet.”  More specifically, it’s a network of servers that can easily store static files for you so that you don’t have to house them on your own web servers, which are busy with dynamic content, database queries, finding out what you can do with 47 ounces of asparagus, etc.  This network is usually built using servers across dozens of locations.  There are two huge advantages of this geographical diversity.  First, if one node goes down due to a wind storm, bird strike, or someone tripping over a power cord (they told me they were going to tape down those cords!), then your content can be served up from elsewhere.  Second, requests can automatically be routed to data centers near-by, which will result in faster load times.  In other words, someone in Hong Kong visiting KitchenPC will download graphics and script files from servers near Hong Kong, and someone in Seattle visiting KitchenPC will download those same graphics and script files from servers near Seattle.  You get the idea.

Finding the Right CDN For You

There are tons of different CDN companies out there.  However, not all of them directly deal with end users.  For example, EdgeCast is a huge provider, but their minimum package is a thousand gigs of data for around $300 a month.  Way more than a small website needs.  For this reason, there are many smaller companies that simply buy up bandwidth at wholesale prices from the big guys, and resell it in smaller chunks to people like me.  The advantage of going through one of these companies is that you have no minimum to buy; your monthly bill can literally be like 88 cents.

Rackspace CloudFiles

The first CDN I checked out was Rackspace, whose solution is called CloudFiles.  They’re built on Akamai CDN, and were desirable because I already host my web servers through Rackspace, and having fewer services billing me each month is always a good thing.  I have been using CloudFiles already for storing recipe images.  During the beta version, if you uploaded either a profile picture or a recipe image, that file would go straight to CloudFiles and be served from there.  However, when I checked into CloudFiles for hosting my site’s static content, I was hugely disappointed.

First, they don’t support origin-pulls.  An origin-pull, in CDN vernacular, is the ability to have the CDN download the file from your server the first time it’s requested, then cache it from that point on.  In other words, if someone requests http://cdn.kitchenpc.com/images/logo.png, and that particular node in the CDN doesn’t yet have that file, it will download the file from its origin, which is http://www.kitchenpc.com/images/logo.png.  This greatly simplifies things, as you don’t need to push out all your content ahead of time, and push it out again whenever things change.  Using Rackspace, I’d have to set up a new publishing process that copies all my files out to CloudFiles using their proprietary API (they can’t even be bothered to support something standard, such as FTP) – which means I have to either copy everything (very slow), or keep track of exactly what files were changed since I last published.  No thanks.

With a CDN supporting origin-pulls, I can simply deploy my changes out to my web server (which I already have scripts to do) and then call a single API which invalidates the CDN cache, causing the CDN to pull the modified files from my server once again.  For me, this is a huge win.
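
To make the origin-pull idea concrete, here is a minimal sketch of what a single CDN node does conceptually.  The OriginPullCache class, its fetch delegate and the OriginHits counter are all hypothetical names made up for illustration; this is not how any real CDN is implemented, just the fetch-on-miss and invalidate pattern described above.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical model of one CDN node: content is pulled from the origin
// the first time it is requested and served from memory after that.
public class OriginPullCache
{
    private readonly Func<string, byte[]> _fetchFromOrigin;
    private readonly Dictionary<string, byte[]> _cache = new Dictionary<string, byte[]>();

    // Counts how many times the origin server was actually contacted.
    public int OriginHits { get; private set; }

    public OriginPullCache(Func<string, byte[]> fetchFromOrigin)
    {
        _fetchFromOrigin = fetchFromOrigin;
    }

    public byte[] Get(string path)
    {
        byte[] content;
        if (!_cache.TryGetValue(path, out content))
        {
            // Cache miss: pull the file from the origin and remember it.
            content = _fetchFromOrigin(path);
            OriginHits++;
            _cache[path] = content;
        }
        return content;
    }

    // Equivalent of the "invalidate the cache" API call after a deployment.
    public void Invalidate()
    {
        _cache.Clear();
    }
}
```

Requesting /images/logo.png twice only touches the origin once; after Invalidate(), the next request pulls the file again, which is exactly why a deployment only needs one invalidation call.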

CloudFiles also doesn’t support folder hierarchies.  I find this completely ridiculous, especially considering how I have a “RecipeImages” folder with 60,000 files in it.  Their web interface basically chokes, and graphical front-ends to CloudFiles, such as CyberDuck, also are too slow to be usable.  I have various folders on my web server, such as /scripts, /styles, /images, etc to organize my content.  I’m absolutely not about to redo all my HTML to put every file in the same directory, especially since I use various plugins, such as TinyMCE, that require static resources to be laid out in specific locations.

CloudFiles actually allows you to hack around this limitation, as it supports the forward-slash character in an object name, but then you’d be writing some sort of publishing mechanism that flattens out your folder hierarchy, comes up with all the right object names, and deploys files using their proprietary REST API.  I had time for none of this.  Hopefully some day RackSpace will catch up with the rest of the world and provide a real CDN solution, but for now, they were out of the running.

GoGrid

Next, I decided to give GoGrid a try.  They came recommended to me, and were quite high up on a lot of the performance comparisons I looked at.  I mentioned earlier that EdgeCast has various resellers.  GoGrid is one of these.

GoGrid not only sells CDN services, but also provides complete cloud hosting and dedicated hosting.  Unfortunately, you have to sign up for their cloud hosting account before you can provision a CDN account, which must be done by emailing customer support and waiting an hour.  Signing up for this account proved difficult.  For starters, my credit card was denied due to a vague error.  After calling the bank, I was told the error was on GoGrid’s side.  Luckily, their customer support was incredibly helpful and stayed in chat with me for quite some time.  I was eventually told I should try a debit card, and I could switch over to a credit card once the account was set up.  However, at this time, their servers started misbehaving.  I would start typing in my billing information, and after about 10 seconds, the page would refresh and tell me the session had expired.  I tried on two different computers and three different web browsers, all with the same issue.  The customer support rep had no idea what the problem was.  Eventually, I connected to a remote server on Rackspace’s network and for some reason that computer worked.  It took me about two hours just to sign up for GoGrid.  The customer service rep was great, and even gave me a $100 credit on my account.

When I had the account set up, it didn’t take me too long to figure out how to set up the CDN.  Their interface is not the greatest, and changes (such as defining a CNAME) take a while to propagate, but it’s definitely usable.  Very quickly, I found a huge limitation in their design.

Basically, a CNAME (such as cdn.kitchenpc.com) can only point to a single root origin, such as http://www.kitchenpc.com/images.  I didn’t want to have to define cdn-images.kitchenpc.com, cdn-scripts.kitchenpc.com, etc.  Now, I can point cdn.kitchenpc.com to http://www.kitchenpc.com directly, but then any resource on my server can be pulled through the CDN, including dynamic content such as the home page.  Now, probably this doesn’t really matter that much.  In fact, it doesn’t at all.  However, I’m just OCD about how my servers are set up, and decided I didn’t like it.  I did try emailing customer support, and they had no idea what I was trying to do.  I decided it would take more time to explain it to them than it was worth, so I decided I’d had enough of GoGrid.

Amazon CloudFront

The last CDN I tried was Amazon CloudFront.  Amazon is, of course, a huge player in the startup world, hosting countless sites.  They have a huge CDN spanning the globe, and very competitive prices.  One nice thing about Amazon is I already do a lot of business with them and already have an account, so they already had all my billing information and everything.  Setting up a CloudFront account took about 30 seconds.

Their web interface is also superior and very easy to use.  Defining a CNAME was a snap, every change I made was instant, and everything just worked on the first try.  Amazon also has the same limitation; a CNAME can only point to a single root resource on the origin.  However, Amazon provides a feature called Behaviors.  A Behavior can define exactly what URLs are allowed and not allowed.  Using this feature, I was able to define behaviors to allow only requests to /scripts/*, /images/* and /styles/*.  Problem solved.

Amazon does have a bit of a downside.  Their pricing is quite convoluted compared to GoGrid, who just charges a flat rate per gigabyte.  Amazon has various prices depending on where you want data cached, how many times you call their APIs, how many times you invalidate the cache (though your first 1,000 are free), etc.  I came to the conclusion that though their pricing was not as straightforward, I just wasn’t dealing with enough data to care.  I expect my usage to be under $5/month.

So…

So in the end, I ended up with Amazon CloudFront and am quite happy with them.  Modifying my site HTML was also quite easy.  Since I use XML templates for all my pages, I was able to add some code into the pre-processor to rewrite HTML such as:

<img cdn.src="/images/logo.png" />

to:

<img src="http://cdn.kitchenpc.com/images/logo.png" />

And then be able to define the CDN URL prefix within the web.config file.  This lets me run my site locally on my dev box without hitting the CDN, and use the CDN in the production environment.  I then had to change a bunch of HTML, however this was pretty easy with Visual Studio’s “Find and Replace” tools.  Within about an hour, I was completely up and running on Amazon CloudFront, with the site running quite nicely.
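
As a rough illustration of that rewrite step, here is a minimal sketch.  It assumes a regex-based approach and a hypothetical CdnRewriter class; the real pre-processor works on XML templates and may look nothing like this.  The cdnPrefix parameter stands in for whatever value is read from web.config.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper: turns cdn.src="/x" (or cdn.href, etc.) into
// src="<prefix>/x", where the prefix comes from configuration.
public static class CdnRewriter
{
    // Matches attributes such as cdn.src="/images/logo.png" or cdn.href='/styles/site.css'.
    private static readonly Regex CdnAttribute = new Regex(
        @"\bcdn\.(?<attr>[A-Za-z]+)\s*=\s*(?<quote>[""'])(?<path>/[^""']*)\k<quote>",
        RegexOptions.Compiled);

    // cdnPrefix is assumed to be something like "http://cdn.kitchenpc.com"
    // in production and an empty string on a dev box.
    public static string Rewrite(string html, string cdnPrefix)
    {
        string prefix = (cdnPrefix ?? string.Empty).TrimEnd('/');

        return CdnAttribute.Replace(html, m =>
            m.Groups["attr"].Value + "=" +
            m.Groups["quote"].Value +
            prefix + m.Groups["path"].Value +
            m.Groups["quote"].Value);
    }
}
```

With an empty prefix the attribute simply becomes a normal relative src, which is what lets the same templates run locally without touching the CDN.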

For startups, I’d definitely recommend using a CDN.  It’s very cheap (the bandwidth is probably cheaper than whatever you’re paying the server hosting company), provides fail-over and redundancy, takes stress off your web servers, and speeds up a lot of requests in other countries.  Also, since the domain name is different, you won’t be sending cookies over every HTTP request, which can speed up things as well.  Win-win.

Mmmm cookies (Technical Debt Part 2/3)

During the last month, I’ve been working on paying off some of the technical debt that has been collecting interest over the course of KitchenPC’s development.  Most of these have been stability changes, or just things that, when fixed, remove roadblocks to a smoother development process.  I’d like to go over these in a three-part blog series that, like the Hitchhiker’s Guide to the Galaxy trilogy, will consist of five parts.

Part 2/3: Mmmm cookies

The authorization and session management code in KitchenPC is among the oldest and most cobweb infested in the entire code base.  The design is fairly straight-forward, and designed to be completely stateless so that any HTTP request could be handled by any server at any time.  Here’s how it works; I keep track of each user using three UUIDs in the database.  The first UUID is the user’s public ID.  This ID can be shared with anyone, and was actually used in several URLs on the old version of the site, which allowed users to subscribe to other users and see their public profile.  This is also the primary key for the Users table in the database.  The second UUID is called the Security ID.  This was also stored in the database, but never surfaced in any URL or HTML.  It wouldn’t be possible to see the Security ID of any other user besides yourself.  The third UUID is called the Session ID.  This UUID changes every time you log on to the site using the Logon form or the Facebook connect mechanism.

After the user is authenticated, all three of these UUIDs are joined into a single byte array and then Base64 encoded into a cookie.  This auth token is passed into any web service call as a way of representing the currently logged in user, and those three UUIDs are then de-serialized and validated against the database.  In order to hack the database, you’d need to know not only the user’s ID and security ID, but also their current valid session ID which changes on every logon.  It may not be industry grade security, but it’s fairly secure and works well.

The one problem with this approach is sessions must be maintained in the database.  To make matters worse, my design only supported a single session per user, as there was just one Session UUID column in the users table.  When users would log on with another browser, that browser would have their currently valid session and the first browser would then show a “Session expired” error message.  This was sometimes not trapped, and weird “Unknown Error” messages would pop up.  Most people just use a single computer at a time, so this wasn’t a huge deal.  However, the design just wasn’t going to cut it for mobile and tablet apps, where I want multiple users on multiple devices to use the same KitchenPC account.  A redesign was in order.

One approach would be to extend the current design.  I could create a sessions table, where a single user could have zero or more sessions.  Each time a user logged on, they would create a new session and have a valid cookie pointing to that session UUID.  I believe systems such as GMail work this way, as they allow you to list your currently active sessions.  The main drawback, in my opinion, was that it becomes messy to clean up after stale sessions.  Ultimately, you’d need some sort of cleanup process that ran every so often and deleted abandoned sessions.  You’d also have to keep timestamps on each session to keep it active, which would mean database updates on every web service call.  This could lead to scalability problems later on.

Instead, I took an approach modeled after the built in ASP.NET FormsAuthentication class, which runs as an HttpModule to validate each request.  Unlike my current implementation, the server does not keep track of valid sessions at all.  Everything needed to validate the authenticity of the session is stored within the cookie itself.  For example, a cookie might contain the string “Mike” as well as a UUID representing my user ID.  Upon each request, the server would say, “Yup that looks like a KitchenPC user account to me” and the request would be serviced under the credentials of that user account.

There’s one major problem with this design.  What if the cookie is forged?  In other words, what if you changed the cookie to “Bob” and found out Bob’s user ID?

To prevent this, cookies are encrypted with a secret key known only to the server.  It would now be difficult, if not impossible, to purposely change the cookie to represent another valid user.  Luckily, the .NET Framework has all the tools necessary to encrypt your cookies.  My new implementation uses a Triple DEA block cipher, which is a pretty good algorithm for this sort of encryption.  First, you’d want to serialize your session data to a byte array, which can be done manually or using the BinaryFormatter class.  We’ll call that byte array rawBytes.  Next, you’d encrypt the data to a new byte array:

TripleDES des = TripleDES.Create();
des.Key = key;
des.IV = iv;

MemoryStream encryptionStream = new MemoryStream();
CryptoStream encrypt = new CryptoStream(encryptionStream, des.CreateEncryptor(), CryptoStreamMode.Write);

encrypt.Write(rawBytes, 0, rawBytes.Length);
encrypt.FlushFinalBlock();
encrypt.Close();
byte[] encBytes = encryptionStream.ToArray();

In the code above, we create a new instance of the TripleDES class. The Key property is set to a byte array holding your secret key, and the IV property to a byte array holding the initialization vector. Both values need to be the same when you decrypt the cookie, so it would be wise to store them in a configuration file somewhere.

We then create a CryptoStream, and write the rawBytes array containing your unencrypted session information. We can then call encryptionStream.ToArray() to get the encrypted bytes out. You’d then serialize this out (perhaps as a Base64 string) to a cookie.

There’s one small problem with this. Though it would be nearly impossible to forge a valid cookie, one could still tamper with the cookie bytes and cause who knows what havoc. For example, if my encrypted bytes were [1, 2, 3, 4, 5] and I changed that to [1, 2, 3, 4, 6], this could still decrypt, just not to what I had originally encrypted. This is because encryption algorithms don’t self-validate. It’s not like in the movies where if you crack the key, you get a big “Access Granted!” flashing message on the screen. This is actually good, as you don’t actually know when or if you’ve successfully decrypted the message; you just know that some bytes went in, some bytes came out.

However, we want to make our sessions tamper resistant. If someone changes even a single byte in the cookie, I want to just toss that cookie out and ignore it. To do this, you’d store a keyed hash (an HMAC) of the unencrypted bytes along with the encrypted data. HMAC with SHA-256 is a great choice for that purpose.

HMACSHA256 hmac = new HMACSHA256(valKey);
byte[] hash = hmac.ComputeHash(rawBytes);

valKey is an array of bytes, usually 64 bytes long, that would also be kept securely on your server. So long as you used the same valKey each time, the bytes you pass in to ComputeHash would hash the same each time.

So, now we’ve encrypted rawBytes to a byte array called encBytes, as well as stored a hash of the original data in a byte array called hash. We’d now want to combine them into a single byte array as such:

byte[] ret = encBytes.Concat<byte>(hash).ToArray();

You now have a single byte array that you can serialize to a Base64 string and store in a cookie. Users would pretty much have no way to change the data without generating a new hash, which they could not do since they don’t have your hash key (valKey).

To decrypt the cookie, you’d do the same thing in reverse:

private static byte[] DecryptCookie(byte[] encBytes)
{
   TripleDES des = TripleDES.Create();
   des.Key = key;
   des.IV = iv;

   HMACSHA256 hmac = new HMACSHA256(valKey);
   int valSize = hmac.HashSize / 8;
   int msgLength = encBytes.Length - valSize;
   byte[] message = new byte[msgLength];
   byte[] valBytes = new byte[valSize];
   Buffer.BlockCopy(encBytes, 0, message, 0, msgLength);
   Buffer.BlockCopy(encBytes, msgLength, valBytes, 0, valSize);

   MemoryStream decryptionStreamBacking = new MemoryStream();
   CryptoStream decrypt = new CryptoStream(decryptionStreamBacking, des.CreateDecryptor(), CryptoStreamMode.Write);
   decrypt.Write(message, 0, msgLength);
   decrypt.FlushFinalBlock();
   decrypt.Close();

   byte[] decMessage = decryptionStreamBacking.ToArray();

   // Verify the hash matches
   byte[] hash = hmac.ComputeHash(decMessage);
   if (valBytes.SequenceEqual(hash))
   {
      return decMessage;
   }

   throw new SecurityException("Auth Cookie appears to have been tampered with!");
}

First, we split the byte array into its two components: the encrypted bytes and the 32-byte hash that was appended to them.

Once again we set up a TripleDES instance, only this time we decrypt the data to the stream. You’ll notice I use hmac.HashSize / 8 instead of hard coding 32, since the hash length depends on which HMAC algorithm you pick and I wanted my code to be flexible. I then have an array called message, which contains the TripleDES encrypted session data, and valBytes, which contains the validation bytes from the hash. I then decrypt message to a byte array called decMessage, and then compute the hash on that unencrypted message. If it matches valBytes, then we know the cookie is legit.
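
To see the whole thing end to end, here is a minimal round-trip sketch that follows the same encrypt-then-append-hash layout.  The CookieProtector class and its method names are hypothetical, and the keys are generated randomly at startup purely for the demo; a real site would load fixed values from configuration so that every server and every restart agrees on them.  Unlike the method above, this version returns false instead of throwing, and it also treats a decryption failure (bad padding or length) as tampering.

```csharp
using System;
using System.Linq;
using System.Security.Cryptography;
using System.Text;

public static class CookieProtector
{
    // Demo-only key material (an assumption): random per process.
    private static readonly TripleDES KeySource = TripleDES.Create();
    private static readonly byte[] Key = KeySource.Key;
    private static readonly byte[] IV = KeySource.IV;
    private static readonly byte[] ValKey = RandomBytes(64);

    // Encrypts rawBytes, appends an HMAC of the plaintext, returns Base64.
    public static string Protect(byte[] rawBytes)
    {
        byte[] encBytes = Transform(rawBytes, true);
        using (var hmac = new HMACSHA256(ValKey))
        {
            byte[] hash = hmac.ComputeHash(rawBytes);
            return Convert.ToBase64String(encBytes.Concat(hash).ToArray());
        }
    }

    // Returns true and the original bytes only if the cookie is intact.
    public static bool TryUnprotect(string cookie, out byte[] rawBytes)
    {
        rawBytes = null;
        try
        {
            byte[] all = Convert.FromBase64String(cookie);
            using (var hmac = new HMACSHA256(ValKey))
            {
                int valSize = hmac.HashSize / 8; // 32 bytes for HMAC-SHA256
                if (all.Length <= valSize) return false;

                byte[] message = all.Take(all.Length - valSize).ToArray();
                byte[] valBytes = all.Skip(all.Length - valSize).ToArray();

                byte[] decMessage = Transform(message, false);
                if (!valBytes.SequenceEqual(hmac.ComputeHash(decMessage))) return false;

                rawBytes = decMessage;
                return true;
            }
        }
        catch (FormatException) { return false; }        // not valid Base64
        catch (CryptographicException) { return false; } // bad padding or length
    }

    private static byte[] Transform(byte[] input, bool encrypt)
    {
        using (TripleDES des = TripleDES.Create())
        {
            des.Key = Key;
            des.IV = IV;
            using (ICryptoTransform t = encrypt ? des.CreateEncryptor() : des.CreateDecryptor())
            {
                return t.TransformFinalBlock(input, 0, input.Length);
            }
        }
    }

    private static byte[] RandomBytes(int count)
    {
        byte[] buffer = new byte[count];
        using (RandomNumberGenerator rng = RandomNumberGenerator.Create())
        {
            rng.GetBytes(buffer);
        }
        return buffer;
    }
}
```

Calling Protect on some session bytes and feeding the result to TryUnprotect gives the original bytes back; flip a single bit in the cookie first and TryUnprotect returns false.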

Why Re-invent the Wheel?

Internally, this is very similar to what FormsAuthentication does.  So why did I create my own implementation?  First off, because I wanted to.  I actually enjoy implementing this sort of thing.  But mainly because I didn’t find the built-in support very flexible.  The ASP.NET mechanism only allows you to store a username as a string, and no other information on the user.  I wanted to be able to store the user’s UUID, as well as their name and perhaps a few other pieces of information.  I wanted to serialize this as my own datatype, allowing me to extract user data without having to hit the database for more information.  It also made the transition between my old code and the new code a lot easier, without the need to re-factor a lot of code.  The old cookie could be deserialized to a data type that was used in various methods, constructors, etc in the code base.  I wanted the new code to work in a similar way.

After doing quite a bit of testing with the new auth code, I must say I’m quite pleased.  Sessions last forever, and users don’t get random “Session Expired” popups when they log on with another computer.  The code is also a lot faster.  I don’t need to check the authenticity of a cookie in the database each page load or web method call.  I can decrypt the data and validate the hash to make sure the information within the cookie is valid.  I can now be logged on to KitchenPC on all my computers and not worry about a thing!

The one drawback of this approach is these cookies are fairly long.  Mine is about 700 bytes or so.  This means another 700 bytes is passed in to each HTTP request, which can hurt performance.  This can be optimized by minimizing the data you’re storing in the encrypted package, and also using shorter (albeit less secure) keys and hashes.  I’m also thinking I could do my own serialization rather than using the BinaryFormatter class, which embeds things such as type data.

In the end, I implemented all this code as an IHttpModule which will run on each request, setting the value of HttpContext.Current.User if the cookie is valid, which is a fairly good approach.  You would then just add this class in the <httpModules> section of your web.config.  I was planning on sharing the entire code, but I’m afraid this post is just getting too long.  I’d be happy to share it with anyone who asks though!  Just let me know.  That’s it for now!
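
For reference, registering a module in web.config looks roughly like this.  The module name and the type and assembly names below are hypothetical placeholders, not KitchenPC’s actual class names.  Also note this is the classic-pipeline form; on IIS 7 and later in integrated pipeline mode, the registration goes under <system.webServer><modules> instead.

```xml
<configuration>
  <system.web>
    <httpModules>
      <!-- "KitchenPC.Web.AuthModule, KitchenPC.Web" is a hypothetical type and assembly name -->
      <add name="AuthModule" type="KitchenPC.Web.AuthModule, KitchenPC.Web" />
    </httpModules>
  </system.web>
</configuration>
```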

Optimizing the Application Loading Process (Technical Debt Part 1/3)

During the last month, I’ve been working on paying off some of the technical debt that has been collecting interest over the course of KitchenPC’s development.  Most of these have been stability changes, or just things that, when fixed, remove roadblocks to a smoother development process.  I’d like to go over these in a three-part blog series that, like the Hitchhiker’s Guide to the Galaxy trilogy, will consist of five parts.

Part 1/3: Optimizing the Application Loading Process

KitchenPC is implemented as a web application that runs within an IIS worker process.  Everything from “What Can I Make” to recipe searches to NLP queries is run within this context, and nothing much is handled out of proc.  There is one exception, email queuing, which runs as a Windows service.  However, pretty much everything else happens under IIS.  For this reason, there is a rather lengthy load time when the app initially boots up, as it has to load every ingredient and recipe from the database, compile the data, build in-memory graphs, build search trees, and initialize other data on the heap which makes the user experience lightning fast later on.

During the beta, this was pretty fast since I only had about 10,000 recipes and no NLP dictionaries.  App startup time was about 2-3 seconds.  However, I now possess a collection of recipes nearly six times larger, plus a ton of grammatical data to make natural language queries possible.  The application start time, even on a fast computer, is now somewhere around 10-15 seconds.

This presented two problems.  First, I’m impatient.  Working long hours on KitchenPC requires many, many rebuilds.  Each time I fired up the application in Visual Studio, I had to twiddle my thumbs while all the data was loaded into memory.  Oftentimes, I was changing things that had nothing to do with this data, and didn’t even need to use these features.  Second, there were many bugs where if requests came in while the app was loading, there’d be random dictionary collisions and other problems that would cause exceptions to be thrown, and sometimes the app initialization to fail.  This was mostly due to poor code that didn’t lock objects and make them thread safe.  As site traffic picked up, this created a lot of friction deploying new changes into production.  My goal was to quickly be able to deploy changes, and not have a single HTTP request error out.

Multi-threaded Application Loading

My solution to this problem was to make the Application_Start code as simple as possible, and simply spawn a new thread to handle the heavy lifting.  So now, the application start looks something like this:

public class Global : System.Web.HttpApplication
{
   void Application_Start(object sender, EventArgs e)
   {
      Thread thread = new Thread(LoadData);
      thread.Start();
   }

   private void LoadData()
   {
      // I can haz data
   }
}

The first time a user goes to the site, the LoadData method is called on a new thread.  The user will immediately see the home page, even though KitchenPC is actually not fully initialized.  Of course, this means that certain functionality on the site will not be available for about 10-15 seconds, which could be a problem if the user immediately uses something like “Ingredients to Exclude” (which relies on NLP), selects a recipe within the search results (which relies on recipe aggregation code), or clicks on “What Can I Make?” (which relies on the modeling engine).  All of these features depend on data that’s loaded into static memory on initialization.

To solve this, I put write locks on this data as it’s being loaded.  The LoadData() method, cute as the lolcatz reference is, actually looks more like this:

private static ReaderWriterLockSlim _cacheLock = new ReaderWriterLockSlim();

public static void LoadData()
{
   _cacheLock.EnterWriteLock();
   try
   {
      // Load data from the database
   }
   finally
   {
      _cacheLock.ExitWriteLock();
   }
}

Then, any code that needs to reference any of this data enters a read lock:

public static string ReadData(Guid key)
{
   _cacheLock.EnterReadLock();
   try
   {
      // Lookup key in data and return value
   }
   finally
   {
      _cacheLock.ExitReadLock();
   }
}

From the user’s point of view, if they were to use one of these features while the app was still loading, they would get a ten second pause or so until the initialization code finished and the data were fully available.  Remember, this would only be a problem for the first few seconds after the IIS app pool was reset (such as when new binaries were sent out to the server, or a configuration change was made), which usually happens at night when traffic is low.

Another advantage of this approach is I can also reload data without reloading the entire app.  There are also administrative web services that can call LoadData() again, re-acquiring that write lock, to refresh the data in memory.  This is done when we change NLP data, push new recipes into the site, or change certain ingredient metadata.  Basically, certain site features can temporarily be halted or taken offline as data is being refreshed.
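
Putting the load, read and reload pieces together, a minimal self-contained sketch might look like the following.  The DataCache class, the Dictionary<Guid, string> data shape and the loader delegate are all assumptions made for illustration; the loader just stands in for the real database queries.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

public static class DataCache
{
    private static readonly ReaderWriterLockSlim _cacheLock = new ReaderWriterLockSlim();
    private static Dictionary<Guid, string> _data = new Dictionary<Guid, string>();

    // 'loader' is a hypothetical hook standing in for the database queries.
    // The write lock is held for the whole load, so readers wait until it finishes.
    public static void LoadData(Func<Dictionary<Guid, string>> loader)
    {
        _cacheLock.EnterWriteLock();
        try
        {
            _data = loader();
        }
        finally
        {
            _cacheLock.ExitWriteLock();
        }
    }

    // Any number of readers can hold the read lock at once,
    // but none can get in while a load or reload is running.
    public static string ReadData(Guid key)
    {
        _cacheLock.EnterReadLock();
        try
        {
            string value;
            return _data.TryGetValue(key, out value) ? value : null;
        }
        finally
        {
            _cacheLock.ExitReadLock();
        }
    }
}
```

An administrative reload is then just another call to LoadData with fresh data; requests that arrive mid-reload pause on EnterReadLock rather than seeing half-built state.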

This makes minor site config changes and site updates much smoother, and everything is now done in a much more thread-safe manner without weird random exceptions cropping up all over the place.

Not all components support this!

One gotcha with this design is that not all ASP.NET components support being run outside the request thread.  For example, anything that relies on HttpContext.Current having a value will throw a NullReferenceException when run in a newly spawned thread, as HttpContext.Current is tied to the thread that is handling the HTTP request.  If you’re no longer running in that thread, this value will all of a sudden be null.
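
The effect is easy to reproduce without ASP.NET at all.  The sketch below uses a [ThreadStatic] field as a stand-in for HttpContext.Current; that is only an analogy (the real property is not implemented this way), and AmbientContext and ThreadScopeDemo are hypothetical names.  It shows that a value set on one thread is simply missing on a newly spawned one, and that capturing it first and handing it over explicitly is one way around that.

```csharp
using System;
using System.Threading;

public static class AmbientContext
{
    // Stand-in for HttpContext.Current: each thread sees its own copy.
    [ThreadStatic]
    private static string _current;

    public static string Current
    {
        get { return _current; }
        set { _current = value; }
    }
}

public static class ThreadScopeDemo
{
    // Returns what a freshly spawned thread sees in AmbientContext.Current.
    public static string ReadOnNewThread()
    {
        string seen = "unset";
        var worker = new Thread(() => { seen = AmbientContext.Current; });
        worker.Start();
        worker.Join();
        return seen;
    }

    // Captures the value on the calling thread and hands it to the worker explicitly.
    public static string ReadCapturedOnNewThread()
    {
        string captured = AmbientContext.Current;
        string seen = "unset";
        var worker = new Thread(() => { seen = captured; });
        worker.Start();
        worker.Join();
        return seen;
    }
}
```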

Castle ActiveRecord, which is the ORM that KitchenPC is built on top of, uses the HttpContext as a key to figure out which database connection and transaction to use.

Luckily, ActiveRecord ships with a class called HybridWebThreadScopeInfo which will solve this problem.  This IWebThreadScopeInfo implementation will detect if HttpContext.Current is null, in which case it will initialize a new database context just as if it were a new request.

If you take a look at the code, you’ll see:

HttpContext current = HttpContext.Current;

if (current == null)
{
   if (stack == null)
   {
      stack = new Stack();
   }

   return stack;
}

Whereas the normal WebThreadScopeInfo class will throw an exception if current is null.

You can easily configure this in web.config with one line:

<activerecord isWeb="true" threadinfotype="Castle.ActiveRecord.Framework.Scopes.HybridWebThreadScopeInfo, Castle.ActiveRecord.Web">

That’s all there is to it!

Hope you’ve enjoyed this little technical insight into the inner workings of KitchenPC, and hopefully some of these techniques will help you come up with a better design for your app.  Stay tuned for part 2, which will be equally technical and nerdy!
