
We all want to be normal[ized]

June 26, 2010

As it turns out, accurately tallying up the ingredients you need to buy at the store is a rather difficult problem to solve.  Upon looking through other recipe websites (which I’ve been doing a lot of recently), it became apparent that hardly anyone out there has put much effort into solving it.  Here are some sites that have at least tried:


Epicurious shopping list aggregation

Epicurious is a fantastic site, with the option (for signed-in members, anyway) to create shopping lists out of recipes they like.  However, the site doesn’t even attempt to combine anything.  The default view for your shopping list is grouped by recipe, which simply lists all the ingredients and amounts underneath.  The user can also group by shopping aisle, but still no aggregation of common ingredients takes place.  Even a seemingly easy conversion between cups and teaspoons results in duplicate ingredients being listed, as seen in the screen shot above.  I won’t even bother trying to see how they would combine incompatible forms, such as shredded cheddar cheese in cups and cheddar cheese by weight.  The user is probably best off just jotting down ingredients on a piece of scratch paper and heading off to the store.  Someone must do better than this, right?


RecipeZaar's attempt at basic ingredient aggregation

RecipeZaar is one of my favorites.  They’re owned by Scripps Networks, who has successfully vampired countless hours of my life with their television networks such as HGTV, Food Network and The Cooking Channel.  You’d think these guys would be the masters of recipe websites.  It does seem, to me anyway, they made somewhat of an effort on their shopping list feature.  First off, they recognize when an ingredient is common among two recipes.  That’s a start!  However, once I start adding multiple recipes, I start to get very confused.

I started with a puff pancake recipe that calls for 4 tablespoons of butter, 3 eggs, 6 egg whites, 1 cup of milk, a cup of flour and a half teaspoon of salt.  Let’s add that to our shopping list.  Done.  Next, I dug up a second pancake recipe (I’m a man who likes his pancakes, as some out there can attest).  This recipe calls for 1 egg, 3/4 cup of milk, 2 tablespoons of melted margarine, a cup of flour, a tablespoon of sugar, 3 tablespoons of baking powder, and another half teaspoon of salt.  Added, done!  What do we have?

Once again, the shopping list is organized into recipes.  There’s a column for which recipe the ingredient goes with; in my case the latter pancake recipe is “A” and the puff pancake recipe is “B”.  They get a few points for at least recognizing when a common ingredient exists within multiple recipes.  However, even on the simplest form of aggregation, they fail to deliver.  For example, milk appears in both recipes.  On my shopping list, the line for milk appears as “milk (3/4 cup + 1 cup)”.  I could see this getting very confusing if I were to, say, add 50 recipes to my list, all with milk.  Another thing is that eggs appear for both recipes (I need to buy 1 + 3 eggs, thanks!) and “egg whites”, something you extract from an egg, appears on its own line for recipe B.  In other words, something you do to an egg appears on a shopping list that is intended to describe products you should purchase at a store.  The same error happens with margarine, as the list instructs you to purchase 2 tablespoons of melted margarine.  Sorry, my local Safeway doesn’t sell melted margarine.  That’s what my microwave is for.

AllRecipes combines common flour, cheese, etc.

AllRecipes is the best of the best.  They’ve been around forever, have tens of millions of users, and are probably the best-known recipe destination on the Internet.  Surely they won’t disappoint.  I started out looking for recipes that use cheddar cheese.  The first one I came across is “Creamy Cheddar Cheese Soup”, mmm.  This recipe calls for 1/4 cup butter, an onion, 1/4 cup flour, 3 cups of chicken broth, 3 cups of milk and a pound of shredded cheddar cheese.  Sounds good so far!  I’ve added this to my shopping list and decided to see what else I can do with the 18 crates of cheddar cheese willed to me by my grandma Ester, who recently died in a freak dairy accident.  Cheddar cheese puffs it is!  This recipe stipulates a cup of shredded cheddar cheese, half a cup of flour, 1/4 cup of butter or margarine, and half a teaspoon of ground mustard.  Let’s toss that into our list as well.

The results are very encouraging!  First, I notice the flour has been combined into a total of 3/4 cup!  This is progress.  Even more impressive is the fact that a pound of shredded cheese and a cup of shredded cheese were combined into 1-1/4 pounds of cheese.  AllRecipes seems to get it, but I wonder how they do this magic?

AllRecipes tacks user entered data to the bottom

One clue comes when you add your own recipe to the site.  I added a “test recipe” that only I could see, which calls for one cup of milk.  I saved the recipe and added it to my shopping list.  Here, we got no ingredient aggregation of any sort.  My ingredients were simply tacked on to the bottom, completely unsorted.  This leads me to believe that AllRecipes doesn’t actually have any native normalization of ingredients or forms behind the scenes.  I’m willing to venture a guess that each individual “approved” recipe contains certain metadata which describe what ingredients one needs to purchase at the store.  These metadata can be combined with other recipe metadata to come up with totals.  Further evidence of this comes when I manually add an individual ingredient to my shopping list.  Here, I can actually type in anything I want, including “tennis shoes”, and it appears on my list.  In other words, I have no way to add my own ingredient and have it count toward a previously aggregated total.  I could spend more time to really prove this hypothesis (for example, does a cup of shredded cheese always weigh a quarter of a pound in AllRecipes-land?), but I think I’ve seen enough to understand their basic design.  Which I will now tear apart in great detail.

Why this doesn’t work

These guys are trying, they really are.  Their customers obviously emailed en masse with “We want shopping lists!” requests.  They complied.  They cobbled together a shopping list feature that basically gave the user the ability to do what they could already do on paper, or on the magnetic whiteboard attached to the kitchen fridge.  The root of the problem is they were hacking this feature onto a database design that simply wasn’t built to generate accurate shopping lists.

When designing KitchenPC, this challenge was one that intrigued me and I was determined to provide accurate shopping lists for users, allow them to modify their list and add ingredients, and have it work with any recipe in the database.  The root of the problem is in ingredient organization.  For sites like AllRecipes, Epicurious and RecipeZaar, a recipe is just a list of ingredients.  Some might just store a single string in the database that contains an ingredient and amount, such as “1 cup all-purpose flour.”  Each recipe would have one or more of these usages.  The problem is, another recipe might call for “2 cups all purpose flour.”  Is “all-purpose flour” the same as “all purpose flour?”  Sure it is!  The only difference is how the author typed it in.  However, to a computer it’s difficult to tell that these are in fact the same ingredient, and unless the strings match up perfectly, it may assume otherwise.

For this reason, ingredient normalization is the first step to the perfect shopping list feature.  We need to have a single “datum” in the database called “all-purpose flour” and have other recipes simply point to it.  This is extremely powerful in various ways, such as providing the ability to look up all recipes that call for “flour” very easily, providing filtering abilities for ingredients the user wishes to avoid, etc.  I’ll go out on a limb and say that the three recipe sites above at least partially employ the concept of ingredient normalization in their database.

However, the second step to providing a working shopping list feature is to recognize the difference in “ingredient forms.”  When you purchase an ingredient at the store, you usually purchase it in a certain form (almost always sold by weight) and then “do something” to it to prepare it for use in your recipe.  You may chop an onion, or melt some cheese, or slice an apple.  The problem is that recipes often refer to an ingredient in its prepared form, with a unit that makes sense only in that context.  Your local grocery store will sell you cheese by the pound, but your recipe might call for a half cup of shredded cheese.  This is because the recipe assumes you’ll buy some cheese, shred it, and then fill a half cup measuring device.  To summarize, the accurate formation of a shopping list necessitates the ability to convert between the form one uses an ingredient in, and the form one potentially purchases that ingredient in.
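To make the string-matching problem concrete, here is a minimal sketch of name normalization in Python.  It is only an illustration, not KitchenPC’s actual code: the `CANONICAL` table, its ids, and both function names are made up for this example, and a real system would need far more than punctuation cleanup (synonyms, plurals, brand names).

```python
import re

# Hypothetical canonical ingredient table: normalized key -> ingredient id.
# The ids and entries are illustrative assumptions only.
CANONICAL = {
    "all purpose flour": 1,
    "cheddar cheese": 2,
}

def normalize_name(raw: str) -> str:
    """Lowercase the text and turn any run of non-alphanumerics into one space."""
    return re.sub(r"[^a-z0-9]+", " ", raw.lower()).strip()

def lookup_ingredient(raw: str):
    """Return the canonical ingredient id, or None if the name is unknown."""
    return CANONICAL.get(normalize_name(raw))
```

With this in place, “all-purpose flour” and “All Purpose  Flour” both resolve to the same record, so recipes can point at one ingredient instead of each carrying its own string.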

The way KitchenPC does this is by managing a finite list of ingredients in their “shopping list” form.  This list is similar to an inventory you’d find at your local grocer, and it’s this database that is used for generating a shopping list.  KitchenPC then has a separate table of forms for these ingredients.  These forms describe a way this ingredient could potentially be used in a recipe.  As this is a one-to-many relationship, a single “shopping list” ingredient might have several forms.  An example of a form of cheddar cheese would be “shredded”, and it would indicate this is a volumetric form that could be measured in cups, tablespoons, gallons, whatever.  There also exists “conversion metadata” which ensure any form of any ingredient can be converted to its parent shopping list form.  If a recipe calls for a cup of shredded cheddar cheese, the system is able to identify the relationship between this usage in a recipe and how cheese is sold in stores.  In other words, it knows how much a cup of shredded cheddar cheese weighs and thus how much you need to purchase at the store to make this recipe.  It doesn’t matter if the ingredient was added manually, through the addition of a recipe to your shopping list, or through the meal calendar.  Furthermore, since ingredients are kept in a normalized manner, units can be abstracted from the ingredient usage and determined “on the fly” as appropriate.  If I combine 4 recipes that each call for 4 tablespoons of olive oil, my shopping list will deem that “cups” is a more appropriate unit for that and express my shopping list as “1 cup of olive oil.”  If I increase the serving size on a cookie recipe to serve my army of 5,000 warriors, the recipe will start expressing itself in pounds and gallons and stuff.
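The form-to-shopping-list conversion and the “on the fly” unit choice can be sketched roughly like this.  Again, this is a simplified illustration under stated assumptions rather than the real KitchenPC schema: the `POUNDS_PER_CUP` table and all function names are invented here, and the quarter-pound-per-cup figure is just what the AllRecipes example above implies, not measured data.

```python
from fractions import Fraction

# US customary volume units, expressed in tablespoons.
TBSP_PER = {"tsp": Fraction(1, 3), "tbsp": Fraction(1), "cup": Fraction(16)}

# Hypothetical "conversion metadata": (ingredient, form) -> pounds per cup.
# 1/4 lb per cup is an illustrative assumption taken from the example above.
POUNDS_PER_CUP = {("cheddar cheese", "shredded"): Fraction(1, 4)}

def to_pounds(ingredient, form, amount, unit):
    """Convert one recipe usage into the by-weight form sold in stores."""
    if unit == "lb":
        return Fraction(amount)
    cups = Fraction(amount) * TBSP_PER[unit] / TBSP_PER["cup"]
    return cups * POUNDS_PER_CUP[(ingredient, form)]

def shopping_total(usages):
    """Sum usages per shopping-list ingredient, in pounds."""
    totals = {}
    for ingredient, form, amount, unit in usages:
        pounds = to_pounds(ingredient, form, amount, unit)
        totals[ingredient] = totals.get(ingredient, 0) + pounds
    return totals

def display_volume(tbsp):
    """Pick the largest volume unit that gives a quantity of at least 1."""
    for unit in ("cup", "tbsp", "tsp"):
        qty = Fraction(tbsp) / TBSP_PER[unit]
        if qty >= 1:
            return qty, unit
    return Fraction(tbsp) / TBSP_PER["tsp"], "tsp"
```

Feeding in a pound of shredded cheddar plus a cup of shredded cheddar gives a single line of 1-1/4 pounds, and `display_volume(4 * 4)` turns four recipes’ worth of 4 tablespoons into 1 cup.  Exact fractions are used instead of floats so that totals like 3/4 cup stay readable.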

But there’s a dark side to this of course as well, and it comes at the cost of usability.  The site will only be as smart as the data it has to work with.  As I require all recipes to adhere to this schema, users must choose an ingredient and form from a pre-existing list.  Right now, I have thousands of ingredients and forms but it’s pretty safe to say I will have missed some.  Users will become very agitated with their inability to add their recipe.  I believe this problem will simply be solved in time.  Eventually every ingredient conceivable by man, woman and chef will be in this database, or at least enough to satisfy 99.9% of recipes.  It’s possible that I could eventually implement some sort of “backup UI” that would allow the user to type in the ingredient as pure text and human intervention would be involved on the backend to add that ingredient and/or form to the database for future use, fix up the links, and then “publish” the working recipe.

In addition, adding a single ingredient requires the user to first look up the ingredient name, then choose a form (shredded, melted, chopped, etc.), and then enter the amount and the unit.  Most recipe websites allow the user to simply type in the whole thing on one line of text.  Supporting that is a very difficult problem to solve; it prevents me (at least for now) from doing stuff like the bulk-import of thousands of recipes, and it makes the recipe editor rather clunky.  I’m already thinking about some “natural language” techniques to parse a description of an ingredient usage and extract the amount, form, ingredient, and any preparation notes implied.
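As a rough idea of what the simplest version of such parsing might look like, here is a toy Python sketch.  It is an assumption-laden illustration, not a design I’ve settled on: the `UNITS` and `FORMS` vocabularies and the `parse_usage` function are hypothetical, and it only handles lines shaped like “amount, unit, form, ingredient” in that order.

```python
import re
from fractions import Fraction

# Hypothetical vocabularies; a real system would draw these from the database.
UNITS = {
    "cup": "cup", "cups": "cup",
    "tablespoon": "tbsp", "tablespoons": "tbsp",
    "teaspoon": "tsp", "teaspoons": "tsp",
    "pound": "lb", "pounds": "lb",
}
FORMS = {"shredded", "melted", "chopped", "sliced"}

# Leading amount: "2", "3/4", or a mixed number such as "1 1/2".
LINE = re.compile(r"^\s*(\d+(?:/\d+)?(?:\s+\d+/\d+)?)\s+(.*)$")

def parse_usage(text):
    """Split a one-line usage into amount, unit, form and ingredient name."""
    match = LINE.match(text.lower())
    if match is None:
        return None  # no leading amount; leave it for a human to sort out
    amount = sum(Fraction(part) for part in match.group(1).split())
    words = match.group(2).split()
    unit = UNITS[words.pop(0)] if words and words[0] in UNITS else None
    if words and words[0] == "of":
        words.pop(0)
    form = words.pop(0) if words and words[0] in FORMS else None
    return {"amount": amount, "unit": unit, "form": form,
            "ingredient": " ".join(words)}
```

For instance, “2 tablespoons melted margarine” comes back as 2 tbsp of margarine in the “melted” form.  Anything that doesn’t fit the pattern (“a pinch of salt”, “one onion, chopped”) falls through, which is exactly why this is a hard problem.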

I tend not to worry about this as much as one might think.  The ability to accurately aggregate ingredient amounts is crucial to my feature set, my revenue model, and my overall business strategy.  Provided I can have enough recipes in the database, I believe adding new recipes will be a relatively seldom-used feature.  If it’s a little bit of a pain, so be it.  But like any beta product, it’ll be my attempt at “building a better mouse trap” and I’m sure I’ll make tweaks here and there depending on customer feedback and pain points.

If you’re thinking to yourself this is a long way to go just for shopping lists, as most people just do what you’re doing in their heads, then you may be right.  However, my broader business vision starts to really take advantage of this concept in ways never before possible.  But this will be for a later blog post.



From → Business, Technical

  1. Carrie permalink

    I do not envy you! :) After spending 3 years trying to corral users into using normalized data, and scrubbing legacy data that was not normalized, it never ceases to shock me how difficult even simple things turn out to be a huge challenge. Looking forward to hearing about the other pieces of your puzzle!

