Relevance, Relevance, Relevance!

In Building 88 on Microsoft’s main campus in Redmond, WA, there is a small but growing group of us who think about relevance constantly. We eat, live and breathe ideas and technology that make our relevance better. Having worked on this release for a little over 9 months, we could not be more excited about the relevance of our new engine. From our metrics, and more importantly from our own usage as customers, our new engine is far superior to our old one. Consider this post a little tour guide, if you will, of our new engine and the things you might notice when you use it.

Improved core relevance
Core relevance is a hard thing to quantify or qualify. I think of it as those real searches you do day in and day out.  When people do fancy demos they use searches like Britney Spears…which is an important search, but doesn’t really reflect some of the tough things the engine sees every day.

I was trying to think of a good example of this and just as I was writing, bam, a bunch of us decided to head to our favorite spot for lunch: the Microsoft taco truck.  I was kind of curious what would come up for Microsoft taco truck.  Sure enough, Live Search has good stuff.  The chowhound.com result in particular has some great commentary which talks about the “secret” location in the VFW parking lot. If you’re ever in the area, I highly recommend it.

This is core relevance.

Reduced spam
Spam is an arms race. A game of cat and mouse. We’re always going to be fighting people who threaten the integrity of our results by using illegitimate or malicious techniques. With this release of Live Search you should find the amount of spam is down quite considerably.

You might ask how we know spam is down. Experts on our team take a “randomly selected and statistically significant” set of searches and measure the percentage of spam in the results. With this release that number is down by a non-trivial margin, and we are excited about that.
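For the curious, here is a minimal sketch of what that kind of measurement could look like. It is not our actual tooling: the function names are made up for illustration, it assumes each sampled result has already been labeled by a human judge as spam (1) or not spam (0), and it uses a simple normal-approximation margin of error.

```python
import math
import random


def sample_queries(query_log, k, seed=0):
    """Pick k distinct queries uniformly at random (hypothetical helper)."""
    rng = random.Random(seed)
    return rng.sample(query_log, k)


def estimate_spam_rate(labels, z=1.96):
    """Return (rate, margin) for a list of 0/1 spam labels.

    margin is the half-width of an approximate 95% confidence interval
    (normal approximation), which is an assumption of this sketch.
    """
    n = len(labels)
    if n == 0:
        raise ValueError("need at least one labeled result")
    rate = sum(labels) / n
    margin = z * math.sqrt(rate * (1 - rate) / n)
    return rate, margin


# Illustrative labels only, not real measurements.
labels = [1] * 5 + [0] * 95
rate, margin = estimate_spam_rate(labels)
print(f"spam rate: {rate:.1%} +/- {margin:.1%}")
```

Comparing two releases would mean running the same measurement on each and checking that the gap between the two rates is larger than the margins of error.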

Dramatically improved “snippets”
One of the areas of search that does not get a lot of attention (but is incredibly important) is those little snippets of text or “summaries” for each result that describe the site you’re about to visit. Let me give you a flavor of some of our improvements in this area:

  • Muse Starlight: No more JavaScript issues
  • FBI: Notice we expand the acronym (to “Federal Bureau of Investigation”) and highlight it in the descriptions
  • Microsoft: Navigational links indented in the first result help you find what you are looking for quickly.
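To give a feel for the acronym example above, here is a rough sketch of highlighting both a query term and its expansion in a snippet. This is an illustration only, not how the engine does it: the `ACRONYMS` table, the `highlight_terms` function and the `<b>` markup are all assumptions made up for this example.

```python
import re

# Hypothetical lookup table; a real system would need a far larger one.
ACRONYMS = {"fbi": "Federal Bureau of Investigation"}


def highlight_terms(snippet, query):
    """Wrap query terms and their known expansions in <b> tags."""
    terms = []
    for word in query.split():
        terms.append(word)
        expansion = ACRONYMS.get(word.lower())
        if expansion:
            terms.append(expansion)
    if not terms:
        return snippet
    # Longest first so a full expansion wins over a shorter term inside it.
    terms.sort(key=len, reverse=True)
    pattern = re.compile(
        "|".join(r"\b" + re.escape(t) + r"\b" for t in terms),
        re.IGNORECASE,
    )
    return pattern.sub(lambda m: "<b>" + m.group(0) + "</b>", snippet)


print(highlight_terms(
    "The Federal Bureau of Investigation (FBI) investigates crimes.", "fbi"
))
```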

We’ll blog more about this topic soon. Stay tuned.

Much bigger index!
We are now searching 20 billion Web pages. This is 4 times the size of our previous index. Enough said.

Well, one more thing: we now have the infrastructure to add billions (yes, billions) more with relative ease. This ensures we are always pushing the envelope with regard to the amount of human knowledge in our index.

Do what you mean, not what you say
Last, but not least, we want users to be able to search in the manner they feel most comfortable. It’s our job to do the smart thing and figure out what people really mean. Here’s an example search from a real user: nw coed soccer. Previously we did not take into account that NW and Northwest are equivalent. Now we do. The result? Better relevance. See for yourself.

nw coed soccer, before and after
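The idea behind the NW example can be sketched in a few lines. Treat this as a toy under stated assumptions rather than a description of the engine: the `SYNONYMS` table and both function names are hypothetical, and matching here is plain whole-word lookup with no ranking at all.

```python
# Hypothetical equivalence table, for illustration only.
SYNONYMS = {"nw": ["northwest"], "northwest": ["nw"]}


def expand_query(query):
    """Turn each query word into a group of acceptable alternatives."""
    groups = []
    for token in query.lower().split():
        groups.append([token] + SYNONYMS.get(token, []))
    return groups


def matches(document, groups):
    """True if the document contains at least one word from every group."""
    words = set(document.lower().split())
    return all(any(alt in words for alt in group) for group in groups)


doc = "Northwest Coed Soccer League"
print(matches(doc, [["nw"], ["coed"], ["soccer"]]))  # literal query only
print(matches(doc, expand_query("nw coed soccer")))  # with equivalents
```

Without the equivalence, a page that only says “Northwest” never matches the literal query; with it, the same page becomes a candidate result.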

This post covered some of the work we’ve done to improve our core web results. Over the coming weeks we will talk about these features in more detail.

We’re very excited about the improvements and we hope you will be, too.  Thanks and please don’t hesitate to send us your feedback!