Search Results Clustering

As you may have noticed elsewhere, our teammates at MSR Asia have released a Search Result Clustering site and toolbar (good job, guys!). It can be used for query disambiguation (example: jaguar) and sub-topic discovery (example: data mining). It was developed by the Web Search and Mining Group at MSR Asia and does all of the clustering on the fly using MSN’s search results.

MSRA’s approach to clustering is a little different than other systems you might have seen. Here’s a summary from the project’s publication:

Traditional clustering techniques don’t work for this problem because the documents are short, the cluster names should be readable and the algorithm should be efficient for on-the-fly calculation. The method takes on the whole problem in a different way and overcomes the difficulties in traditional clustering methods. It tries to first identify salient topics by identifying distinct and independent keywords, and then classifies the search results into these topics.
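
To make the two-step idea in that summary concrete, here is a minimal Python sketch: first pick topic phrases from the result snippets, then assign snippets to those topics. It is not MSRA's algorithm. The scoring here is a plain count of how many snippets contain a phrase, used as a stand-in for whatever the paper actually learns, and the names `candidate_phrases`, `cluster_snippets`, `STOPWORDS`, and the sample snippets are all made up for illustration.

```python
import re
from collections import defaultdict

# Tiny illustrative stopword list (an assumption, not from the paper).
STOPWORDS = {"a", "an", "the", "of", "and", "for", "in", "on", "to", "is", "with"}


def candidate_phrases(snippet, max_len=2):
    """Return the set of 1..max_len word phrases in a snippet.

    Phrases that start or end with a stopword are skipped so that
    cluster names stay readable.
    """
    words = re.findall(r"[a-z0-9]+", snippet.lower())
    phrases = set()
    for n in range(1, max_len + 1):
        for i in range(len(words) - n + 1):
            gram = words[i:i + n]
            if gram[0] in STOPWORDS or gram[-1] in STOPWORDS:
                continue
            phrases.add(" ".join(gram))
    return phrases


def cluster_snippets(query, snippets, num_topics=3):
    """Pick topic phrases first, then assign snippets to those topics."""
    query_words = set(re.findall(r"[a-z0-9]+", query.lower()))

    # Map each candidate phrase to the indices of the snippets containing it.
    postings = defaultdict(set)
    for idx, snippet in enumerate(snippets):
        for phrase in candidate_phrases(snippet):
            if set(phrase.split()) & query_words:
                continue  # the query itself does not distinguish results
            postings[phrase].add(idx)

    # Stand-in salience score: more snippets first, then longer phrases,
    # then alphabetical order so the result is deterministic.
    ranked = sorted(postings, key=lambda p: (-len(postings[p]), -len(p.split()), p))

    topics = {}
    for phrase in ranked:
        if len(topics) == num_topics or len(postings[phrase]) < 2:
            break
        # Crude independence check: skip a phrase that covers exactly the
        # same snippets as a topic that was already chosen.
        if any(postings[phrase] == members for members in topics.values()):
            continue
        topics[phrase] = postings[phrase]

    clusters = {phrase: sorted(members) for phrase, members in topics.items()}
    assigned = set().union(*topics.values())
    others = [i for i in range(len(snippets)) if i not in assigned]
    return clusters, others


if __name__ == "__main__":
    sample = [
        "Jaguar cars: luxury sedans and sports cars",
        "Jaguar big cat facts and habitat",
        "New Jaguar sports cars review",
        "The jaguar is a big cat native to the Americas",
        "Jaguar Mac OS X release",
    ]
    clusters, others = cluster_snippets("jaguar", sample)
    print(clusters)
    print(others)
```

Because the topics are chosen as phrases before anything is grouped, every cluster arrives with a human-readable name, and the work is a single pass over the short snippets plus a sort. Those are the readability and on-the-fly constraints the summary mentions.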

If you want to learn more, check out the associated research paper, Learning to Cluster Web Search Results.

Brady Forrest, MSN Search PM