Have you ever tried to remember the title of a paper you read in an academic journal? You know the author’s name, but not how to spell it? Or you’re sure that a certain actor starred in a movie directed by a particular director, but can’t remember the title? Wouldn’t it be great if your favorite search engine could help with that? Now it can.
One of the major challenges in search is helping users express their intent as a query that finds the information they are looking for. Previously Bing shipped autocomplete, which helps complete queries, and PageZero, which provides instant answers as you type. Now the Bing team has shipped a couple of features that build on that technology and experience.
Through the integration of technology built by TNR (Microsoft’s Technology and Research team) and the Bing semantic graph, these features allow the user to construct highly structured queries exposing Bing’s deep knowledge of specific topic areas.
It's All Academic
The first subject is academic suggestions, a feature we shipped earlier this year that was built jointly by the Cognitive Services and Academic Search teams. The feature allows users to explore the relationships between papers, authors, topics and publications through a large object graph. There are many scenarios that the user can explore, for example:
• Find all papers by an author
• Find a paper written by particular co-authors
• Find a paper about a specific topic presented at a conference
• Suggest titles or authors
We have built on that foundation to provide more intelligent autocomplete functionality. The graph relationships are explored in real-time and the most relevant suggestions are generated for the user, even if Bing has never seen the query before.
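To make the idea of exploring graph relationships concrete, here is a minimal sketch in Python. It is not the production system: the `Paper` record, the `find_papers` helper and the placeholder data are all assumptions made for illustration, and a real object graph would be far larger and indexed very differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Paper:
    """Hypothetical paper node with the relationships named in the text."""
    title: str
    authors: frozenset
    topic: str
    venue: str


# Placeholder records only; these are not real publications.
PAPERS = [
    Paper("Paper One", frozenset({"author a", "author b"}), "machine learning", "conf x"),
    Paper("Paper Two", frozenset({"author a"}), "databases", "conf y"),
    Paper("Paper Three", frozenset({"author b", "author c"}), "machine learning", "conf y"),
]


def find_papers(papers, authors=(), topic=None, venue=None):
    """Return titles of papers that satisfy every constraint that was supplied."""
    wanted = set(authors)
    return [
        p.title
        for p in papers
        if wanted <= p.authors                      # all requested co-authors present
        and (topic is None or p.topic == topic)     # optional topic filter
        and (venue is None or p.venue == venue)     # optional venue filter
    ]
```

Calling `find_papers(PAPERS, authors=["author a", "author b"])` covers the co-author scenario, while passing `topic` and `venue` together covers the "topic presented at a conference" scenario.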
The second enhancement allows users to find movies much more easily. If you were looking for that movie from 1982 directed by Steven Spielberg starring Drew Barrymore, it would be nice if your search engine could help you formulate the query. Now with this feature that’s exactly what happens:
Like the academic suggestions described above, the feature allows users to formulate natural language queries about the domain through autocomplete. Here are some of the kinds of queries that the user can formulate:
• Movies by director
• Movies starring an actor in a particular genre
• Movies from a particular year starring a certain actor
• Movies starring a pair of actors
How It Works
The features are made possible by technologies that allow for extremely efficient representations of semantic graphs as well as a lightning-fast runtime component that evaluates the user’s natural language input. The runtime component analyses the user’s input, determines the intent, extracts any recognized entities, and then generates the most likely interpretations. Uniquely, this system can generate extensions to the query even if no user has ever typed them in before, allowing additional, never-seen-before suggestions to be generated. Traditional autocomplete, by contrast, generally depends on having seen users issue the queries before.
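The difference from log-based autocomplete can be illustrated with a small sketch. Everything here is hypothetical: the toy `GRAPH`, the `CONNECTOR` phrases and the `suggest` function are stand-ins chosen for illustration, not the actual runtime. The point is only that suggestions are composed by walking graph edges, so no query log is consulted.

```python
# Hypothetical toy graph: entity name -> (entity type, names of connected entities).
GRAPH = {
    "director a": ("director", {"movie one", "movie two"}),
    "actor b": ("actor", {"movie one"}),
    "actor c": ("actor", {"movie two"}),
    "movie one": ("movie", {"director a", "actor b"}),
    "movie two": ("movie", {"director a", "actor c"}),
}

# Assumed phrase used to introduce each entity type in a generated suggestion.
CONNECTOR = {"director": "directed by", "actor": "starring"}


def extract_entities(text):
    """Return known people mentioned in the text, in order of appearance."""
    text = text.lower()
    found = [
        (text.find(name), name)
        for name, (kind, _) in GRAPH.items()
        if kind != "movie" and name in text
    ]
    return [name for _, name in sorted(found)]


def suggest(text):
    """Compose query extensions from the graph instead of from past queries."""
    entities = extract_entities(text)
    if not text.lower().startswith("movies") or not entities:
        return []  # intent not recognized, so this sketch offers nothing
    # Movies consistent with every entity already in the query.
    movies = set.intersection(*(GRAPH[e][1] for e in entities))
    # Neighbours of those movies that the query does not mention yet.
    candidates = set()
    for movie in movies:
        candidates |= GRAPH[movie][1] - set(entities)
    prefix = "movies " + " ".join(f"{CONNECTOR[GRAPH[e][0]]} {e}" for e in entities)
    return sorted(f"{prefix} {CONNECTOR[GRAPH[c][0]]} {c}" for c in candidates)
```

Because each candidate is reached through an existing movie node, every suggestion produced this way corresponds to at least one movie in the toy graph.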
For both academic searches and movies, the understanding of the underlying domain is represented by a graph. The data is derived from the semantic graph that Bing uses to understand the world. This is stored in a format that allows us to look up information at runtime within milliseconds, thousands of times per second as the user types. This graph store allows us to look up exact matches, such as ‘tom cruise.’ It is also able to support the notion that both ‘tom c’ and ‘tom connor cruise’ refer to the same person.
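One simple way to support both exact and partial matches is a sorted alias table searched by prefix. The sketch below assumes this approach purely for illustration; the source does not describe the real storage format, and the `person:tom_cruise` identifier and `lookup` function are made up. The alias strings are the examples from the paragraph above.

```python
from bisect import bisect_left

# Alias -> canonical entity id. The id scheme is a hypothetical placeholder.
ALIASES = {
    "tom cruise": "person:tom_cruise",
    "tom connor cruise": "person:tom_cruise",
}
SORTED_ALIASES = sorted(ALIASES)


def lookup(text):
    """Resolve an exact alias, or a prefix of one, to canonical entity ids."""
    key = " ".join(text.lower().split())  # normalize case and whitespace
    if key in ALIASES:
        return {ALIASES[key]}
    # Binary search to the first alias >= key, then scan while the prefix matches.
    start = bisect_left(SORTED_ALIASES, key)
    matches = set()
    for alias in SORTED_ALIASES[start:]:
        if not alias.startswith(key):
            break
        matches.add(ALIASES[alias])
    return matches
```

With this table, `lookup("tom c")` and `lookup("tom connor cruise")` resolve to the same entity id, which is the behaviour the text describes.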
The final part of the system is the really interesting part. This is where we work out the meaning of the user’s input, and what the best possible completions of that potentially partial intent are. Let’s illustrate this with an example.
1) Imagine a user, Jane, is looking for machine learning papers by Andrew Ng presented at NIPS. As she starts typing, Bing’s existing completions technology does a good job of generating suggestions and even showing that we understand what machine learning is.
Even at this stage the new system knows that it could start offering suggestions, but for now it detects that the intent is still quite ambiguous, so it doesn’t trigger yet.
2) As Jane continues typing, the new understanding system kicks in. The system recognizes the intent, and starts evaluating the graph for the next best possible completions.
At this stage it starts exploring potential paths, and starts generating completions.
3) The more Jane types, the more constrained the evaluation of the possible completions becomes, and Jane is able to select the correct suggestion to issue her query.
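The progression in steps 1 to 3 can be sketched as a gate on ambiguity followed by progressive narrowing. This is only an assumed heuristic: the source does not say how ambiguity is measured, so the `MAX_AMBIGUITY` threshold, the `completions` function and the placeholder candidate strings below are all hypothetical.

```python
# Hypothetical fully formed queries that a graph walk could produce.
CANDIDATES = [
    "machine learning papers by author a",
    "machine learning papers by author a in conf x",
    "machine learning papers by author b in conf y",
    "machine translation papers by author c",
]

# Assumed threshold: with more matches than this, the intent is treated as too ambiguous.
MAX_AMBIGUITY = 2


def completions(typed):
    """Return matching candidates, or nothing while the input is still ambiguous."""
    prefix = typed.lower()
    matches = [c for c in CANDIDATES if c.startswith(prefix)]
    if len(matches) > MAX_AMBIGUITY:
        return []  # step 1: could suggest, but does not trigger yet
    return matches  # steps 2 and 3: fewer matches as the prefix grows
```

Typing "machine" matches all four candidates and returns nothing, while longer prefixes return progressively fewer, more specific completions.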
The compelling detail about this process is that as we are generating the candidates for suggestions or interpreting the user input, we don’t just work with simple string representations. Instead we developed a set of rich objects capturing an extremely detailed semantic interpretation of the query: its intent, domain information, parts-of-speech mapping, and more.
For the query that was constructed for this example, “machine learning papers by andrew ng in nips” we have the following information representation:
As you can see, this allows us to produce truly relevant results on the SERP, related not just to textual matches from the query but also to a deeper semantic understanding of the user’s intent. An additional benefit is that, since Bing fully understands the query that is constructed, we know that results will be returned. This avoids suggesting ‘dead-end’ queries that return no results, as well as grammatically incorrect or misspelled queries, which can occur with synthetic queries generated by more generalized language models.
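As a rough illustration of working with structured objects rather than plain strings, here is a hypothetical sketch. The `Slot` and `Interpretation` classes, their field names, and the `domain` and `intent` labels are assumptions made for this example and do not describe Bing's actual representation; only the example query and its topic, author and venue come from the walkthrough above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Slot:
    """One recognized entity in the query (hypothetical structure)."""
    role: str   # e.g. "topic", "author", "venue"
    text: str   # surface text as typed
    start: int  # character offset of the text within the query


@dataclass(frozen=True)
class Interpretation:
    """Structured reading of a whole query (hypothetical structure)."""
    query: str
    domain: str
    intent: str
    slots: tuple


def interpret(query, known):
    """Build an Interpretation from a mapping of surface text to role."""
    slots = []
    for text, role in known.items():
        pos = query.find(text)
        if pos >= 0:
            slots.append(Slot(role, text, pos))
    slots.sort(key=lambda s: s.start)  # keep slots in reading order
    # The domain and intent labels are fixed here for brevity; they are assumed names.
    return Interpretation(query, "academic", "find_papers", tuple(slots))


KNOWN = {"machine learning": "topic", "andrew ng": "author", "nips": "venue"}
result = interpret("machine learning papers by andrew ng in nips", KNOWN)
```

An object like `result` can be checked against the graph before it is ever shown as a suggestion, which is one way a system could avoid offering queries that lead nowhere.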
This is an interesting example of how different technologies developed by Bing and TNR can be brought together in new ways to add value for our users. Please explore the feature and send us feedback. Let us know what you think via Bing Listens so we can continue to improve our product.
-The Bing Team