A little over a week ago, we shared an all-new, AI-powered Bing search engine, Edge web browser, and integrated Chat that we think of as Your Copilot for the Web. It is designed to deliver better search results, more complete answers to your questions, a new chat experience to better discover and refine your search, and the ability to generate content to spark your creativity.
Since we made this available in limited preview, we have been testing with a select set of people in over 169 countries to get real-world feedback so we can learn, improve, and make this product what we know it can be: not a replacement or substitute for the search engine, but rather a tool to better understand and make sense of the world.
Here is what we have learned in the first seven days of testing:
First, we have seen increased engagement across traditional search results and with the new features like summarized answers, the new chat experience, and the content creation tools. In particular, feedback on the answers generated by the new Bing has been mostly positive, with 71% of you giving the AI-powered answers a “thumbs up.” We’re seeing healthy engagement on the chat feature, with multiple questions asked during a session to discover new information.
Next, we have received good feedback on how to improve. This is expected, as we are grounded in the reality that we need to learn from the real world while we maintain safety and trust. The only way to improve a product like this, where the user experience is so different from anything anyone has seen before, is to have people like you using the product and doing exactly what you all are doing. We know we must build this in the open with the community; this can’t be done solely in the lab. Your feedback about what you're finding valuable and what you aren't, and what your preferences are for how the product should behave, is critical at this nascent stage of development.
We would categorize our learnings as follows:
Better Search and Answers. You are giving good marks on the citations and references that underlie the answers in Bing. They make it easier to fact-check and provide a nice starting point to discover more. On the other hand, we are finding our share of challenges with answers that need very timely data, like live sports scores. For queries where you are looking for more direct and factual answers, such as numbers from financial reports, we’re planning to increase the grounding data we send to the model by 4x. Lastly, we’re considering adding a toggle that gives you more control over the precision vs. creativity of the answer, so you can tailor it to your query.
Chat. The ease of use and approachability of chat has been an early success. Through your active use, we feel good about the discoverability and the design that makes it easy to access. There is also a lot of engagement, which is delivering value for improving search and answers. One new use case we are learning about is how people are using chat as a tool for more general discovery of the world, and for social entertainment. This is a great example of new technology finding product-market fit for something we didn’t fully envision.
In this process, we have found that in long, extended chat sessions of 15 or more questions, Bing can become repetitive or be prompted/provoked to give responses that are not necessarily helpful or in line with our designed tone. We believe this is a function of a couple of things:
- Very long chat sessions can confuse the model about which questions it is answering, so we think we may need to add a tool that lets you more easily refresh the context or start from scratch.
- The model at times tries to respond or reflect in the tone in which it is being asked to provide responses, which can lead to a style we didn’t intend. This is a non-trivial scenario that requires a lot of prompting, so most of you won’t run into it, but we are looking at how to give you more fine-tuned control.
We want to thank those of you who are trying a wide variety of use cases of the new chat experience and really testing the capabilities and limits of the service (there have been a few two-hour chat sessions, for example!), as well as those writing and blogging about your experience, as it helps us improve the product for everyone.
General fit and finish. Some of you have encountered and reported technical issues or bugs with the new Bing, such as slow loading, broken links, or incorrect formatting. Many of these issues have been addressed with our daily releases and even more will be addressed with our larger releases each week.
New feature requests. Some of you have requested more features and capabilities for the new Bing, such as booking flights or sending email. You’d also like to share great searches/answers. We love your creative ideas and are capturing these for potential inclusion in future releases.
We are thankful for all the feedback you are providing. We are committed to daily improvement and giving you the absolute best search/answer/chat/create experience possible. We intend to provide regular updates on the changes and progress we are making. Please keep the feedback coming.