How to Verify that Bingbot is Bingbot

Since we posted about this topic a long, long time ago (especially if we’re counting in Internet years), it’s time for a refresher on how to verify that a bot visiting your site claiming to be a Bing crawler is really coming from Bing. If you see what appears to be Bingbot traffic in your server logs based on a user agent string, for example Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm), and you want to know if this traffic really is originating from a Bing server, you can take the following steps:

  1. Perform a reverse DNS lookup using the IP address from the logs to verify that it resolves to a name that ends with search.msn.com
  2. Do a forward DNS lookup using the name from step 1 to confirm that it resolves back to the same IP address

We will look at how to do this in Windows and Linux, but let’s start from within your favorite browser using a web-based tool:

Web-Based Reverse DNS & IP Lookup Tools

Instead of using command-line tools supplied by your operating system, you can use one of the many web-based reverse DNS lookup tools. Here’s an example of a reverse DNS lookup for IP address 157.55.33.18 using http://www.whois.net/reverse-dns-ip-lookup/:

 

image

As you can see, the IP address we entered resolved to a name ending in search.msn.com. So far, so good! But now let’s confirm with a corresponding forward lookup:

image

And indeed: the forward lookup confirms that the name msnbot-157-55-33-18.search.msn.com resolves back to the original IP address we entered: 157.55.33.18.

Next, let’s look at some other methods that do not require a browser:

Reverse & Forward DNS Lookup in Windows

On Windows systems, you can use nslookup from the command prompt (cmd.exe) for both the reverse and the forward DNS lookup. Here is an example of the reverse lookup for IP address 157.55.33.18; look for the line starting with Name: in the output. This is where you want to find the name that ends with search.msn.com:

C:\Users>nslookup 157.55.33.18
Server:  Unknown
Address: 

Name:    msnbot-157-55-33-18.search.msn.com
Address:  157.55.33.18

Now do the forward DNS lookup; the output in the second Address: line should match the IP address you entered before to confirm it’s a Bing crawler:

C:\Users>nslookup msnbot-157-55-33-18.search.msn.com
Server:  UnKnown
Address:

Non-authoritative answer:
Name:    msnbot-157-55-33-18.search.msn.com
Address:  157.55.33.18

The verdict is the same: the name and the address match, so this is a verified Bing crawler.

Reverse & Forward DNS Lookup on Linux-based systems

On Linux you can use the host command to do the same:

> host 157.55.33.18
18.33.55.157.in-addr.arpa domain name pointer msnbot-157-55-33-18.search.msn.com
> host msnbot-157-55-33-18.search.msn.com
msnbot-157-55-33-18.search.msn.com has address 157.55.33.18
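
If you would rather script the check than run it by hand, the same two steps can be sketched in Python using only the standard library. This is a minimal sketch, not an official Bing tool: the function name is_verified_bingbot and the injectable reverse_lookup/forward_lookup parameters are our own illustrative choices, and the forward lookup via socket.gethostbyname_ex covers IPv4 addresses only.

```python
import socket

# Suffix from step 1. The leading dot is a deliberate choice so that a name
# such as "notsearch.msn.com" does not pass the check.
BING_SUFFIX = ".search.msn.com"


def is_verified_bingbot(ip, reverse_lookup=socket.gethostbyaddr,
                        forward_lookup=socket.gethostbyname_ex):
    """Return True if ip passes the reverse + forward DNS check (IPv4 only).

    The two lookup parameters are hypothetical hooks so the logic can be
    exercised without touching real DNS.
    """
    try:
        # Step 1: reverse lookup (PTR record) for the IP taken from the logs.
        hostname, _aliases, _addresses = reverse_lookup(ip)
    except OSError:
        return False  # no PTR record, or the resolver failed
    if not hostname.lower().rstrip(".").endswith(BING_SUFFIX):
        return False
    try:
        # Step 2: forward lookup of that name must lead back to the same IP.
        _name, _aliases, addresses = forward_lookup(hostname)
    except OSError:
        return False
    return ip in addresses
```

Calling is_verified_bingbot("157.55.33.18") performs both lookups against your configured resolver. DNS lookups can be slow, so if you run this against many log entries you may want to cache results per IP address.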

 

Beware: Don’t Use Hardcoded IP Addresses or Address Ranges

So, by using the reverse/forward DNS lookup method you can easily verify that an IP address belongs to Bing. It is important to note that, like other search engines, Bing does not publish a list of IP addresses or ranges from which we crawl the Internet. The reason is simple: the IP addresses or ranges we use can change at any time, so responding to requests differently based on a hardcoded list is not a recommended approach and may cause problems down the line.

For example, if you are experiencing an increase in HTTP 403 Forbidden responses to valid Bingbot requests, your webserver may have been configured to allow Bingbot access based on such a list. As a result, new Bingbot crawl machines (with new IP addresses) may unintentionally be denied access to your pages. 

To that point, make sure to regularly monitor your 400-499 crawl errors in the Crawl Information tool from the Reports & Data section in Bing Webmaster Tools and check for spikes in 403s. Always keep an eye on your Message center or your inbox (provided you set up a forwarding address and alert preferences in your Bing Webmaster Tools profile).
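
Your own server logs can complement those reports. The following is a rough sketch that counts 403 responses served to requests whose user agent mentions bingbot, grouped by client IP address. It assumes your server writes the common Apache/nginx "combined" log format; the function name bingbot_403s and the regular expression are illustrative and will need adjusting for other log layouts.

```python
import re
from collections import Counter

# Assumption: "combined" access log format, i.e.
# ip ident user [time] "request" status bytes "referer" "user agent"
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]+\] "[^"]*" '
    r'(?P<status>\d{3}) \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def bingbot_403s(lines):
    """Count 403 responses per client IP for user agents mentioning bingbot."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match is None:
            continue  # skip lines that do not fit the assumed format
        if match["status"] == "403" and "bingbot" in match["agent"].lower():
            counts[match["ip"]] += 1
    return counts
```

You could feed it an open log file, for example bingbot_403s(open("access.log")), where access.log stands in for your own log path. Remember that the user agent string alone proves nothing: run each reported IP address through the reverse/forward DNS check before concluding that genuine Bingbot requests are being denied.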

Also, I would suggest reading Frédéric Dubut’s excellent post on crawl, robots.txt, crawl-delay, and crawl control and making sure that your site is all set for efficient crawl by Bingbot.

Have any alternatives to the nslookup or host commands you’d like to share? Leave me a note in the comments!

- Vincent Wehren, Bing Webmaster Tools Program Management

Join the conversation

2 comments
  1. Uwe Keim

    Again, no images in Google Reader and first click redirects to bing.com start page. Last time posting this, sorry for disturbing. I'm unsubscribing now.

  2. LarkB

    And what to do about an incredible onslaught of genuine Bing bots crawling our site at the same time?  We have set maximum crawls to 2 from 6am to midnight EDT with more leeway after midnight in an attempt to slow them down.  After doing so, we counted a minimum of 17 Bing bots crawling all day long, at one point destabilizing the site to the point it wouldn't load, returning server connection errors in browsers.  There is an obvious problem somewhere since the bots won't obey crawl directives and we are almost to the point of blocking them completely in .htaccess.
