Since we posted about this topic a long, long time ago (especially if we’re counting in Internet years), it’s time for a refresher on how to verify that a bot visiting your site claiming to be a Bing crawler is really coming from Bing. If you see what appears to be Bingbot traffic in your server logs based on a user agent string, for example Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm), and you want to know if this traffic really is originating from a Bing server, you can take the following steps:
- Perform a reverse DNS lookup using the IP address from the logs to verify that it resolves to a name that ends with search.msn.com
- Do a forward DNS lookup using the name from step 1 to confirm that it resolves back to the same IP address
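If you would rather script the check than run it by hand, the two steps above could be sketched in Python roughly as follows. This is a minimal sketch under stated assumptions, not an official Bing tool: the function name is_verified_bingbot and its injectable reverse_lookup and forward_lookup parameters are made up for illustration, and the default lookups use the standard library calls socket.gethostbyaddr and socket.gethostbyname_ex (the latter resolves IPv4 addresses only).

```python
import socket


def _reverse(ip):
    # Step 1 helper: reverse DNS lookup, returns the primary host name.
    return socket.gethostbyaddr(ip)[0]


def _forward(name):
    # Step 2 helper: forward DNS lookup, returns the list of IPv4 addresses.
    return socket.gethostbyname_ex(name)[2]


def is_verified_bingbot(ip, reverse_lookup=_reverse, forward_lookup=_forward,
                        suffix=".search.msn.com"):
    """Return True if ip reverse-resolves to a name ending in suffix
    and that name forward-resolves back to the same ip."""
    try:
        name = reverse_lookup(ip)
    except OSError:
        return False  # no reverse record, or the lookup failed

    # Normalize: DNS names are case-insensitive and may end with a dot.
    name = name.rstrip(".").lower()
    if not name.endswith(suffix):
        return False

    try:
        addresses = forward_lookup(name)
    except OSError:
        return False  # the name does not resolve
    return ip in addresses
```

Calling is_verified_bingbot with an IP address taken from your logs performs live DNS queries, so the answer depends on the DNS data at that moment. The lookup functions are passed in as parameters so that the logic can be exercised without network access. Note that the sketch checks for ".search.msn.com" with a leading dot, which is slightly stricter than a plain "ends with search.msn.com" test.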
We will look at how to do this in Windows and Linux, but let’s start from within your favorite browser using a web-based tool:
Web-Based Reverse DNS & IP Lookup Tools
Instead of using command-line tools supplied by your operating system, you can use one of the many web-based reverse DNS lookup tools. Here’s an example of a reverse DNS lookup for IP address 157.55.33.18 using http://www.whois.net/reverse-dns-ip-lookup/:
As you can see, the IP address we entered resolved to a name ending in search.msn.com. So far, so good! But now let’s confirm with a corresponding forward lookup:
And indeed: the forward lookup confirms that msnbot-157-55-33-18.search.msn.com resolves back to the original IP address we entered: 157.55.33.18.
Next, let’s look at some other methods that do not require a browser:
Reverse & Forward DNS Lookup in Windows
On Windows systems, you can use nslookup from the command prompt (cmd.exe) for both the reverse and the forward DNS lookup. Here is an example of the reverse lookup for IP address 157.55.33.18; look for the line starting with Name: in the output. This is where you want to find a name that ends with search.msn.com:
Now do the forward DNS lookup; the output in the second Address: line should match the IP address you entered before to confirm it’s a Bing crawler:
The verdict is the same: the name and the address match, so this is a verified Bing crawler.
Reverse & Forward DNS Lookup on Linux-based systems
On Linux you can use the host command to do the same:
> host 157.55.33.18
18.33.55.157.in-addr.arpa domain name pointer msnbot-157-55-33-18.search.msn.com.
> host msnbot-157-55-33-18.search.msn.com
msnbot-157-55-33-18.search.msn.com has address 157.55.33.18
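If you are wondering why the reverse lookup output mentions a name ending in in-addr.arpa: a reverse lookup queries a special name built from the octets of the IPv4 address in reverse order. As a small illustration, Python's standard ipaddress module can build that name. The helper name ptr_query_name is made up for this sketch, and the input is the address suggested by the host name msnbot-157-55-33-18.search.msn.com.

```python
import ipaddress


def ptr_query_name(ip):
    """Return the name that a reverse DNS lookup queries for the given IP."""
    # reverse_pointer reverses the octets and appends the in-addr.arpa zone
    # (or ip6.arpa for IPv6 addresses).
    return ipaddress.ip_address(ip).reverse_pointer


print(ptr_query_name("157.55.33.18"))  # prints 18.33.55.157.in-addr.arpa
```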
Beware: Don’t Use Hardcoded IP Addresses or Address Ranges
So, by using the reverse/forward DNS lookup method, you can easily verify that an IP address really belongs to Bing. It is important to note that, like other search engines, Bing does not publish a list of IP addresses or ranges from which we crawl the Internet. The reason is simple: the IP addresses or ranges we use can change at any time, so responding to requests differently based on a hardcoded list is not a recommended approach and may cause problems down the line.
For example, if you are experiencing an increase in HTTP 403 Forbidden responses to valid Bingbot requests, your webserver may have been configured to allow Bingbot access based on such a list. As a result, new Bingbot crawl machines (with new IP addresses) may unintentionally be denied access to your pages.
To that end, regularly monitor your 400-499 crawl errors in the Crawl Information tool in the Reports & Data section of Bing Webmaster Tools and check for spikes in 403s. Also keep an eye on your Message center or your inbox (provided you set up a forwarding address and alert preferences in your Bing Webmaster Tools profile).
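You can also look for the same symptom in your own server logs. The following Python sketch counts HTTP 403 responses per day for requests whose user agent mentions bingbot. It assumes your web server writes the widely used "combined" access log format; the regular expression, the function name bingbot_403s_per_day, and the sample log line are illustrative assumptions, so adjust them to match your own logs.

```python
import re
from collections import Counter

# Assumed layout (combined log format):
# ip ident user [day:time zone] "request" status bytes "referer" "user agent"
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<day>[^:]+):[^\]]+\] '
    r'"[^"]*" (?P<status>\d{3}) \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def bingbot_403s_per_day(lines):
    """Count 403 responses per day for requests with a bingbot user agent."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match is None:
            continue  # skip lines that do not fit the assumed format
        if match["status"] == "403" and "bingbot" in match["agent"].lower():
            counts[match["day"]] += 1
    return counts


# Hypothetical log line, for illustration only.
sample = [
    '157.55.33.18 - - [10/Oct/2012:13:55:36 -0700] "GET /page HTTP/1.1" '
    '403 512 "-" "Mozilla/5.0 (compatible; bingbot/2.0; '
    '+http://www.bing.com/bingbot.htm)"',
]
print(bingbot_403s_per_day(sample))
```

Because a user agent string can be spoofed, a 403 counted here is only worth investigating once the reverse/forward DNS lookup confirms that the request's IP address really belongs to Bing.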
Also, I would suggest reading Frédéric Dubut’s excellent post on crawl, robots.txt, crawl-delay, and crawl control and making sure that your site is all set for efficient crawl by Bingbot.
Have any alternatives to the nslookup or host commands you’d like to share? Leave me a note in the comments!
– Vincent Wehren, Bing Webmaster Tools Program Management