Crawl delay and the Bing crawler, MSNBot

Search engines, such as Bing, need to regularly crawl websites not only to index new content, but also to check for content changes and removed content. Bing offers webmasters the ability to slow down the crawl rate to accommodate web server load issues.

Such a setting is not always needed, nor is it generally recommended, but it is available to webmasters should the need arise. Websites that are small (page-wise) and whose content is not regularly updated will probably never need to set a crawl delay. They will likely receive no benefit from one, as the bot automatically adjusts its crawl rate to an appropriate level based on the content it finds with each pass.

Larger sites that have a great many pages of content may need to be crawled more deeply and/or more often so that their latest content may be added into the index.

Should you set a crawl delay?

Many factors affect the crawling of a site, including (but not limited to):

  • The total number of pages on a site (is the site small, large, or somewhere in-between?)
  • The size of the content (PDFs and Microsoft Office files are typically much larger than regular HTML files)
  • The freshness of the content (how often is content added/removed/changed?)
  • The number of allowed concurrent connections (a function of the web server infrastructure)
  • The bandwidth of the site (a function of the host’s service provider; the lower the bandwidth, the lower the server’s capacity to serve page requests)
  • How highly the site ranks (content judged as not relevant won't be crawled as often as highly relevant content)

The rate at which a site is crawled is an amalgam of all of those factors and more. If a site is highly ranked and has a ton of pages, more of those pages will be indexed, which means it needs to be crawled more thoroughly (and that takes time). If the site’s content is regularly updated, it’ll be crawled more often to keep the index fresh, which better serves search customers (as well as the goals of the site’s webmasters).

As so many factors are involved, there is no clear, generic answer as to whether you should set a crawl delay. How long it takes to finish a crawl of a site also depends on the factors above. The bottom line is this: if webmasters want their content to be included in the index, it has to be crawled. There are only 86,400 seconds in a day (leap seconds excluded!), so any delay imposed upon the bot will only reduce the amount and the freshness of the content placed into the index on a daily basis.
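
To make that arithmetic concrete, here is a back-of-the-envelope sketch (in Python, purely illustrative): it assumes the bot sleeps the full delay between consecutive fetches over a single connection, whereas Bing actually applies the value as relative throttling, as described below.

SECONDS_PER_DAY = 86_400  # leap seconds excluded!

def max_daily_fetches(crawl_delay: int) -> int:
    # Upper bound on fetches per day over one connection if the
    # crawler waited crawl_delay seconds between requests.
    return SECONDS_PER_DAY // crawl_delay

for delay in (1, 5, 10):
    print(f"Crawl-delay: {delay} -> at most {max_daily_fetches(delay):,} fetches/day")

Even a delay of 10 caps the bot at 8,640 fetches per day on a single connection, far below the page count of many large sites.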

That said, some webmasters, for technical reasons on their side, need a crawl delay option. As such, we want to explain how to set it, what your choices are for the settings, and the implications of doing so.

Delaying crawl frequency in the robots.txt file

Bing supports the directives of the Robots Exclusion Protocol (REP) as listed in a site's robots.txt file, which is stored in the root folder of a website (for example, http://www.example.com/robots.txt). The robots.txt file is the only valid place to set a crawl-delay directive for MSNBot.

The robots.txt file can be configured to employ directives set for specific bots and/or a generic directive for all REP-compliant bots. Bing recommends that any crawl-delay directive be made in the generic directive section for all bots to minimize the chance of code mistakes that can affect how a site is indexed by a particular search engine.

Note that any crawl-delay directive set, like any REP directive, applies only to the web server host where the robots.txt file resides. For example, directives in http://www.example.com/robots.txt do not apply to a subdomain such as http://blog.example.com, which needs its own robots.txt file.

How to set the crawl delay parameter

In the robots.txt file, within the generic user agent section, add the crawl-delay directive as shown in the example below:

User-agent: *
Crawl-delay: 1

Note: If you only want to change the crawl rate of MSNBot, you can create another section in your robots.txt file specifically for MSNBot to set this directive. However, specifying directives for individual user agents, in addition to using the generic set of directives, is not recommended. This is a common source of crawling errors, as sections dedicated to specific user agents are often not kept in sync with the generic section. An example of a section for MSNBot would look like this:

User-agent: msnbot
Crawl-delay: 1
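
If you do maintain both a generic section and an msnbot-specific one, keep in mind that REP crawlers follow the group that most specifically matches their user agent, so the msnbot section replaces (rather than adds to) the generic one for that bot. One quick way to sanity-check which delay a given user agent would pick up is Python's standard-library robots.txt parser (a local verification sketch only; the values below are illustrative, and this is not how MSNBot itself is implemented):

from urllib.robotparser import RobotFileParser

robots_txt = """\
User-agent: *
Crawl-delay: 1

User-agent: msnbot
Crawl-delay: 5
"""

rp = RobotFileParser()
rp.parse(robots_txt.splitlines())

# msnbot matches its dedicated section; every other bot falls back to *.
print(rp.crawl_delay("msnbot"))    # -> 5
print(rp.crawl_delay("otherbot"))  # -> 1

This also demonstrates the maintenance hazard noted above: once the two sections drift apart, msnbot only ever sees its own.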

The crawl-delay directive accepts only positive, whole numbers as values. Treat the value after the colon as the relative amount of throttling you want to apply to MSNBot compared to its default crawl rate. The higher the value, the more the crawl rate is throttled down.

Bing recommends using the lowest value possible, if you must use any delay, in order to keep the index as fresh as possible with your latest content. We recommend against using any value higher than 10, as that will severely affect the ability of the bot to effectively crawl your site for index freshness.

Think of the crawl delay settings in these terms:

Crawl-delay setting     Index refresh speed
No crawl delay set      Normal
1                       Slow
5                       Very slow
10                      Extremely slow
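
As a footnote on what the delay means in practice, here is how a small third-party REP-compliant crawler might honor the directive literally, sleeping the stated number of seconds between requests (a hypothetical sketch: polite_fetch, examplebot, and the literal-sleep reading are assumptions for illustration; Bing itself applies the value as relative throttling, as explained above):

import time
from urllib.request import urlopen
from urllib.robotparser import RobotFileParser

def polite_fetch(base_url, paths, user_agent="examplebot"):
    # Hypothetical polite fetcher: read robots.txt once, then pause
    # crawl-delay seconds between consecutive requests.
    rp = RobotFileParser(base_url + "/robots.txt")
    rp.read()
    delay = rp.crawl_delay(user_agent) or 0
    for path in paths:
        url = base_url + path
        if rp.can_fetch(user_agent, url):
            print(f"fetched {url}: {len(urlopen(url).read())} bytes")
        time.sleep(delay)

polite_fetch("http://www.example.com", ["/", "/index.html"])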

Feedback

The Bing team is interested in your feedback on how the bot is working for your site and, if you decide a crawl delay is needed, which setting works best for getting your content indexed without an unreasonable impact on your web server traffic. We want to hear from you so we can improve how the bot works in future development.

If you have any questions, comments, feedback, or suggestions about the MSNBot, feel free to post them in our Crawling/Indexing Discussion forum. There’s another SEM 101 post coming soon. Until then…

– Rick DeJarnette, Bing Webmaster Center

Comments
  1. Anonymous

    Sweet! Thanks ;-)

  2. Anonymous

    Thanks! Good info!

  3. jackin

    I really like this new concept of crawl delay settings!

  4. zhangmin_iichiba

    Do all search engines support Crawl-delay?

  5. Quality Directory

    I use Awstats, which doesn't give me much information about the crawl rate of spiders. So for now I can't slow down the crawl rate of any crawler to reduce server load issues. Maybe I will have to get a better log stats application.

  6. Anonymous

    Why is it that I can't find Bing's user agent anywhere?

    It's a bit odd, don't you think? :/

  7. Anonymous

    This can come in handy for companies that have really large websites that are constantly changing.

  8. miles2go

    Recently I used it. More of my pages got indexed quickly, within a day.

  9. stylight

    Bing is so slow!

    Maybe it is impolite to talk about the competition in this blog, but I am extremely frustrated by bing.

    I submitted my site's sitemap to Bing, Yahoo and Google about 6 weeks ago. As of today, Yahoo / Google have already indexed 80 / 150 pages. And Bing? A single page only!

    You wrote "The Bing team is interested in your feedback on how the bot is working for your site". So, you got my feedback. I will be happy to learn why bing is so slow.

    Cheers,

    Sty

  10. Blackpool UK

    Some of my websites' webpages get indexed much faster on Bing

  11. Anonymous

    Maybe Bing's indexing policy depends on geography?

  12. oceanwwind

    Is there a dashboard where we can find out how our keywords are ranking?

  13. ads online

    Good information for webmasters of blogs/websites

  14. Anonymous

    Bing is super slow at indexing, it's not even funny. FIX IT.

  15. Anonymous

    Everyone needs to know about the crawl delay settings, thanks!

  16. Free Online Games

    I need it crawling faster!

  17. sinister-septum

    Thank you, but how do I make it faster?

  18. Anonymous

    Bing is too slow to crawl. Before Bing launched, my site was indexed and on the 1st page of the MSN search engine, but after the Bing launch I can't see my site anywhere on Bing because it hasn't been crawled yet. It's been 3 months.

  19. Anonymous

    Crawl settings are becoming very important for SEO professionals; you've got to use your crawl properly and intelligently.

  20. Anonymous

    It's very good information, thank you very much. Bing crawls sites very late.

  21. nmarketers

    I am trying hard to increase the crawl rate on my website

    http://nmarketers.com

  22. Webhostuk LTD

    This information really seems important, got to go and check the crawl rate on our sites.

    Thank you.

  23. Anonymous

    I like this new crawl delay setting concept!

  24. freesoft4down.com

    Today I just realized that my indexed pages dropped from 243 to 141. Nothing was changed on my site! Should I set up a crawl delay?

  25. francisduarte

    Great info :)

  26. webg

    I set up my account in Bing Webmaster Central, uploaded my sitemap and validated it, completed authentication and everything needed, but I can't get my site indexed after almost a month!

  27. bazacuda

    Does that depend on whether it's a site or a blog?

  28. elliott_miller

    Is it typical to not have more than my homepage indexed after 4 MONTHS!!??!

  29. Anonymous

    My site does not get much traffic from Bing; it is extremely low compared to Google. How can it be increased?

  30. tvichit

    Good info.

    How do you get indexed fast on Bing?

  31. theacidman

    I think Bing is actually a really good search engine; I like it better than YAHOO!

  32. brightcreek

    Thanks and the info is great

  33. Anonymous

    Everyone needs to know about the crawl delay settings, thanks!

  34. Anonymous

    Bing can't be using "msnbot" as its user agent; it's not being recognized by my site! And also, you're indexing my terms & conditions page instead of my main page, which should be bypassed for visitors with the "msnbot" UA. Not working!

  35. StuartMail

    Same issue as many other posters: I need Bing to crawl my site – http://www.xdox.co.uk – more often, not less often.

    I have submitted my site via Webmaster Tools and directly via URL submission, plus posting my sitemap via the tools.

    Last crawl was Oct 3rd, and then only for 1 page of my site, despite posting my full sitemap back in September.

    My site is indexed by Google daily, and Yahoo at least weekly – why is BING so slow?

  36. abid

    It's good information for me, and my problem is solved

    thanks

  37. antoniowilder53

    I am not sure which factors determine Bing's crawl rate, but it seems to be slower than Google or even Yahoo. http://www.hostelio.com

  38. krishna88

    Where would I get this tool?

  39. tweetymapcom

    Thank you for the crawl rate information.  I have noticed a huge spike in the crawling from msnbot in the last couple of days.  The dramatic increase from msnbot is causing slight delays on my dedicated server.

  40. gnabeel

    I will never need a crawl delay with Bing, because it only crawls my sites every 6-10 days.

  41. Rhyno

    I've found Bing will download about 600MB/day from my site while Google will download about 18MB/day. Bing takes weeks to update its index and doesn't index as many items as Google. Bing does not like to use the new URL for 301 redirects and insists on showing really old URLs "because they are indexed from links on other pages". In other words, Bing believes possibly outdated, off-site links more than it believes the site's HTTP directives. It's as if Bing is collecting the information, but not really doing anything about it. I think that is what frustrates so many people about the crawl rate. There could be efficiencies in not re-downloading when the Expires header is 365 days out – maybe Bing just needs to have some memory of Expires, even if it doesn't cache the actual file?

  42. gordon.pryra

    Very few search engines let you speed up their crawl rate, which can be annoying when it's almost 4 weeks since it last worked.

    Wedding Dresses Hertfordshire (http://www.satinbow.co.uk)

  43. cliqadam

    Google does not support it; to control it in Google you have to use their Webmaster Tools site setup.
