2 Things You Must Know about the Bing XML Sitemap Plugin

On the 23rd of May this year we released Bing XML Sitemap Plugin 1.0: the open-source server plugin for IIS and Apache that helps websites generate XML Sitemaps compliant with sitemaps.org. The Bing XML Sitemap Plugin, which you can download from here, is unique among its peers in that it:

  • Generates comprehensive Sitemaps based on incoming traffic
  • Generates “delta” Sitemaps based on signature computation of the page
  • Assigns <priority> values based on page popularity within the website, and
  • Assigns a <lastmod> date based on a refined signature computation that checks whether or not the page has changed
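To make the popularity-based <priority> idea concrete, here is a minimal sketch in Python. It is not the plugin's actual formula, which this post does not describe. It assumes a simple linear scaling against the busiest page with a floor of 0.1, and the function name `assign_priorities` and the sample traffic counts are hypothetical.

```python
from collections import Counter


def assign_priorities(hits):
    """Map each URL's hit count to a sitemap <priority> between 0.1 and 1.0.

    Assumption: linear scaling against the most visited page. The plugin's
    real weighting is not documented in this post.
    """
    if not hits:
        return {}
    top = max(hits.values())
    priorities = {}
    for url, count in hits.items():
        # Scale against the busiest page; never drop below 0.1
        scaled = count / top if top else 0.0
        priorities[url] = round(max(0.1, scaled), 1)
    return priorities


# Hypothetical traffic counts observed for three URLs
traffic = Counter({"/": 1000, "/docs": 400, "/about": 20})
print(assign_priorities(traffic))
```

The floor keeps rarely visited pages in the valid range that sitemaps.org defines for <priority> (0.0 to 1.0) without flattening them to zero.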

These are all great features that have compelled many webmasters and site owners to download and start using the plugin for all their Sitemap needs. However, today, I want to zoom in a bit more on two questions that people ask me about the plugin:

  1. “What’s the big deal with these “delta” Sitemaps?” and
  2. “Doesn’t all of this good stuff affect my server’s performance?”

So, let’s talk a bit more about these two topics:

What are “Delta” Sitemaps?

In addition to collecting your site’s URLs to build a comprehensive Sitemap (or to be exact: a Sitemap Index and one or more XML Sitemaps), the Bing XML Sitemap Plugin generates Sitemaps that contain only those URLs that have changed from the last time they were visited. It is especially this “delta” information that provides additional, long-term benefits to web publishers and their performance in the Bing index (and as a result in Yahoo!, Facebook web search, Windows Smart Search, Siri on iOS7, and all the other cool places powered by Bing!).
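For readers unfamiliar with the file layout, the sketch below builds a Sitemap Index that points at a comprehensive Sitemap and a delta Sitemap, using the element names defined by sitemaps.org. The file names (`sitemap-full.xml`, `sitemap-delta.xml`), the example.com URLs, and the date are hypothetical. The plugin's real output names and layout may differ.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol
NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_urlset(entries):
    """Build a <urlset> from (loc, lastmod, priority) tuples."""
    urlset = ET.Element("urlset", xmlns=NS)
    for loc, lastmod, priority in entries:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc
        ET.SubElement(url, "lastmod").text = lastmod
        ET.SubElement(url, "priority").text = f"{priority:.1f}"
    return urlset


def build_index(sitemap_locs):
    """Build a <sitemapindex> that references each Sitemap file."""
    index = ET.Element("sitemapindex", xmlns=NS)
    for loc in sitemap_locs:
        sitemap = ET.SubElement(index, "sitemap")
        ET.SubElement(sitemap, "loc").text = loc
    return index


# Hypothetical delta Sitemap holding only the URLs that changed
delta = build_urlset([("https://example.com/docs", "2013-10-01", 0.4)])
# Hypothetical index referencing the comprehensive and the delta Sitemap
index = build_index([
    "https://example.com/sitemap-full.xml",
    "https://example.com/sitemap-delta.xml",
])
print(ET.tostring(index, encoding="unicode"))
print(ET.tostring(delta, encoding="unicode"))
```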

Using the built-in signature computation, we can easily establish whether or not a page has changed significantly. By publishing that change information in the delta Sitemap, we can tell our Bingbot crawlers whether or not a page requires a re-crawl to pick up the changes in our index. This helps reduce bandwidth both on your server and on the Bing side, since Bingbot no longer needs to fetch a URL only to discover it hasn’t changed. In fact, in some cases the impact of the Bing XML Sitemap delta technology is so significant that we see up to a 75% reduction in the need to crawl pages of a site.
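The general idea of signature-based change detection can be sketched as follows. This is an illustration only: the plugin's own signature algorithm is not described in this post, so the sketch substitutes a SHA-256 hash over tag-stripped, whitespace-normalized text. The names `page_signature` and `update_delta` and the in-memory `store` dictionary are hypothetical.

```python
import hashlib
import re
from datetime import datetime, timezone


def page_signature(html):
    """Hash the visible text so markup-only changes do not count as changes."""
    text = re.sub(r"<[^>]+>", " ", html)               # drop tags (crude)
    text = re.sub(r"\s+", " ", text).strip().lower()   # normalize whitespace
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def update_delta(store, url, html, now=None):
    """Return True if the URL belongs in the delta Sitemap.

    A URL counts as changed when it is new or when its signature differs
    from the one recorded on the previous visit. lastmod is only updated
    on a real change.
    """
    now = now or datetime.now(timezone.utc)
    sig = page_signature(html)
    previous = store.get(url)
    if previous is not None and previous["signature"] == sig:
        return False
    store[url] = {"signature": sig, "lastmod": now}
    return True


store = {}
print(update_delta(store, "/a", "<p>Hello</p>"))         # first visit
print(update_delta(store, "/a", "<div>Hello </div>"))    # markup-only change
print(update_delta(store, "/a", "<p>Hello world</p>"))   # text changed
```

Hashing normalized text rather than raw bytes is one way to approximate a "significant" change. A production implementation would likely need something more tolerant of rotating ads, timestamps, and similar boilerplate.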

Case Study: MSDN

Let’s look, for example, at one of our own Microsoft sites: MSDN.

MSDN is a site with millions of pages accumulated over several years and as a result, we spent a lot of crawl bandwidth just to ensure we had the latest/greatest pages in our index. By using the Bing XML Sitemap plugin technology as an integral part of their site architecture, MSDN was able to achieve a 60% reduction of overall Bingbot crawl traffic in the space of only a couple of weeks:

Chart showing reduction in crawl by XML plugin

As you can see, not only did the Bing XML Sitemap Plugin reduce the crawl rate (crawled), it also helped align the crawl demand (queued) with the actual fetches, as the reduction dramatically improved our so-called “politeness constraints” for the site. The politeness constraints determine how many times we are allowed to crawl the site based on heuristics such as server throughput and responsiveness, as well as webmaster inputs such as crawl control settings.

Performance

Obviously we designed the Bing XML Sitemap Plugin with performance in mind. In keeping with the MSDN example, let’s look at how the Plugin impacts CPU and memory.

CPU Usage

Here is the CPU usage for two servers, one with and one without the Sitemap plugin enabled, both captured at the same time under similar loads:

CPU performance with the Bing XML Sitemap Plugin

Figure 1: CPU load with the Bing XML Sitemap Plugin

CPU performance without the Bing XML Sitemap Plugin

Figure 2: CPU load without the Bing XML Sitemap Plugin

Conclusion: there is no noticeable difference in CPU consumption between the servers with and without the plugin enabled.

What about server memory?

Let’s look at the memory signatures for servers with and without the Bing XML Sitemap Plugin. The total memory per machine for these MSDN machines is 48GB:

Memory consumption with the Bing XML Sitemap Plugin

Figure 3: Memory consumption with the plugin

And here’s the chart showing memory without the plugin under similar load:

Memory consumption without the Bing XML Sitemap Plugin

Figure 4: Memory consumption without the plugin

Again, these charts show very little difference in memory consumption between the servers with and without the Bing XML Sitemap Plugin.

Conclusion

With its unique features of comprehensive and delta Sitemaps that provide long-term bandwidth-saving and freshness benefits at little to no operational overhead, the Bing XML Sitemap Plugin is a great addition to any webmaster’s tool belt. Available as binaries for IIS and Apache, the plugin is a great solution for your Sitemap needs across platforms. What’s more, the source code is released under the Apache 2.0 open source license, so you can roll your own binaries or, if so inclined, see line by line how the Bing XML Sitemap Plugin does its “Sitemap magic” under the hood.

Like what you’re seeing? You can read more about the plugin in our Bing XML Sitemap Plugin help topic or just head on over to download the binaries and the source code from our Microsoft Download Center.

Vincent Wehren – Senior Program Manager Bing Webmaster Tools