Back in February we launched the beta of our Bing Sitemap Plugin, which anyone can install on their servers to generate and maintain Sitemaps automatically. Versions are available for IIS and Apache servers, so a wide variety of businesses can benefit from it. The plugin creates Sitemaps for you, including a delta Sitemap that alerts Bing to the content that is new since the last ping.
Having both comprehensive and delta Sitemaps provides significant benefits: you always have a full, up-to-date list of all URLs on your website that search engines can use for deep crawls, as well as a concise Sitemap of recently modified URLs that search engine crawlers can prioritize. This can help keep bot traffic bandwidth down. In addition, the Sitemap Plugin automatically adds <lastmod> values to your Sitemap and generates <priority> values based on how popular your URLs are.
Now, after having been in beta for just over two months, and with great thanks to our beta program partners, we are announcing the official release of Bing XML Sitemap Plugin 1.0 today. The Bing XML Sitemap Plugin is an open source server-side technology that takes care of generating XML Sitemaps compliant with sitemaps.org for websites running on Internet Information Services (IIS) for Windows® Server as well as Apache HTTP Server.
What’s New in 1.0?
Between the Beta and 1.0 we made a number of improvements based on customer feedback. These include:
- Automatically ping the search engines after each Sitemap update (currently we ping Bing and Google)
- An improved computation algorithm for the <priority> value
- Multi-machine merge memory improvements and various other improvements
- Custom installation and data directory at installation time
As with the Beta version, the Bing XML Sitemap Plugin 1.0 is available for IIS (6.0 or higher) as well as for Apache Server and is released as an open source project under the Apache License, Version 2.0.
Download the Plugin
You can download the Bing XML Sitemap Plugin 1.0 binaries as well as the source code directly from the Microsoft Download Center or from the Bing XML Sitemap download page in the Bing Webmaster Help Center.
Bing XML Sitemap Plugin Features Overview
Sitemaps.org Compliant XML Sitemaps
The Bing XML Sitemap Plugin generates XML Sitemaps compliant with http://sitemaps.org/. In doing so, it creates two types of Sitemaps:
1. A comprehensive Sitemap based on recent traffic activity
2. A “delta” Sitemap that contains all pages that changed within a configurable time window.
(When we speak of “Sitemap” we mean the “Sitemap Index” and the actual XML Sitemaps themselves.)
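For readers unfamiliar with the sitemaps.org format, the following is a minimal sketch of what a Sitemap Index and a Sitemap with <lastmod> and <priority> values look like when built programmatically. It is not the plugin's code; the function names and the example.com URLs and file names are hypothetical and only illustrate the structure.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_urlset(entries):
    """Build a <urlset> element from (loc, lastmod, priority) tuples."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, lastmod, priority in entries:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc
        ET.SubElement(url, "lastmod").text = lastmod
        ET.SubElement(url, "priority").text = f"{priority:.1f}"
    return urlset


def build_index(sitemap_locs):
    """Build a <sitemapindex> element that points at individual Sitemaps."""
    index = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for loc in sitemap_locs:
        sitemap = ET.SubElement(index, "sitemap")
        ET.SubElement(sitemap, "loc").text = loc
    return index


if __name__ == "__main__":
    # Hypothetical file names for a comprehensive and a delta Sitemap.
    index = build_index([
        "http://www.example.com/sitemap_full.xml",
        "http://www.example.com/sitemap_delta.xml",
    ])
    urlset = build_urlset([("http://www.example.com/", "2010-04-20", 0.5)])
    print(ET.tostring(index, encoding="unicode"))
    print(ET.tostring(urlset, encoding="unicode"))
```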
Configurable URL Parameter Handling
By default, query string parameters that are seen by the plugin are not added to the Sitemap URLs automatically. However, if your site uses query parameters to uniquely identify content, you can easily add each significant parameter to the configuration file called normalization.txt.
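To make the idea concrete, here is a minimal sketch of this kind of URL normalization: all query parameters are dropped unless they are listed as significant. It does not reproduce the plugin's implementation or the file format of normalization.txt; the function name `normalize_url` and the parameter names `id` and `sessionid` are hypothetical.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def normalize_url(url, significant_params):
    """Return url with only the query parameters named in significant_params.

    Any fragment (#...) is dropped as well, since it does not identify
    a separate document on the server.
    """
    parts = urlsplit(url)
    kept = [
        (name, value)
        for name, value in parse_qsl(parts.query, keep_blank_values=True)
        if name in significant_params
    ]
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


if __name__ == "__main__":
    url = "http://www.example.com/item.aspx?id=42&sessionid=abc"
    print(normalize_url(url, set()))    # default behavior: no parameters kept
    print(normalize_url(url, {"id"}))   # "id" declared significant
```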
Link rel=Canonical Handling
If your pages use link rel=canonical, then the plugin will use the URL value from <link rel="canonical">, when found in the page's HTML source, as the URL to add to the XML Sitemap. (Note: parameters that are part of the canonical URL will not be dropped, regardless of whether they are listed in normalization.txt.)
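As an illustration only, the sketch below shows one way to pick the canonical URL out of a page's HTML and fall back to the requested URL when none is present. It uses Python's standard `html.parser` module; the class and function names are hypothetical, and the plugin's own parsing logic may differ.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Records the href of the first <link rel="canonical"> tag it sees."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.canonical is not None:
            return
        attributes = dict(attrs)
        # rel may hold several space-separated tokens, in any letter case.
        rel_tokens = (attributes.get("rel") or "").lower().split()
        if "canonical" in rel_tokens and attributes.get("href"):
            self.canonical = attributes["href"]


def sitemap_url_for(requested_url, html):
    """Prefer the page's canonical URL; otherwise keep the requested URL."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical or requested_url


if __name__ == "__main__":
    page = (
        '<html><head><link rel="canonical" '
        'href="http://www.example.com/product?id=7"></head></html>'
    )
    print(sitemap_url_for("http://www.example.com/product?id=7&ref=home", page))
```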
How does the plugin handle redirects?
The plugin only adds pages that return HTTP status 200 to the Sitemap. However, when you redirect a page that previously returned 200, it will still show up in the Sitemap for some time, until the configured time decay has passed (this is configurable using the VisitTimeoutSec setting).
The plugin also generates a Sitemap file that only contains requests that resulted in a 404 on the site. The location of this file is currently commented out in the Sitemap index file because it should not be seen as a regular XML Sitemap, but it can be used by the search engine to inform the index that these URLs are no longer valid.
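The time decay for redirected pages described in this section can be sketched as follows. This is a simplified model, not the plugin's code: it assumes we track, for each URL, the last time it was seen returning HTTP 200, and it keeps a URL in the Sitemap until that moment is older than the timeout. The function name and the data layout are hypothetical; the 2592000 second value is the documented default of 30 days.

```python
# Documented default for the time decay: 2592000 seconds (30 days).
DEFAULT_VISIT_TIMEOUT_SEC = 2592000


def select_sitemap_urls(last_ok_seen, now, visit_timeout_sec=DEFAULT_VISIT_TIMEOUT_SEC):
    """Return the URLs that would still be listed in the Sitemap.

    last_ok_seen maps each URL to the epoch time (in seconds) at which it
    was last seen returning HTTP 200. A URL that now redirects stops being
    refreshed, so it drops out once the timeout has elapsed.
    """
    return sorted(
        url
        for url, seen_at in last_ok_seen.items()
        if now - seen_at <= visit_timeout_sec
    )


if __name__ == "__main__":
    now = 1_300_000_000
    history = {
        "/still-live": now - 3600,                     # served 200 an hour ago
        "/redirected-last-week": now - 7 * 86400,      # within 30 days: kept
        "/redirected-long-ago": now - 31 * 86400,      # older than 30 days: dropped
    }
    print(select_sitemap_urls(history, now))
```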
The Plugin Honors Robots.txt Rules
The XML Sitemap Plugin honors the robots.txt rules you define in your site’s robots.txt in the sense that it will not add blocked URLs to the XML Sitemap.
More flexibility of what gets added: Disallow.txt
For added flexibility you can also add specific disallow rules for the Bing XML Sitemap plugin, using the same syntax as robots.txt in the file disallow.txt. Any disallow rules there will be observed by the plugin in the sense that pages matching the disallowed URL patterns will not be added to the XML Sitemap.
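The combined effect of robots.txt and disallow.txt can be sketched with Python's standard `urllib.robotparser` module, as shown below. This is an illustration rather than the plugin's implementation: the function name `make_url_filter` and the sample rules are hypothetical, and the sketch assumes that disallow.txt contains a `User-agent: *` line like a regular robots.txt file would.

```python
from urllib.robotparser import RobotFileParser


def make_url_filter(robots_lines, disallow_lines):
    """Return a function telling whether a URL may be added to the Sitemap.

    A URL is accepted only if neither rule set blocks it.
    """
    robots = RobotFileParser()
    robots.parse(robots_lines)
    extra = RobotFileParser()
    extra.parse(disallow_lines)

    def allowed(url):
        return robots.can_fetch("*", url) and extra.can_fetch("*", url)

    return allowed


if __name__ == "__main__":
    allowed = make_url_filter(
        ["User-agent: *", "Disallow: /private/"],   # sample robots.txt rules
        ["User-agent: *", "Disallow: /drafts/"],    # sample disallow.txt rules
    )
    for url in (
        "http://www.example.com/index.html",
        "http://www.example.com/private/report.html",
        "http://www.example.com/drafts/post.html",
    ):
        print(url, "->", "add" if allowed(url) else "skip")
```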
Automatic <priority>: How it is calculated
Since the plugin works off of traffic-based signals, it uses these to help establish a priority value for a given URL in the XML Sitemap. Based on the premise that 0.5 represents the average priority and that more important URLs should be awarded a priority higher than 0.5, the currently used formula to calculate priority is as follows:
priority(URL) = visit count for the URL / (2 × average visit count for the host site)
In other words: the priority of the page is the visit count for the URL divided by 2 times the average visit count for the entire host site. A URL with exactly average traffic therefore receives a priority of 0.5.
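The calculation described above can be worked through in a few lines of code. The sketch below follows the stated rule (visit count divided by twice the host average); the function name and sample data are hypothetical. Capping the result at 1.0 is our assumption, made because sitemaps.org only allows priority values between 0.0 and 1.0, and is not something the description above specifies.

```python
def compute_priorities(visit_counts):
    """Map each URL to visit_count / (2 * average visit count for the host).

    visit_counts maps URL -> number of visits. Results are capped at 1.0
    (an assumption, see the lead-in) so they stay in the valid range.
    """
    if not visit_counts:
        return {}
    average = sum(visit_counts.values()) / len(visit_counts)
    if average == 0:
        # No traffic at all: avoid dividing by zero.
        return {url: 0.0 for url in visit_counts}
    return {
        url: min(1.0, count / (2 * average))
        for url, count in visit_counts.items()
    }


if __name__ == "__main__":
    # Average visit count here is (100 + 50 + 0) / 3 = 50.
    counts = {"/popular": 100, "/average": 50, "/unvisited": 0}
    for url, priority in compute_priorities(counts).items():
        print(f"{url}: {priority:.2f}")
```

With this sample data the page with average traffic lands on 0.5, matching the premise that 0.5 represents the average priority.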
Additional Configuration Options in Config.ini
The plugin allows for additional configuration post installation. Below are the different settings that you can influence.
This setting points to the path wwwroot of the site. This value should generally not be changed.
This setting points to the path of the content folder for the host. This value generally should not be changed.
Determines if the XML Sitemap is published to wwwroot (1) or not (0).
Determines if a sitemap: directive should be written to the host site’s robots.txt file.
Determines if the plugin should decompress compressed pages (1) or not (0). Default = 1. Since we need to decompress pages to calculate a signature, do not change this to 0 in production environments.
Determines if the plugin can decompress compressed pages. Default = 1
Determines if the search engines should be informed (1) or not (0) using a ping to their respective services. Currently the plugin can ping Google and Bing.
Determines if the plugin can write additional debugging information to the Sitemap that helps Bing improve the plugin (1) or not (0).
Maximum file size for the snapshot file used to generate the XML Sitemaps. Default is 4096MB.
Maximum amount of memory the plugin is allowed to use. Default is 32MB.
Number of hours between Sitemap generations. Default is the recommended value of 24 (once per day).
Minimum amount of free disk space to keep. When disk space is smaller than the configured value, the Sitemap Plugin cleans up its temporary data.
Maximum amount of memory used by the merge service to merge the Sitemaps in a multi-server scenario. Default is 512MB.
Longest period of time in seconds a URL remains in the Sitemap without seeing any traffic. Default is 2592000, which equates to 30 days.
Examples of XML Sitemaps Generated by the Bing XML Sitemap Plugin
Our partners at Microsoft use the Bing Sitemap technology, too. Two of our largest properties, MSDN and TechNet, which both contain millions of pages, have been using the technology to generate their Sitemaps since Beta, side-by-side with their prior systems. (Note: You can see a working sample at http://msdn.microsoft.com/sitemaps2/trafficbasedsspsitemap.xml.) These revisions, and this tool overall, should give websites of any size a much easier way to build and maintain trustworthy Sitemaps.
…and since you’ve made it this far, why not check out the Bing XML Sitemap Plugin Downloads page now!