Bing Webmaster Blog - Posts tagged with 'Index and crawling' - Page 2

Webmaster Center blog comments Q&A, Round 3

The Bing Webmaster Center team has been very busy lately, working on very cool stuff that we can’t wait to share with you (patience, Grasshopper – all will be revealed in time). But the blog waits for no one (well, that’s the intent, anyway). From time to time, we gather up enough interesting tidbits of Q&A that we want to share with all of our blog readers. Now it’s that time again. So let’s get to it. Q: I...

The liability of loathsome, link-level web spam (SEM 101)

When I was a kid in high school, I used to go to the public library and do initial research in the Encyclopedia Britannica (yes, the bound book editions; I also remember black-and-white televisions with vacuum tubes, and rotary telephones! Sheesh, I’m getting old!). I would pick up the index volume that contained the keyword I wanted to look up to identify which of the main volumes had the content I sought. But imagine this: when I opened...

The pernicious perfidy of page-level web spam (SEM 101)

In the exciting world of today’s Internet, where the world’s information is literally at your fingertips, where you can endlessly communicate, shop, research, and be entertained, spam is a big downer. The unwanted email spam that fills our inboxes also consumes huge portions of the available bandwidth of our routers and trunk lines. But email is not the only spam game in town. Web spam is the bane (well, one of the banes) of the...

Eggs, bacon, spam, spam, and spam (SEM 101)

What is spam? One could argue that spam is a multi-faceted thing. The word itself has many definitions. For example, it can be defined as a processed food product of spiced ham and pork, slathered with a gelatinous glaze and sold in a tin (it’s apparently very popular in Hawai’i, don’t you know?). However, spam is also often used to reference a very popular comedy sketch written and performed on Monty Python’s Flying Circus (it...

Webmaster Center blog comments Q&A, Round 2

We still get many questions in our blog comments, even though we try to encourage our readers to post their questions to our Webmaster Center forums (which are actually staffed to answer your questions!). I do look through the blog comments every day and delete those that are junk (those that are empty, duplicated, offensive, or overtly spammy – see our Q&A reply on why blog comments are deleted in the first Webmaster Center blog Q&A...

Robots speaking many languages

We’ve already covered in past blog articles some of the basics about how webmasters can use a file called robots.txt to control how search engine crawlers (aka bots) crawl their websites. But there is so much more to talk about with bots. So let’s take a bit of a deeper dive into the subject.

Topic 1: Using the proper text file encoding

The robots.txt file is used by webmasters to either specifically define which files and directories...
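For readers who want to see the basic idea in action, here is a minimal sketch of how a robots.txt file tells a crawler which paths are off limits. It uses Python's standard `urllib.robotparser` module, not anything from Bing itself. The file contents, the `/private/` and `/tmp/` paths, the `example.com` URLs, and the `build_parser` helper are all made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; the paths are illustrative only.
ROBOTS_TXT = """\
User-agent: msnbot
Disallow: /private/

User-agent: *
Disallow: /tmp/
"""


def build_parser(robots_text: str) -> RobotFileParser:
    """Parse robots.txt text held in memory (no network access needed)."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return parser


if __name__ == "__main__":
    parser = build_parser(ROBOTS_TXT)
    # Ask whether a given user agent may fetch a given URL.
    for agent, url in [
        ("msnbot", "http://example.com/private/page.html"),
        ("msnbot", "http://example.com/index.html"),
        ("otherbot", "http://example.com/tmp/cache.html"),
    ]:
        print(agent, url, parser.can_fetch(agent, url))
```

Real crawlers may interpret some directives differently from this parser, so treat the output as a rough check rather than a guarantee of how any particular bot will behave.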

MSNBot 1.1 is retired

The Bing team has been talking about its new crawler (aka bot), MSNBot 2.0b, in this blog for quite some time now. We have made numerous improvements in its performance, addressed some webmaster concerns, and published detailed information on how to control the bot with a robots.txt file. Today we are announcing that the new bot is fully operational. This development will enable Bing to do a better job at gathering the information we need from the...

Webmaster Center blog Q&A

We’ve been really busy here at the Bing Webmaster Center blog team, pumping out new content on a regular basis to build a nice library of articles on issues that matter to webmasters and online publishers. I thought I’d take a moment to catch my breath, pause on creating a new thematic article (or yet another multi-part series!) for SEM 101, and address some commonly asked questions in the blog comments. Q: Why wasn’t my...

Prevent a bot from getting “lost in space” (SEM 101)

We recently published a non-SEM 101 blog post on controlling the crawl rate of MSNBot, the Bing web crawler (aka robot, or simply just bot). That got me thinking about robots. Naturally, that led to The Robot on Lost in Space. Will Robinson, the show’s precocious youngster who was a whiz at 1960s-style, clunky electronics (even though the show was supposedly set in 1997!), was best friends with The Robot. They looked out for each other and...

Uncovering web-based treasure with Sitemaps (SEM 101)

Have you ever noticed how pirate treasure maps are like Sitemaps? While your website may not contain a treasure of gold and silver (unless it’s a metals commodities trading site!), if you have good content, that is certainly treasure to someone who is looking for it. Unfortunately, it’s buried on your website and no one knows what’s there except you! But since you want to share your site’s treasure with others, you need to...