The liability of loathsome, link-level web spam (SEM 101)

When I was a kid in high school, I used to go to the public library and do initial research in the Encyclopedia Britannica (yes, the bound book editions. I also remember black & white television with vacuum tubes and rotary telephones! Sheesh, I’m getting old!). I would pick up the index volume that contained the keyword I wanted to look up to identify which of the main volumes had the content I sought. But imagine this: when I opened...

The pernicious perfidy of page-level web spam (SEM 101)

In the exciting world of today’s Internet, where the world’s information is literally at your fingertips, where you can endlessly communicate, shop, research, and be entertained, spam is a big downer. The unwanted email spam that fills our inboxes also consumes huge portions of the available bandwidth of our routers and trunk lines. But email is not the only spam game in town. Web spam is the bane (well, one of the banes) of the...

Robots speaking many languages

We’ve already covered in past blog articles some of the basics about how webmasters can use a file called robots.txt to control how search engine crawlers (aka bots) crawl their websites. But there is so much more to talk about with bots. So let’s take a bit of a deeper dive into the subject. Topic 1: Using the proper text file encoding. The robots.txt file is used by webmasters to either specifically define which files and directories...

Fixing 404 File Not Found frustrations (SEM 101)

You’ve seen it. So have I. Nearly every person who has actively browsed the Web for more than 15 minutes has seen it. I’m talking about the dreaded 404 File Not Found error. When it occurs, users simply abandon their search on that site and go elsewhere. That’s a potential lost sale, subscription, or download opportunity (aka conversion) for the affected site! It has been estimated that up to 10% of traffic to large websites on...

Prevent a bot from getting “lost in space” (SEM 101)

We recently published a non-SEM 101 blog post on controlling the crawl rate of MSNBot, the Bing web crawler (aka robot, or simply just bot). That got me thinking about robots. Naturally, that led to The Robot on Lost in Space. Will Robinson, the show’s precocious youngster who was a whiz at 1960s-style, clunky electronics (even though the show was supposedly set in 1997!), was best friends with The Robot. They looked out for each other and...

Uncovering web-based treasure with Sitemaps (SEM 101)

Have you ever noticed how pirate treasure maps are like Sitemaps? While your website may not contain a treasure of gold and silver (unless it’s a metals commodities trading site!), if you have good content, that is certainly treasure to someone who is looking for it. Unfortunately, it’s buried on your website and no one knows what’s there except you! But since you want to share your site’s treasure with others, you need to...

Crawl delay and the Bing crawler, MSNBot

Search engines, such as Bing, need to regularly crawl websites not only to index new content, but also to check for content changes and removed content. Bing offers webmasters the ability to slow down the crawl rate to accommodate web server load issues. The use of such a setting is not always needed nor is it generally recommended, but it is available for use by webmasters should the need arise. Websites that are small (page-wise) and...

Common errors that can tank a site (SEM 101)

Imagine being a content developer for a website. You write a bunch of clever and informative articles, which should deliver a good dose of new visitors and ranking potential to the site. You submit them to the IT department for publishing online, and wait for good things to happen. But instead, it all falls flat. A look at your web analytics tools reveals that the number of site visitors has not increased over the time your new material was...

Images and Flash and script, oh my! (SEM 101)

The old adage, once attributed to a Chinese proverb, tells us, “A picture is worth a thousand words.” But John McCarthy, a noted mathematician and computer science pioneer, stated conversely, “As the Chinese say, 1001 words is worth more than a picture.” Perhaps McCarthy was also a pioneer of search engine optimization (SEO)! The use of interesting and relevant images and animations on a website can certainly elevate...

Heads up on <head> tag optimization (SEM 101)

Much of what constitutes a well-architected webpage is never displayed in the page itself. The contents of the <body> tag are what you see in a browser. But a webpage consists of two major elements, the <body> tag being only one. The content of the <head> tag (and, for that matter, the document type declaration (DTD), which precedes the <head> tag in the page’s code) is just as important for search engine optimization ...