Architecting content for SEO (SEM 101)

It wasn’t that long ago that I discussed in this blog how to create good content that will get your site noticed by both end users and search engines. But to be clear, just writing some slick text is not the whole story. The previous blog articles in the Site Architecture and SEO series (files/pages and links/URLs) made reference to doing what you can to help the search engine web crawler (also known as a robot or, more simply, a bot) crawl your site. This article is part 3 of the series.

Let’s take a look at what you can do to optimize your website’s content from an architecture point of view.

Set up your content pages like an org chart

When planning the content design for your website, organize it in a broad-to-specific flow of information with an emphasis on shallow paths. Your home page is the introduction to your content. Unless your site is tiny, save the deep dives in content for other pages. Instead, introduce your content theme, provide basic overview information, and present the information navigation scheme for the site. Thinking in terms of building an organizational chart of content will help you “bucketize” your content into logical groupings and landing pages.

However, this information categorization effort can go in the wrong direction if you are not careful. Instead of a long drill-down from the home page through folder after folder to get to the good information you offer, keep the content organization closer to the surface. You don’t need to plunge deep as you get into detail. Instead of going vertical, expand your content horizontally. Stay shallow, using many first- and second-level directories instead of burying your content in deep silos. This pattern of information flow will help users more easily find what they want to see and help the search engine bot crawl the information on your site.
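
For a rough sketch of the difference (the domain, folder names, and product theme here are made up for illustration), compare a deeply buried product page with a shallow one:

    Deep (avoid):     http://www.example.com/products/kitchen/cutlery/knives/chef/8-inch/index.htm
    Shallow (better): http://www.example.com/chef-knives/8-inch-chef-knife.htm

In the shallow layout, the product page sits only a click or two from the home page, so both users and the bot reach it with less effort.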

Use <strong> tags to highlight keywords in body text

We’ve already discussed how to use keywords in your pages (in <title> tags, meta tags, heading tags, anchor tags, and the like). This selective placement of certain words emphasizes the value of those words as descriptors of the content on your page. But what about using keywords in body text? Yeah, it is still important to judiciously sprinkle your keywords in your body text. That helps the reader understand the content of your page, and a modest (and I do stress modest!) repetition of those keywords from the aforementioned strategic areas in the body text reinforces their value.

But is that the only way to stretch your keyword dollar? Well, if you encapsulate your body text keywords with <strong> tags, this will add a bit more emphasis to the relevance of those words relative to the theme of the page. It’s not a big emphasis, mind you, but sometimes a lot of the little things can add up in value. After all, a pocket full of dimes can be worth more than a $5 bill!
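
As a quick sketch of what that looks like (the cutlery copy here is just a made-up example), wrap the body-text keyword in <strong> tags:

    <p>Our hand-forged <strong>chef's knives</strong> are ground from
    high-carbon steel and honed to a razor edge.</p>

Use this sparingly, on the handful of keywords that genuinely describe the page, not on every other phrase.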

Ensure text content is not shown via scripts, Flash, or images

Make sure the bot can read your content. Text is easy to read for any browser, even a reader as simple and as strict about formatting as a search engine bot. However, bots are not human. They can see references to images and animation files, but they are hard pressed to read that content. The same goes with scripts. Bots can see the <script> tags, but since bots don’t execute the script on the page, any page element that requires the script to run before it can be displayed will be missed by the bot.
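
Here’s a small sketch of the problem (the promotional copy is hypothetical): both snippets look identical to a human visitor, but only the second one is visible to a bot that doesn’t run scripts.

    <!-- The bot sees the <script> tag, but never the text the script writes -->
    <script type="text/javascript">
      document.write("<p>Free shipping on all chef's knives this week!</p>");
    </script>

    <!-- The bot can read this text directly -->
    <p>Free shipping on all chef's knives this week!</p>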

Folks who fill up their websites with images of pretty, stylized text are making a mistake if they also want to be effectively crawled by bots and indexed by search engines. Let’s say you run a website that sells kitchen knives. Your text content about your products is read by the bot and indexed by the search engine. But an image of a chef’s knife has no indexable meaning to a bot. And neither does a banner graphic containing the words “Matt’s Cutlery Shop” splashed across the page.

Images are dead zones to a bot—there’s no information there to collect. If you want the text on your site to be indexed, you should put it in the page as text, plain and simple. And for the times that you do use images, you need to use ALT attributes with the <img> tag to describe the content of the image (sprinkle some of those magic keywords here if they are relevant!). That will at least associate the image on the page with those keywords, even if the image itself is not indexed.
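
A minimal sketch of the difference (the file name and ALT text are hypothetical):

    <!-- Nothing here for the bot to index -->
    <img src="chef-knife.jpg" />

    <!-- The ALT text gives the bot something to associate with the page -->
    <img src="chef-knife.jpg" alt="8-inch high-carbon steel chef's knife" />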

If your site uses animation technology, such as Silverlight or Flash, there’s good news and bad news. First the good news: bots are getting better and better at extracting text content from these sophisticated presentation technologies. But now for the bad news: it’s still a hit-or-miss game (frankly more miss than hit), and relying on these technologies is ultimately not a good bet for SEO.

No one is saying to never use animations. The key strategy for using Silverlight and Flash on your website is to make them elements of a page rather than the entirety of your page content. That way, you can still include easily crawlable text and strategic tags for keyword usage. Try to use these technologies for presenting images instead of text. That way you still get the “wow” factor they offer to users, and your pages remain fully available to be crawled and indexed by the bots.
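
A rough sketch of that layout (the file name and fallback copy are made up), with the animation as just one element of an otherwise text-based page:

    <h1>Matt's Cutlery Shop</h1>
    <p>Hand-forged chef's knives, paring knives, and carving sets.</p>

    <!-- The animation is one element; the text above stays crawlable -->
    <object type="application/x-shockwave-flash" data="knife-demo.swf"
            width="400" height="300">
      <p>Video: our 8-inch chef's knife slicing a tomato paper thin.</p>
    </object>

The paragraph inside the <object> tag doubles as fallback text for browsers that can’t display the animation.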

For what it’s worth, if you do put some text in Silverlight and Flash content, be strategic about it. Sprinkle relevant keywords there as well. Who knows? Maybe the bots will be able to make heads or tails of the animation (as I said, things are improving in that area). At a minimum, it will reinforce the association between those keywords and your content for the user.

Avoid redundant content between pages

This is often a problem for sites that have affiliate sellers of their products and use dynamic URLs to determine which affiliate’s logo to show or which affiliate to credit for the inbound traffic. But occasionally folks copy content between pages, and as Martha Stewart might say, “This is not a good thing.”

Basically, redundant content is recognized by the search engine during indexing, and often redundancies are eliminated from the index, or at least they are removed from the search engine results pages (SERPs) for relevant queries. It’s not likely you want your content to be removed in a de-duplication process when a user searches on keywords for which you’re most relevant, so make sure your pages don’t duplicate content between them.

Also avoid duplicating someone else’s content. Plagiarism is never good, so write your own thoughts on your subject of interest. Search engine indexes hold a lot of information, and that makes detecting duplicates and plagiarism quite feasible. You don’t want to be penalized for stealing someone else’s content, do you? But I bet you wouldn’t shed any tears if someone else were penalized for ripping off your content expertise and presenting it as their own, would you? When you find great and relevant content on a third-party site, just link to it! That’s how it’s done properly, and that is, indeed, a good thing.

Organize your content with header tags

Organize your content on your page. Think in terms of writing a theme paper back in school. For each page, start off with a big idea. Then break the idea into smaller parts and develop them. Introduce the big idea with an <h1> tag. Use <h2> tags to introduce the smaller parts. Don’t worry about the default formatting that comes with these tags – you can use CSS to change that.

The key here is that the <h1>, <h2>, and deeper tags (all the way down to <h6>) are regarded by the bot as more like XML than HTML in that they describe the data they contain (HTML merely focuses on the display of the information). The bot responds to this content organization by attributing the words used in the heading tag text as relevant keywords for that section. Take advantage of that concept to strengthen your keyword relevance for the bot to pick up.

Keep in mind the following tips for using header tags:

  • Use only one <h1> tag per page. No, you won’t be considered web spam if you use two, but you diminish the value of the <h1> tag if you use more than one (after all, there is supposed to be only one big idea per page, right?).
  • Under the <h1>, feel free to use one or more <h2> and deeper tags per page. Use those <h2>–<h6> tags to help the bot discern the priority of content on the page.
  • Use keywords within header tags, but don’t copy the text of the title tag (that doesn’t help advance any specific ideas, and thus it’s a lost opportunity for more keyword affinity).
  • Limit header tag text to 150-200 characters.
  • And don’t use header tags on images (again, because the bot can’t read images, this is a lost keyword opportunity) – stick with text.
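
Pulling those tips together, here’s a small sketch of how a page might be organized (the knife topic and the CSS values are purely illustrative):

    <style type="text/css">
      h1 { font-size: 1.6em; }  /* tame the default heading formatting */
      h2 { font-size: 1.2em; }
    </style>

    <h1>Choosing a Chef's Knife</h1>
    <p>Overview copy introducing the big idea of the page...</p>

    <h2>Blade Materials</h2>
    <p>Details on high-carbon versus stainless steel...</p>

    <h2>Handle and Balance</h2>
    <p>Details on grip, weight, and balance...</p>

One <h1> for the big idea, a handful of <h2> tags for the supporting sections, and relevant keywords in the heading text itself.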

Make page copy actionable and unique

When you write the content for your pages, always think about the reader. Why are they going to be interested in this? What will motivate them to come back or, better yet, link to your page? If you make your content compelling, you’ll earn fans. And to do that, you need to make your content actionable. Make it interesting, give your readers something they can do, and good things will happen.

Don’t use hidden text or links

Never place hidden content on your website. That includes things like text and links that are formatted to be shown in the same font color as the background and/or in the tiniest possible font sizes. Search engines have seen this trick played out time and again by folks with the nefarious goal of misleading search engines and, by extension, end users. As a result, these techniques are considered the behavior of web spammers.
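
For illustration only (this is exactly the kind of markup to avoid), hidden keyword text often looks something like this:

    <!-- Keyword text in the same color as the page background, in a tiny font.
         This is the kind of markup search engines flag as web spam. Don't do this. -->
    <p style="color: #ffffff; background-color: #ffffff; font-size: 2px;">
      cheap knives discount knives best knives free knives
    </p>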

Don’t allow your legitimate site to be penalized because it used web spammer techniques. The search engines can detect when such techniques are used and discount any relevance value the hidden keyword text might otherwise have added, so it doesn’t even work, anyway. And the ensuing penalties imposed on sites employing this practice actually make this a very damaging technique. That’s the antithesis of good SEO!

We’re now cresting the hill but not yet finished with advice on how to make your site architecture more robust for SEO. Next up in our site architecture series: <head> tag optimizations. If you have any questions, comments, or suggestions, feel free to post them in our SEM forum. Until next time…

— Rick DeJarnette, Bing Webmaster Center