Making links work for you (SEM 101)

Links can be the lifeblood of a good website, as we discussed in Part 1 and Part 2 of Links: the good, the bad, and the ugly. But how well you manage them from a site architecture perspective can be the difference between a website that is starved for oxygen (aka search engine referral traffic) and one that is healthy and thriving. That’s why we do search engine optimization (SEO).

This article is part 2 of the recent Site Architecture and SEO series. In the first article, I discussed how site architecture is related to making your site easier to crawl. By knowing what the search engine web crawler (also known as a robot or, more simply, a bot) needs to efficiently and effectively do its job, you can grease the skids to getting more of your content in the index.

Let’s take a look at what you can do to optimize your website’s URLs and the links contained within your pages.

Canonicalize your home page URL

We’ve talked about canonicalization and its relevance to large site issues in this blog a few times in the past. This series of articles isn’t the place for another deep dive on the subject, but the importance of the concept does bear repeating.

Search engines apply page rank to URLs. That makes sense, right? But check this out:

  • mysite.com
  • www.mysite.com
  • mysite.com/
  • www.mysite.com/
  • mysite.com/default.htm
  • www.mysite.com/default.htm
  • www.yourhostprovider.com/~mysite
  • www.mysite.com/en/us/

All of these URLs almost certainly point to the same default.htm page at the domain mysite.com. However, because each URL is structured differently, search engines consider each one to be unique. As a result, the “link juice” (the ranking value) that search engines attribute to the page is split across the variant URLs, even though they all lead to the same page, which dilutes what is attributed to the site overall. Canonicalization is the way to express which single URL form you want to be used for your website’s home page. Once you have chosen it, use that URL form religiously in your internal linking.

But what about inbound links? You can’t reliably control how other folks structure their links to you, right? Well, there’s a trick for that. Recall that in the first site architecture post I talked about setting up 301 redirects for moved pages. You can also set up 301 redirects for all possible permutations of your site’s home page URL and point them to the canonicalized URL. Once search engines begin hitting the 301s, they will attribute the link juice formerly attributed to each variant URL to your canonicalized URL. Use the 301s to funnel all of the link juice you have earned into elevating the rank of your canonicalized URL instead of leaving it spread out over several URL variations.
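As a rough illustration of the redirect logic described above, here is a minimal Python sketch. It only decides which requests should receive a 301; it does not configure a web server. The canonical form, the host names, and the default file name are assumptions taken from the example list, and home_redirect is a hypothetical helper name. Variants such as the host provider address or /en/us/ would need their own entries.

```python
from urllib.parse import urlsplit

# Assumed canonical form for this sketch; choose whichever form you prefer.
CANONICAL_HOME = "http://www.mysite.com/"

# Assumed host names and default-document paths that all serve the home page.
HOME_HOSTS = {"mysite.com", "www.mysite.com"}
HOME_PATHS = {"", "/", "/default.htm"}


def home_redirect(url):
    """Return (301, canonical URL) for a non-canonical home page variant, else None."""
    # Bare forms such as "mysite.com" carry no scheme, so add one before parsing.
    if "://" not in url:
        url = "http://" + url
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    is_home = host in HOME_HOSTS and parts.path in HOME_PATHS and not parts.query
    if is_home and url != CANONICAL_HOME:
        return 301, CANONICAL_HOME
    return None
```

Requests that already use the canonical form, or that target some other page, return None and are served normally.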

Choose between relative vs. absolute links

To further emphasize your canonicalized URL, always use absolute URLs for internal links. What does this mean? It’s simple. Use the entire URL to point to the linked page rather than a file address that is relative to the home page of your site. For example, if your home page links to a page named contactus.htm stored in a directory named media, format the value for the href attribute to use the page’s full URL, as in 

<a href="http://www.mysite.com/media/contactus.htm">Contact us</a>

instead of the shorter, relative directory reference used for the page, as in

<a href="media/contactus.htm">Contact us</a>

The use of absolute links reinforces the use of your full URL and, like canonicalization, focuses the link juice to that URL. Because of this, be sure to use absolute links in your intra-site navigation scheme, which is the most often used mechanism for accessing your internal pages. External, inbound links have to do this to reach anything other than your home page. Why not contribute a little bit of your own effort in adding to the link juice for your internal webpages?

Another reason to use absolute URLs is as a guard against theft. No, not theft of your nice, new 24” monitor, but of your more valuable site content. It is a sad fact of life on the Web that there are lazy folks out there who will simply copy and paste someone else’s content into their website. If your inline links are absolute, the links in the plagiarized copy will most often still take the reader back to the source: your site!

In the end, using relative links is not a bad thing. Not at all. It’s just that absolute links are a better choice for SEO.
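If your pages currently use relative links, the conversion can be scripted. The following is a minimal sketch that uses Python’s standard urljoin function; the base URL and the absolutize helper name are assumptions for illustration only.

```python
from urllib.parse import urljoin

# Assumed canonical home page URL for this sketch.
BASE = "http://www.mysite.com/"


def absolutize(href, page_url=BASE):
    """Resolve an href found on page_url into a full, absolute URL."""
    # urljoin leaves hrefs that are already absolute unchanged.
    return urljoin(page_url, href)


print(absolutize("media/contactus.htm"))
```

Pass the URL of the page that contains the link as page_url so that references such as ../index.htm resolve against the right directory.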

Use proper URL syntax in the anchor tag

When referencing URLs in the href attribute of your anchor tags, following these standard URL formats will optimize the link for SEO. Being consistent about this is especially important for internal links to your site’s canonical URLs.

  • For URLs that point to the default or index file of a directory, omit the default file name and instead end the URL with a folder name, always followed by the trailing “/” (as in http://www.mysite.com/). This holds even for default file name URLs that use dynamic parameters (as in http://www.mysite.com/?var=1)
  • For URLs that are not the default file for a folder, it is fine to include the file name in the full URL, even if there is no dynamic parameter (as in http://www.mysite.com/article.htm)
  • For URLs that include ampersands (typically used between sets of dynamic attributes), substitute the equivalent escape code &amp; for the single ampersand character (&) to enable the page to pass HTML validation checks (as in http://www.mysite.com/?var=1&amp;var=2)
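The three rules above can be applied mechanically when generating href values. This is a minimal Python sketch, not a complete URL normalizer; the list of default file names and the href_value helper name are assumptions.

```python
from html import escape
from urllib.parse import urlsplit, urlunsplit

# Assumed default document names; adjust to match your server configuration.
DEFAULT_DOCS = {"default.htm", "index.htm", "index.html"}


def href_value(url):
    """Format a URL for use inside an href attribute."""
    parts = urlsplit(url)
    directory, _, filename = parts.path.rpartition("/")
    path = parts.path
    if filename.lower() in DEFAULT_DOCS:
        # Drop the default file name and keep the trailing slash.
        path = directory + "/"
    elif path == "":
        # A bare host name gets the trailing slash as well.
        path = "/"
    cleaned = urlunsplit((parts.scheme, parts.netloc, path, parts.query, parts.fragment))
    # Escape "&" as "&amp;" so the markup passes HTML validation.
    return escape(cleaned, quote=True)


print(href_value("http://www.mysite.com/default.htm?var=1&var=2"))
```

URLs that name a non-default file, such as http://www.mysite.com/article.htm, pass through unchanged.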

Use title attribute in anchor tags for internal links

Of course, if you’ve been keeping up with this column, you already know to use relevant keywords and phrases in the anchor text you write for your internal links. This helps search engines develop relevance between those keywords and the page the link references. To further develop keyword relevance for those pages, also add the title attribute to your anchor tags. An example might go something like this:

<a href="http://www.mysite.com/newpage.htm" title="keyword or key phrase describing the linked page">Keyword or phrase about the content of the linked page</a>

Think of the anchor text as your primary description of the linked page. But if you do inline linking within the paragraphs of your body text, you need to maintain the natural, logical flow of the language in the paragraph, which can limit your link text description. In that case, you can use the title attribute to add keyword information about the linked page without adversely affecting the readability of the text for the end user.

Identify the canonical URL for each page

We’ve already talked a little here about canonicalization and how that works for your home page. But what about other pages that can have multiple URLs? This is commonly a problem with sites that employ dynamic URLs. A large number of varying dynamic parameters can be applied to a URL that all ultimately go to the same page. But if you want to help the search engines determine which should be the canonical URL for a given page, even when that differs from the URL you are using in the links to that page, you can use the <link> tag within the <head> section of the page to identify the canonical URL for that page to the bot. An example looks like this:

<link rel="canonical" href="http://www.mysite.com/products.aspx?item=doodad" />

This will apply even when there are no links on your site that actually use that URL (assuming all of the internal links to that page are employing additional dynamic parameters). Just be sure the URL listed as canonical actually resolves!

The rel="canonical" link is a relatively new feature for search engines, and while we are ramping up support for it, think of this data as more of a hint than a directive to us.

Minimize the number of parameters in dynamic URLs

While we are talking about URLs with dynamic parameters, be aware that these can become problems for bots that want to crawl your site. Dynamic URLs are often used by affiliate sites to brand certain product pages that are otherwise identical content-wise, and the bots will pick up on the duplication. Indexing that data will often result in duplicates being removed, and the version you wish to keep may not be the one that is retained in the search engine index. Minimize the use of dynamic URLs as much as possible to reduce the incidence of this potential issue.

Now truth be told, MSNBot, the crawler used by Bing, can read and follow URLs using more than 30 variables. The bot’s abilities are not the limiting factor. The problem is that the arbitrary order and number of variables used in a URL can create a nearly infinite number of permutations for the same ultimate content, and that duplication is what hurts. The fewer variables you use, the fewer duplicates there will be for your pages, which is good for getting the right page from the index onto the search engine results page (SERP).
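One way to keep the permutations in check is to generate every internal link through a single function that drops parameters that don’t change the content and emits the rest in a fixed order. The following is a minimal Python sketch; the parameter names in CONTENT_PARAMS and the canonical_query_url helper name are assumptions for illustration.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumed set of parameters that actually change the page content.
CONTENT_PARAMS = {"item", "color"}


def canonical_query_url(url):
    """Keep only content-bearing parameters and emit them in sorted order."""
    parts = urlsplit(url)
    pairs = parse_qsl(parts.query, keep_blank_values=True)
    kept = sorted((key, value) for key, value in pairs if key in CONTENT_PARAMS)
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


print(canonical_query_url("http://www.mysite.com/products.aspx?ref=aff42&item=doodad"))
```

With this in place, two links that differ only in parameter order, or in a tracking parameter such as the hypothetical ref above, come out as the same URL.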

Avoid using session IDs or cookies

Bots will fail to crawl your site if your links carry session IDs or your pages require cookies. Bots usually can’t accept such tracking mechanisms. As a result, they will never get access to the parts of your website that require these elements, which means those pages won’t get indexed by the search engines.

Have at least one internal link to every page

No man is an island. No HTML page should be, either! Every page should be linked to at least once from another page within your site. Otherwise the bot will never find it or index it.
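If you can export a map of which pages link to which, checking for islands takes only a few lines. This is a minimal Python sketch; the shape of the links dictionary and the orphan_pages helper name are assumptions, and the page paths are made up for the example.

```python
def orphan_pages(links, home):
    """Return pages that no other internal page links to.

    links maps each page URL to the internal URLs it links to.
    """
    linked = set()
    for source, targets in links.items():
        # A page linking to itself does not count as an inbound link.
        linked.update(target for target in targets if target != source)
    return sorted(page for page in links if page not in linked and page != home)


site = {
    "/": ["/article.htm"],
    "/article.htm": ["/"],
    "/forgotten.htm": ["/"],
}
print(orphan_pages(site, home="/"))
```

The home page is excluded because it is normally reached through inbound links from other sites rather than from your own pages.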

Avoid pages with nothing but a long list of non-contextual links

While the point of the Web is to have sites link to others in a web of interconnected pages, too much of a good thing is not always better. In fact, it can be bad! Pages that offer nothing but an endless list of context-less links are of very little value to users, and thus are not given much value by search engines. Again, the old adage of pursuing quality over quantity usually holds true.

Now if you actually have a bushel or three of very high quality and relevant outbound links, which are presented in logical context for the user, bully for you! And of course, an HTML sitemap page can contain numerous internal links, which is not a bad thing, either (but at least try to organize the sitemap list of links so they are easier for the user to consume). Just note that very long lists of links with no context are not helpful for anyone, including you.

Prevent the bot from following a link

Use the rel="nofollow" attribute in your anchor tags to identify links you don’t want followed, such as links to untrusted sources like forum comments, or to a site that you are not certain you want to endorse (remember, your outbound links are otherwise considered to be de facto endorsements of the linked content). An example of the code looks like this:

<a rel="nofollow" href="http://www.untrustedsite.com/forum/stuff.aspx?var=1">Read these forum comments</a>

To block the bot from indexing content on your site (such as authentication or shopping cart pages, to improve crawling efficiency), don’t rely on this attribute in the anchor tag. Instead, either use your robots.txt file or add a robots <meta> tag to the page to keep the bot away from that content.
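Before you publish a robots.txt file, it can help to confirm that its rules block what you intend and nothing more. The following is a minimal sketch that uses Python’s standard urllib.robotparser module. The Disallow paths, the is_crawlable helper name, and the choice of "msnbot" as the user agent string are assumptions for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content that blocks cart and login pages for all bots.
ROBOTS_TXT = """\
User-agent: *
Disallow: /cart/
Disallow: /login/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def is_crawlable(url, agent="msnbot"):
    """Report whether the parsed rules allow the given agent to fetch the URL."""
    return parser.can_fetch(agent, url)


print(is_crawlable("http://www.mysite.com/article.htm"))
print(is_crawlable("http://www.mysite.com/cart/checkout.aspx"))
```

The first URL is allowed and the second is blocked, because only the second falls under one of the Disallow paths.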

We’re still ramping up on how to make your site architecture more robust for SEO. Next up in our site architecture series: content issues. If you have any questions, comments, or suggestions, feel free to post them in our SEM forum. Catch you later…

– Rick DeJarnette, Bing Webmaster Center

Join the conversation

86 comments
  1. Quality Directory

    I have fixed all the canonicalization issues my websites once had. I rectified things after reading a similar article at MSDN Webmaster Blog and Matt Cutts' blog sometime ago.

  2. Anonymous

    good information,have to read all parts.

  3. Anonymous

    Great title="" advice for links.  I have often wondered if you found it of value and I hate reading content that has been crammed full of text in links that looks unnatural.  Also, thanks for the clarification on the use of rel="nofollow", it is nice to see some transparency and usable tips.

  4. Anonymous

    Good guide to SEO, from a major search engine (so they should know how to do it :). Canonical URL's are important.

  5. Anonymous

    Nice Post

  6. jpluo

    not bad

  7. Blackpool UK

    brilliant seo information from bing super!

  8. Anonymous

    Thanks for your information, I Like This

  9. Anonymous

    Diluting link juice is a really bad idea, especially when competition is high.

  10. Anonymous

    Very Useful information, Thank you!

  11. Blackpool UK

    This information is very useful.

  12. sania

    Most valuable information.  Thanks a lot!

  13. FaTu

    Thanks a lot Rick.

  14. pelindemir

    Thank you very useful information.

  15. silvercell18

    Thanks for the blog posting.  I think that it is very informative.  I think that websites with strong relevant links help out present sites with that add value to the searches.

  16. daniel.bomgardner

    Awesome!

  17. Ilyas Kazi

    A perfect information on optimizing our web links.

  18. jackin

    Reading to this post many of my doubts became clear related to dynamic url thanks a lot for a great info.

  19. y.adarsh

    very nice article.

  20. Anonymous

    We do the same at Adeo, A good URL is very important.

    Allen

  21. alextampa

    Great info there

  22. bradross

    Excellent information to get my sites links in order

  23. Anonymous

    The title="" advice was great as well as the whole article thanks.

  24. leo_charles

    Great information, however you made an 'amost' duplicate mistake under your section:

    "Choose between relative vs. absolute links"

    Both examples you listed are absolute URLS. The bottom one needs to be changed to relatives.

    -Cleo

    http://www.lunarstudio.com

    Architectural Renderings and Illustrations.

  25. rickdej

    Hey folks, I got some late feedback from some internal folks on this topic and I made some minor modifications to this article. You might want to go back and reread it just to be sure you have the latest and greatest info! Thanks!

    Rick DeJarnette

    Bing Webmaster Center Team

  26. Lucca Sofiato

    Very useful!

  27. VinoCanada

    I'm still confused about the nofollow tag.  Google says they still crawl the link, but don't share link juice to the nofollow link.  Bing, in the article above, says that they simply do not follow the link at all.  

    Which is accurate?  Or perhaps Bing and Google treat nofollow differently?

    Phil

    http://www.vinocanada.com http://www.icewinetales.com

  28. frankwilliams364

    That was very good  and i am new to this and web site it is plenty to learn and new web site owener

    Thanks

    Frank

  29. Anonymous

    The problem with absolute URL's for local content is that it can cause the web page visitor to endure an additional resolution of the site URL to its corresponding IP address (DNS Name Resolution).  It may help your search standings, but it will decrease the usability of the site on machines that aren't performing DNS caching.

  30. Bala Sounder

    good information

  31. Anonymous

    Im glad I found this.  I am definitely coming back.  I am in the process of building links for 4 websites and need good advice and information.

  32. nexcheck

    canonicalization is a vital point which so many SEOs and marketers usually ignore.

  33. Anonymous

    Some good information. Thanks!

    What I'm wondering though is why search engines are making some things so complicated. Isn't it in your own interested to find the most relevant content pages for any given keyword? For instance, why don't you figure out yourself that non-canonicalized index pages still do have the same content? Why not just grouping domain.tld, domain.tld/, domain.tld/index.php etc. together looking at it as the same page? By not doing this you're probably missing to give approriate link juice to great content pages (which your search customers are expecting in your SERPs!) just because that webmaster doesn't care about canonicalization …

    Similar with absolute URLs: absolute URLs for internal links are bloating the HTML code, making the pages load slower, i.e. your bot takes longer to index it (and of course, as mentioned by someone else here, it causes additional DNS lookups for the user browsers in many cases.) It should be easy for a search engine to resolve relative links to give them the same link juice as absolute URLs, doesn't it?

    Just a few thoughts that might make your (and our) work easier and to avoid too much spam SEO vs. really good content that in many cases is NOT being produced by SEO experts but by experts on the specific topic (and hence their sites are not perfectly SEOed). Solving such issues on the search engine's side instead of asking the webmasters to do it would help the search engine getting besser results and a more homogeneous index, content quality-wise.

  34. cyber__fr

    Very interesting but where better in french ;)

  35. www.thestocksprofit.com

    Thanks!!

    Shadi

  36. Anonymous

    This is the best blog on seo I have found – google doesn't even provide any content written as well as these articles are.  Thank you Bing for providing such a great resource.

  37. fitnessdad

    Please keep up with these how to's……

  38. PCS

    And my post just showed that I dont learn to quickly! lol

  39. Anonymous

    I like the information.  It was very helpful.  I learned a great deal from reading this blog.  

  40. Anonymous

    Thanks Admin

  41. Anonymous

    Excellent SEO Information! I appreciate your great post. Thanks!

  42. miles2go

    Vital piece of inforamtion for all SEO practioners.

  43. Anonymous

    Hello guys I am new to seo and i am figuring it out why none of my pages are indexed by bing…

    Here is my site

    http://www.jordandunkshoes.com

  44. Anonymous

    That's great, I never thought about Making links work for you like that before.

  45. Anonymous

    I learned so much from this article and I will be using it all on my site, thanks!

  46. Anonymous

    Thanks for your information! understand more and more

  47. Anonymous

    great information , one question ..

    Does rel="canonical" work for other Search Engine ?

  48. Anonymous

    Hey, you have a great blog here! I'm definitely going to bookmark you! Thank you for your info.

  49. Anonymous

    Thanks for informations

    Good job!

  50. Anonymous

    Thank for the info

  51. Anonymous

    Thanks a lot

  52. Anonymous

    Thank for information

  53. Anonymous

    do you can tranlate to Thai?

  54. Anonymous

    Thanks you.

  55. bailey_cross

    The .htaccess file can be used to redirect /index.php to /.  

    I think its also worth mentioning that there are some bots that you may want to block from crawling your site.  This can also be done via .htaccess file.

  56. untacticcal000

    Very Informative but I actually want the Bots to follow me because I love spiders & crawlers.

  57. Anonymous

    Thanks for the info

  58. Anonymous

    Good Job. Thank You.

  59. spackindia

    Thanks for such wonderful and clarifying most of the issues.It becomes like bible coz its coming from a major search engine officially.Otherwise we mundane guys can only guess that what works or not.

    Again thanks a lot from bottom of sem professionals heart.

  60. Webhostuk LTD

    Hello,

    thank you for this valuable information this will help all the newbie to understand search engine concept.

  61. simplify3

    Three things i hadn't really thought much about, and I appreciate this artice for these reasons:

    1) The importance of "Title".  A LOT of my links don't have Title.

    2) Using <strong> — I had forgotten about that value.

    3) using absolute rather than relative urls.  Man, I have a lot of reengineering to do do make that happen, but it'd be totally worth it.

    Ken, Naples, FL (http://free.naplesplus.us)

  62. Anonymous

    Thank you Rick.

  63. Anonymous

    Thank You.

  64. Anonymous

    I got the lot of ideas and understand why http://www.makaan.com pages are not indexed so much.

    Finally, Very useful for us and who wants to better SEO result in Bing search engine.

  65. Anonymous

    Thanks, great info

    Evan

    SEO/SEM

    <a href="http://www.twiddy.com title="outer banks rental">Twiddy.com</a>

  66. Anonymous

    Thanks for this valuable information.

  67. Altamiraweb

    Live is getting much better!

    Great post!

    I´ve already subscribed the feeds!

  68. edmondc

    Hi Rick,

    As always, great post! Thanks for the info (sometimes, even if you read it all before, it only starts make it sense when you here it from someone you trust!). Keep up the good posting.

    One question: what about human-friendly URLs? How important is this for Bing? Is it worth investing the money to take something that has got */tabid/38/…/Contact-Us into just */Contact-Us? As you know some website packages do this and external SW is required to bring those websites to displaying links that are true human readable.

    Thanks,

    Edmond

  69. novintabligh

    great article and information on link structure and duplicate content.

  70. ninetong

    thank a lot ^^

  71. mscott556

    This is information I didn't have, that I'll be putting into practice immediately! Thanks, Bing!

    Web manager

  72. Chuduvietnam

    Thanks your information, the information was helpful for me

    <a rel="nofollow" href="http://www.satsco.com.vn”>www.satsco.com.vn<a>

  73. vikcer

    thank for deteal get link

  75. Faris Riaz

    can bing provide better serch like google

Comments are closed.