If you are working hard to get your web pages to rank well in the search engines, then I would encourage you to have an active policy for managing duplicate content.
When a user searches Google, Yahoo or Bing, the search engine tries to return a good spread of results for the term they typed in. One of the steps it takes to achieve this is to remove URLs that actually point to the same content. The search engines do this using the following logic:
- They choose one URL from a group of similar URLs that point to the same content (examples below).
- They select the URL that THEY think is the ‘best’ one to show in the search results (unless you tell them otherwise).
- They try to apply factors like link popularity to make this decision.
If you leave it to the search engines to decide, they may choose the wrong URL.
In many cases they may not even detect that two URLs point to the same page, which can dilute that page’s ranking strength by splitting it across multiple URLs. The alternative is to tell them which URL you’d like them to show.
Here are some examples of the sorts of URLs we are talking about. Both return the same product page; only the order of the parameters differs.
- www.example.com/product.php?name=shoes&color=black&brand=clarks
- www.example.com/product.php?brand=clarks&color=black&name=shoes
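To see why the two URLs above are really the same page, here is a minimal Python sketch that sorts the query parameters by name so that both collapse into one form. The helper name `normalize_url` and the `http://` prefix are my own assumptions for illustration, and this is not how any search engine actually detects duplicates.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode


def normalize_url(url: str) -> str:
    """Hypothetical helper: return the URL with its query parameters sorted by name."""
    parts = urlsplit(url)
    # Split the query string into (name, value) pairs and sort them.
    params = sorted(parse_qsl(parts.query, keep_blank_values=True))
    # Rebuild the URL with the sorted query string.
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path, urlencode(params), parts.fragment)
    )


# The two example URLs, with an assumed http:// scheme added.
a = normalize_url("http://www.example.com/product.php?name=shoes&color=black&brand=clarks")
b = normalize_url("http://www.example.com/product.php?brand=clarks&color=black&name=shoes")

print(a)
print(a == b)  # True: both URLs reduce to the same normalized form
```

Generating your internal links through something like this is one way to avoid creating the duplicates in the first place.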
There are two ways to let the search engines know which URL you’d like them to show in their results pages:
- By including your preferred URL in your Sitemap.
- By using the canonical link hint, which tells the search engines which page YOU want them to consider the ‘best’. Another post covers how to use the canonical link.
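As a rough illustration of what those two hints look like, here is a short Python sketch that builds a canonical link element and a Sitemap entry for a preferred URL. The function names and the choice of preferred URL are assumptions made for this example; the `<link rel="canonical">` element goes in the page’s `<head>`, and the `<url><loc>` entry goes inside the `<urlset>` of your Sitemap file.

```python
from html import escape as html_escape
from xml.sax.saxutils import escape as xml_escape


def canonical_link_tag(preferred_url: str) -> str:
    """Hypothetical helper: build the canonical link element for a page's <head>."""
    # Escape &, <, > and quotes so the URL is safe inside an HTML attribute.
    return f'<link rel="canonical" href="{html_escape(preferred_url, quote=True)}">'


def sitemap_url_entry(preferred_url: str) -> str:
    """Hypothetical helper: build one <url> entry for a Sitemap file."""
    # The Sitemap format is XML, so & must be written as &amp;.
    return f"<url><loc>{xml_escape(preferred_url)}</loc></url>"


# Assumed preferred version of the example product URL.
preferred = "http://www.example.com/product.php?brand=clarks&color=black&name=shoes"

print(canonical_link_tag(preferred))
print(sitemap_url_entry(preferred))
```

Whichever version of the URL you pick, use the same one in both places so the two hints agree.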
By implementing a policy for managing duplicate content you will increase your chances of achieving the results you are hoping for from your search engine optimisation work. Without such a policy you could be inadvertently undermining that work.