Duplicate content has been appearing in SEO headlines more and more often since the notorious Farmer / Panda update began rolling out.
This post covers some of the ways to deal with duplicate content on a site, including robots.txt, meta tags, redirects and canonical tags. Each solution is explained in simple terms.
You can use your robots.txt file to stop search engines crawling content which you do not want indexed. With this option you can block entire groups of URLs if they sit within the same folder, for example /price. This is ideal if your duplicates all follow the same pattern and you can identify them through the folder structure. If your duplicates do not follow a pattern, you can also add individual page URLs to be blocked in robots.txt.
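As a rough illustration, the sketch below holds a small robots.txt in a string and checks it with Python's standard `urllib.robotparser` module. The rules, the page names and the `is_blocked` helper are assumptions made up for this example; only the /price folder comes from the text above.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: blocks everything in the /price/ folder
# plus one stray duplicate page that does not fit the folder pattern.
ROBOTS_TXT = """\
User-agent: *
Disallow: /price/
Disallow: /old-duplicate-page.html
"""


def is_blocked(path, robots_txt=ROBOTS_TXT, user_agent="*"):
    """Return True if the robots.txt rules disallow crawling of `path`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, path)
```

Calling `is_blocked("/price/widgets.html")` reports the page as blocked, while a path outside the listed rules, such as `/widgets.html`, stays crawlable. The trailing slash in `Disallow: /price/` keeps the rule from also matching unrelated URLs that merely start with `/price`.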
You can also use a meta robots tag to ask search engines not to index content. This should be placed on every page which you do not want indexed. The example below keeps the links on the page followed by the search engines, so only that one page is left out of the index:
<meta name="robots" content="noindex,follow" />
You can place permanent 301 redirects on the URLs of the duplicate content to point users and search engines to the standard version. This can be very effective, provided the redirect is a permanent 301 rather than a temporary 302.
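How you set up a redirect depends on your server, so the following is only a minimal sketch using Python's built-in `http.server` module. The `REDIRECTS` map, the paths in it, and the `redirect_target` and `serve` helpers are all hypothetical names chosen for the example.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical map of duplicate paths to their standard versions.
REDIRECTS = {
    "/price/widgets.html": "/widgets.html",
    "/index.html": "/",
}


def redirect_target(path):
    """Return the standard URL for a duplicate path, or None if none applies."""
    return REDIRECTS.get(path)


class RedirectHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        target = redirect_target(self.path)
        if target is not None:
            # 301 means "Moved Permanently"; Location names the standard URL.
            self.send_response(301)
            self.send_header("Location", target)
            self.end_headers()
            return
        # Anything not in the map is served as a standard page.
        body = b"Standard version of the page\n"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=8000):
    """Run the demo server on localhost until interrupted."""
    HTTPServer(("127.0.0.1", port), RedirectHandler).serve_forever()
```

Running `serve()` and requesting `/price/widgets.html` should return a 301 response pointing at `/widgets.html`. On a live site the same mapping would normally live in the web server or CMS configuration rather than in application code.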
The canonical tag is another option for telling Google which version of a page you want indexed. It should be placed on all the duplicate pages, and it states which URL is the standard version that should be indexed:
<link rel="canonical" href="https://www.the-correct-url-here.com/" />
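To check that a duplicate page really carries the tag, a short script can read the page's HTML and report the canonical URL it declares. This is only a sketch built on Python's standard `html.parser`; the `CanonicalFinder` class, the `find_canonical` helper and the sample page are assumptions for illustration, and the URL is a placeholder.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Collects the href of every <link rel="canonical"> tag in a page."""

    def __init__(self):
        super().__init__()
        self.canonicals = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attributes = dict(attrs)
        # rel may hold several space-separated values, in any letter case.
        rel_values = (attributes.get("rel") or "").lower().split()
        if "canonical" in rel_values and attributes.get("href"):
            self.canonicals.append(attributes["href"])


def find_canonical(html):
    """Return the first canonical URL declared in `html`, or None."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonicals[0] if finder.canonicals else None


# Hypothetical duplicate page that points at the standard version.
SAMPLE_PAGE = """<html><head>
<title>Widgets (price view)</title>
<link rel="canonical" href="https://www.the-correct-url-here.com/" />
</head><body></body></html>"""
```

`find_canonical(SAMPLE_PAGE)` returns the placeholder URL, and a page without the tag returns `None`, which makes it easy to spot duplicates where the tag was never added.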
Whichever solution you use to combat duplicate content, remember to always use the standard version of the URL when creating internal links.