by Mike Essex on 20th July 2011
Hi, I’m Mike Essex, the Online Marketing Manager here at Koozai, and today I’m going to answer a question from the community, from Stephen Lock of Analytics SEO. Stephen asks, “What are the most annoying canonical issues you’ve ever seen?” Now, if you work in SEO for even a day, you’re going to see some pretty annoying canonical issues, and these are just three of the worst offenders in my opinion.
Now for those who don’t know what a canonical issue is, it’s when you’ve got content that exists in multiple locations, either on your website or on multiple websites. The problem this can cause is that the search engines don’t know which version of the content to index, and they may pick the wrong one or they may pick somebody who’s taken your content and put it elsewhere. So mopping up these issues is very important.
How do you do that? Well, by adding nofollow tags to block off content that you don’t want the search engine spiders to get to, or by using redirects to say this is the correct content. You can also use canonical tags to say, “Actually, Google, ignore this version and focus on this version.” So those three fixes solve most of the problems.
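As a rough illustration of the canonical tag idea, here is a minimal Python sketch. It assumes that the preferred version of a page is the URL with a lowercase host and with the query string, fragment and trailing slash stripped. That rule is an assumption made for this example, not something every site can adopt, and the function names and example URLs are hypothetical.

```python
from html import escape
from urllib.parse import urlsplit, urlunsplit


def canonical_url(url: str) -> str:
    """Reduce a URL to one preferred form.

    Assumption for this sketch: the query string and fragment never change
    the content, so they can be dropped. Check that before using it on a
    real site.
    """
    parts = urlsplit(url)
    # Remove a trailing slash, but keep "/" for the site root.
    path = parts.path.rstrip("/") or "/"
    return urlunsplit((parts.scheme, parts.netloc.lower(), path, "", ""))


def canonical_link_tag(url: str) -> str:
    """Build the <link rel="canonical"> element for a page's <head>."""
    href = escape(canonical_url(url), quote=True)
    return f'<link rel="canonical" href="{href}">'


if __name__ == "__main__":
    # Two addresses for the same content produce the same tag.
    print(canonical_link_tag("http://Example.com/gallery/?sort=rating#top"))
    print(canonical_link_tag("http://example.com/gallery"))
```

Every duplicate version of the page would carry the same tag, so the search engine is pointed at one address no matter which version it lands on.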
However, one of the main issues with this is that each website doesn’t have an unlimited amount of time from Google or Yahoo or Bing. The problem is that the search engine spider doesn’t have all the time in the world to look at your website. So what that means is it will only view as many pages as its time allows. So if it’s stuck in a loop because of canonical issues, it’s missing out on good content that you want it to crawl. So you need to resolve those issues.
Now, if you’ve got one of these three, you need to get on it straightaway, because these are typically horrible. If you run any spidering software on a site with these, it’ll usually break or never finish the crawl.
So galleries are first up, and the main problem with this is a lot of galleries have a page with a hundred images on it. Then they’ll also break those images down into category and sub-category pages, and maybe people can rate the images at five stars. They’ll group them by that. Then they might group them depending on the product that they relate to. The problem is that you’ve then got one image that appears on 20 different pages. It’s the exact same image, it hasn’t changed, and to the search engines, it can look like you’re trying to game the system by having all these pages of the same image over and over and over.
So what you need to do is block off any of these subcategory pages from the search engines and just say to the search engine, “All the content you need is actually on this one main gallery page. Just index that and you’ll be fine, and don’t worry about these other ones.” If the other ones are a little bit redundant then just get rid of them altogether and redirect them back to the main gallery. So that’s that one.
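One way to sketch the "block off the sub-category pages" step is with a robots meta tag. Note that this example swaps in the standard `noindex, follow` robots meta value rather than the nofollow approach mentioned earlier, and it assumes a hypothetical URL layout where the main gallery lives at `/gallery/` and every grouping page sits underneath it.

```python
# Hypothetical path of the one gallery page that should be indexed.
MAIN_GALLERY = "/gallery/"


def robots_meta(path: str) -> str:
    """Return the robots meta tag to print in the <head> for a given path.

    Assumption for this sketch: every page below MAIN_GALLERY is a
    re-grouping of the same images (by category, rating, product).
    """
    if path == MAIN_GALLERY:
        # The main gallery is the version we want in the index.
        return '<meta name="robots" content="index, follow">'
    if path.startswith(MAIN_GALLERY):
        # Grouping pages: keep them out of the index, still follow links.
        return '<meta name="robots" content="noindex, follow">'
    # Pages outside the gallery are left alone.
    return ""


if __name__ == "__main__":
    for page in ("/gallery/", "/gallery/rated-5-stars/", "/about/"):
        print(page, "->", robots_meta(page) or "(no tag added)")
```

If a grouping page is genuinely redundant, a redirect back to the main gallery, as described above, is the simpler option.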
Then e-commerce. So a problem here is that you’ve already got thousands of products, so you’re already short on time as it is. Don’t make that worse by then adding that product into all sorts of different sections. So if you’ve got a book, don’t add it to an adventure category and an action category, a books under $2.99 section, top 100 books, top 20 books in action. All you’re doing is just saying, “Hey, here’s the same book. It’s the exact same book you’ve already crawled 20 times today, Google. Here it is again.”
Those pages are good for users, yes, absolutely, but there’s no reason for the search engines to see the same book over and over. Again, block those sections off. Use canonical tags to say actually, this is where you want to go for the book information, ignore these other pages.
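To find out how bad the problem is on a shop, you could group the URLs from a crawl by product. The sketch below is only an illustration: it assumes the last path segment of each URL is the product's slug, which is a hypothetical layout, and the shop URLs are made up.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def product_slug(url: str) -> str:
    """Take the last path segment as the product identifier (an assumption)."""
    path = urlsplit(url).path.rstrip("/")
    return path.rsplit("/", 1)[-1]


def group_duplicates(urls):
    """Map each product slug to its URLs, keeping only products
    that are reachable at more than one address."""
    groups = defaultdict(list)
    for url in urls:
        groups[product_slug(url)].append(url)
    return {slug: pages for slug, pages in groups.items() if len(pages) > 1}


if __name__ == "__main__":
    crawled = [
        "http://shop.example/adventure/treasure-island",
        "http://shop.example/action/treasure-island/",
        "http://shop.example/top-100/treasure-island?ref=home",
        "http://shop.example/action/other-book",
    ]
    for slug, pages in group_duplicates(crawled).items():
        print(f"{slug}: {len(pages)} URLs for the same product")
```

Each group it reports is a set of pages that should all point, via a canonical tag, at one product page.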
Now if you’re Amazon – and Amazon’s terrible for this, they’ve got about 100 different pages for each product – you’re probably going to get away with it because you’ve got a lot of time for the search engine spider to look at your website because you’re Amazon. If you’re a small e-commerce website with hundreds of products, you need all the time you can get. So don’t make it worse with issues like this.
Last of all we’ve got blogs. Now, if you’ve got a WordPress blog or a Blogger blog, you’ve already got 20 canonical issues as soon as you take it out of the box, and that causes problems. For example, if you write a blog post on WordPress, not only does it have a page for the blog post itself, you’ve also got a page for the comments, the attachments, the RSS feed for that individual blog post, any category pages that it’s in, any tags that you’ve used. The author bio page will have a list of all those posts. It’s just redundant content over and over and over that isn’t needed.
Again, these sections, most of them are of no use for readers either. A list of items by the author, great. But a lot of the other ones, a list of comments on a separate RSS feed, there’s just no need. The comments are on the blog post. That’s good enough. So redirect all these pointless extra elements back to the main blog post, which is what you want to get indexed at the end of the day.
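The redirect rule for those extra blog URLs might look something like the sketch below. The URL patterns (`/feed/`, `/comment-page-N/`, `/attachment/...` under a post's slug) are assumptions chosen for illustration and will not match every blog's permalink settings, so check them against your own URLs first.

```python
import re
from typing import Optional

# Assumed patterns for the "extra" URLs hanging off a post at /<post-slug>/.
EXTRA_URL = re.compile(
    r"^(/[^/]+)/(?:feed|comment-page-\d+|attachment/[^/]+)/?$"
)


def redirect_target(path: str) -> Optional[str]:
    """Return the post URL an extra page should 301-redirect to,
    or None if the path is not one of the assumed extras."""
    match = EXTRA_URL.match(path)
    return f"{match.group(1)}/" if match else None


if __name__ == "__main__":
    for path in ("/my-post/feed/", "/my-post/comment-page-2/",
                 "/my-post/attachment/photo/", "/my-post/"):
        print(path, "->", redirect_target(path))
```

The blog post itself returns `None`, so it is never redirected; only the redundant extras are sent back to it.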
If you don’t do that, I’ve had issues before, where instead of the blog post being indexed, just one single image from that blog post has got indexed, which is just pointless. So you really need to consider these issues. The bigger the website, the harder it is to fix these issues, but the more of them you’re going to have and the more important it is that you do it.
We write about these kinds of issues a lot on the Koozai blog as well. So visit Koozai.com for more information. You can also use our Twitter, Facebook, or YouTube profiles to learn more about what we do or to ask a question for a future video, if there’s anything you’d like to know.