by Mike Essex on 26th April 2011
In a lot of ways Wikipedia could be classed as a content farm. It supplies masses of content, which is cloned across the web. There are unfinished articles that add no value, and some articles are simply collections of information from elsewhere. However, Wikipedia survived the Panda update, which raises the question: why? What do Wikipedia do differently that other sites can learn from, and how did they avoid getting dragged into the same mire as (other) content farms?
This article explores why, and the lessons big sites with lots of content need to learn. So let’s compare Wikipedia against the Panda penalties:
James did a good roundup on which sites were hit by Panda, and one of the core reasons came down to excessive advert placement. Wikipedia avoid this issue by being advert free, except between November and January each year, when they ask for donations with a plea from their founder.
Even with this ‘advert’, they avoid all of the other tricks used by AdSense sponsored domains, such as adverts in the middle of text, or AdSense adverts that look like navigation. As a user it’s very easy to read Wikipedia, knowing that the entire screen space is free of adverts. The donations model isn’t something every site can emulate, but there are plenty of sites that get by just on having a PayPal button on their sidebar.
No affiliate / paid links
The majority of ‘advert free’ websites choose to make revenue by stuffing their sites full of affiliate links, or text ads, which are pretty much the same thing. That’s why so many review / coupon websites were hit by Panda. They simply couldn’t offer value above and beyond a list of affiliate links. Wikipedia avoids this trap by nofollowing all of their external links, and stripping out affiliate links straight away.
For those of us who want to make some money from our sites, nofollowing links seems to be the way to go. In addition, you need content that adds value around the affiliate links, as Money Saving Expert does: they offer their own opinion on products, and then include affiliate links as their main revenue driver. Even then, they highlight affiliate links with an asterisk to make everything clear for the reader.
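As a rough illustration of the nofollow-and-asterisk approach, here is a minimal Python sketch of a link helper. The `render_link` function, the `SITE_HOST` value and the example URLs are all made up for this example; it is one way a template helper might work, not how Wikipedia or Money Saving Expert actually build their links.

```python
from html import escape
from urllib.parse import urlparse

# Hypothetical hostname of our own site; links to it stay followed.
SITE_HOST = "example.com"


def render_link(url, text, affiliate=False):
    """Return an HTML link, nofollowed if it is external or an affiliate link.

    Affiliate links also get a trailing asterisk so readers can spot them.
    """
    host = urlparse(url).netloc
    external = host not in ("", SITE_HOST)
    rel = ' rel="nofollow"' if (external or affiliate) else ""
    marker = "*" if affiliate else ""
    return f'<a href="{escape(url, quote=True)}"{rel}>{escape(text)}</a>{marker}'


if __name__ == "__main__":
    print(render_link("/about", "About us"))
    print(render_link("https://shop.example.org/deal?ref=123", "Deal", affiliate=True))
```

Internal links are left alone, so the site’s own structure still passes value between pages, while anything external or paid is clearly flagged to both search engines and readers.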
Whilst article sites like articlesbase.com, answers.com and ezinearticles.com are home to good unique content, they also suffer from spam articles that have no value and exist just for a link. This drags down the good content, and without quality control everything suffers. Wikipedia takes this to the opposite end of the spectrum, with moderators reviewing every change and most authors having to earn their stripes as an authoritative source before they get edits accepted easily. Copied and pasted web content won’t get accepted unless you can include a citation, and even then most content is rewritten to make it unique to Wikipedia.
The three article sites mentioned above were all penalised by the Panda update, and have implemented strict review procedures as a result. If you accept user generated content without reviewing it, you need to do the same.
Very clear guidelines
It’s impossible to create a Wikipedia account without reading their guidelines, and the moderators who approve content will refer to them whenever they make a change. This ensures people play by the rules when posting content and shows the search engines that Wikipedia is a site that refuses to accept low quality content.
Although not one of the 100 losses pointed out by Search Engine Watch, the writers of Squidoo reported noticeable ranking drops following the Panda update. This led to Squidoo creating a Scroll of Originality and an FAQ on original content. They now have clear guides for their writers and are more in line with the Wikipedia content model.
The blogs that were hit by Panda tended to have navigational structures built around newest posts / hottest posts / most commented posts / top ranked posts / etc, which made it hard for the search engines to find a true list of all content. You may be surprised then to hear that Wikipedia, with its 3.5 million articles, actually has a contents page, and that it is far easier to find content through this method than on the majority of blogs.
Through the contents page search engines can find and index every page that exists on the site. They can also find content through associated links at the bottom of content. There are no duplicate content pages, and as Wikipedia has built up authority over time the search engines are prepared to dig through lots of content pages to find everything.
So, to copy Wikipedia you need to ensure every piece of content is linked through subcategories. If it can only be found via a search box then there’s no way for search engines to find everything. The contents page on Wikipedia gives it structure, and moves it away from a content farm, which can often be lots of little articles on the same topic.
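One way to check this on your own site is to treat it as a map of pages and links, then walk it from the contents page and see what never gets reached. The Python sketch below assumes you already have that map as a simple dictionary (for instance from a crawl or a CMS export); the `unreachable_pages` function and the page names are invented for illustration.

```python
from collections import deque


def unreachable_pages(links, start):
    """Return pages that cannot be reached by following links from `start`.

    `links` maps each page to the list of pages it links to.
    """
    seen = {start}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    # Every page we know about, whether it links out or is only linked to.
    all_pages = set(links) | {t for targets in links.values() for t in targets}
    return sorted(all_pages - seen)


if __name__ == "__main__":
    # Hypothetical site: article-2 is only findable via the search box.
    site = {
        "contents": ["category-a", "category-b"],
        "category-a": ["article-1"],
        "category-b": [],
        "article-2": ["article-1"],
    }
    print(unreachable_pages(site, "contents"))
```

Anything the function returns is content that a crawler starting from the contents page would never find by following links, so it is a candidate for adding to a subcategory.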
Constantly refreshed content
Most content farm articles are written once and never refreshed. Based on the Demand Media Wired article, they write content but there’s no loop back to change it later. They write new content rather than updating old. With Wikipedia anyone can update content at any time. If you try to make an article on an existing theme (even from a different angle) it tends to be moved in with the main article.
This creates stronger overall content, and gives people (and search engines) more reason to come back. This leads to a more consistent set of rankings than content which has gone stale. So before you write a new article, why not consider what old content needs refreshing?
We can’t all be Wikipedia. However, there are things we can do to operate in a similar way that will help fight off the Panda assault. Wikipedia could very easily have turned into a content farm, but they avoided the label through effective content monitoring and policies.
Have I missed anything, or do you feel that Wikipedia is really a content farm in disguise? Share your thoughts below: