Weblogs: Spam

SEO and blog comment spammers - an idea for a solution

Saturday, November 15, 2003

Comment spam has been one of the most talked-about subjects on blogs over the last couple of months. It's annoying, and it abuses the open and free spirit of blogging. Comments are the one place where readers can be part of the community, where ideas and viewpoints are exchanged; they are not there for self-promotion at someone else's expense.

The Search Engine Optimisation obsession

People engaged in search engine optimisation (SEO) are insanely jealous of the respect Google gives to blogs. Yet in the same breath they dismiss bloggers as a group of spotty teenagers just linking to each other with a "like what he said!". SEOers pride themselves on their knowledge of how Google operates and how PageRanks are assigned, yet they are flummoxed by something as simple as an updatable and updated web site.

SEOers are deeply concerned about PageRank - it's an obsession. Discussion rages as to whether internal page links in a website increase PageRank. Their obsession goes as deep as figuring out what the optimal number of internal links is.

But they are not concerned enough about PageRank to accede to any suggestion that their content needs serious work. Advice geared around web standards and accessibility falls on deaf ears with this very group of Google listing addicts.

Inbound links

So SEOers and spammers have been hitting Movable Type powered blogs with tools ranging from human-powered comment form entry through to automated scripts, populating blog comment areas with something small enough to allow a link to their site to appear.

Their understanding is that, because well-read blogs rate very well in Google, blog spammers will gain PageRank by getting a link to their website into that blog's comments. They refer to it as "PR leaking".

SEOers have in the past spammed guestbooks just to have one more link back to their websites. Recently Google has cracked down on guestbooks, but it's a difficult job catching all the guestbooks being abused by SEO spammers, so a lot of guestbooks are still indexed by Google and are still leaking PageRank to websites participating in guestbook spamming.

The solutions so far

So we know the reason SEOers are spamming blogs: to improve the PageRank of their website in Google. At the moment, the solution amongst bloggers is twofold:

An obvious solution

I'm a little surprised the blog community has missed an equally obvious solution. Why do comment spammers plough their links into blogs? They are only interested in one visitor: Google. So what will happen when Google doesn't see the link?

The solution I'm proposing is as follows: keep your comments area open, but when Googlebot visits, remove all hyperlinks from comment entries. Deny blog spammers the very thing they are trying to achieve - a linkback.

Of course, this does assume that the blog software generates pages dynamically. For blogging software that generates static pages, it is feasible - on an Apache server - to use mod_rewrite to look for Google's signature and deliver a dynamically generated representation of that page sans the comment links.
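To make the idea concrete, here is a minimal sketch in Python of what the dynamic case might look like. It is an illustration only, not taken from any blogging package: the function names, the `CRAWLER_MARKER` value and the assumption that Googlebot can be recognised by a substring of its User-Agent header are all mine, and the regular expression is a rough approach that only handles straightforward anchor tags.

```python
import re

# Assumption: Google's crawler can be spotted by this substring in the
# User-Agent header. The constant name is hypothetical.
CRAWLER_MARKER = "googlebot"

# Matches a simple anchor element and captures its visible text.
ANCHOR_RE = re.compile(r"<a\b[^>]*>(.*?)</a>", re.IGNORECASE | re.DOTALL)


def is_googlebot(user_agent):
    """Return True if the User-Agent string looks like Google's crawler."""
    return CRAWLER_MARKER in (user_agent or "").lower()


def render_comment(comment_html, user_agent):
    """Return the comment unchanged for readers; for Googlebot, replace
    each hyperlink with its plain link text."""
    if is_googlebot(user_agent):
        return ANCHOR_RE.sub(r"\1", comment_html)
    return comment_html
```

Ordinary readers still get the links, so the comments area stays open; only the crawler gets the stripped version. Bear in mind the update at the end of this post before using anything like this.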

Derivative solutions

Another possibility with this idea is to only allow links in comments from people with a good history of blog comments: a trust-based moderation system. This ties in with the community discussions about moderation systems for comments and dealing with trolls.
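A rough Python sketch of that trust-based variant follows. Everything in it is an assumption made for illustration: the threshold of three approved comments, the `approved_count` parameter (a count the blog software would have to keep per commenter) and the function name are all hypothetical.

```python
import re

# Hypothetical threshold: how many previously approved comments a person
# needs before their links are shown.
MIN_APPROVED_COMMENTS = 3

# Matches a simple anchor element and captures its visible text.
ANCHOR_RE = re.compile(r"<a\b[^>]*>(.*?)</a>", re.IGNORECASE | re.DOTALL)


def render_trusted_comment(comment_html, approved_count):
    """Keep links for commenters with enough approved history; otherwise
    show the link text without the hyperlink."""
    if approved_count >= MIN_APPROVED_COMMENTS:
        return comment_html
    return ANCHOR_RE.sub(r"\1", comment_html)
```

Because every visitor sees the same output for a given comment, this variant does not depend on who is requesting the page.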

Warning to bloggers

SEO practitioners are targeting blogs with a high PR. They have already identified blogs centered around the Howard Dean presidential campaign as blogs "ripe for the picking". They've already made what amounts to a declaration of war on the blog community.

Update 20 November: Mark Pilgrim notes a very compelling reason not to give Google different content as I've described above:

The practice of delivering specific content to search engines that is different from what your users see is called cloaking. Cloaking for any reason will get you permanently banned from Google. [link]

That's good enough for me to consign that particular idea to the bin. The alternative involves delivering comments without links to everyone, and that has no advantage over Jay Allen's MT-Blacklist and starts running into some un-community-like problems.
