Tuesday, May 19, 2009

Preventing Comment Spam

I have taken up the coding challenge of dealing with comment spam.  As with most topics that I write about, I am by no means an expert.  But I have done a lot of reading recently, and here are some of my observations.

There is no consensus on the single best approach to preventing comment spam.  Everyone agrees that something must be done, but few people agree on which method is the most effective.

What most people do agree on is that multiple techniques are necessary to achieve the desired levels of spam reduction, ease of maintenance, and usability for visitors.  The business of spam rests on the idea that if enough content is put in front of enough users, enough of them will respond to turn a profit.  Countering spam is therefore a matter of making it difficult enough for spammers to post to your site that their time is better spent elsewhere.  The challenge lies in creating a system that is easy for your users to participate in yet complex or smart enough to discourage spammers.

The most effective combination of tools varies from site to site.  The breadth of your content and comment topics, the quantity and quality of your users, and a host of other variables will determine which tools achieve the best results against spam.  The larger and more wide-ranging each of those dimensions is, the smarter your techniques will need to be.  At some point, the easiest approach to implement may be to screen submissions by hand.
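
To make the idea of layering several techniques a little more concrete, here is a minimal Python sketch.  The specific checks (a hidden honeypot field, a minimum time to submit, a cap on links), the thresholds, and every name in it are my own illustrative assumptions, not a recommended or complete set.  The point is only the shape: several cheap checks each add to a score, and borderline submissions fall back to screening by hand.

```python
from dataclasses import dataclass


@dataclass
class Comment:
    """A submitted comment plus a few signals collected by the form (all hypothetical)."""
    body: str
    honeypot: str = ""               # hidden form field that real visitors leave empty
    seconds_to_submit: float = 60.0  # time between page load and submission


def honeypot_filled(comment: Comment) -> int:
    # Bots that fill in every field tend to fill in the hidden one too.
    return 2 if comment.honeypot.strip() else 0


def submitted_too_fast(comment: Comment) -> int:
    # Assumed threshold: a person rarely reads and replies in under 3 seconds.
    return 1 if comment.seconds_to_submit < 3 else 0


def too_many_links(comment: Comment) -> int:
    # Assumed threshold: more than two links looks suspicious.
    return 1 if comment.body.lower().count("http") > 2 else 0


# Each check is independent, so checks can be added or removed per site.
CHECKS = [honeypot_filled, submitted_too_fast, too_many_links]


def classify(comment: Comment) -> str:
    """Return 'publish', 'review' (screen by hand), or 'reject'."""
    score = sum(check(comment) for check in CHECKS)
    if score == 0:
        return "publish"
    if score == 1:
        return "review"
    return "reject"


if __name__ == "__main__":
    print(classify(Comment(body="Nice post, thanks!")))
    print(classify(Comment(body="Hello", seconds_to_submit=1.0)))
    print(classify(Comment(body="Hello", honeypot="filled by a bot")))
```

Keeping each check as its own small function is a deliberate choice here: a site with a narrow audience might get by with one or two of them, while a larger site could add smarter checks to the list without touching the rest.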

There seems to be a general feeling that if enough sites take steps to reduce spam, the web can be made a better place for everyone.  Spam will probably never go away entirely.  If it did, it would likely be because the infrastructure of the web had been changed in some way that made it worse for everyone.  But the idea is to make spamming difficult enough that spammers would make more money performing constructive services instead of annoying ones.

For a bit more on techniques used to prevent comment spam, check out this follow-up post.
