Thursday, 6 October 2005

Blamments

I wonder if blog comment spambots stop trying at a place that produces extremely low click-through rates, like my blog. I only got one spam comment today, as opposed to two on the previous days.

What would work for me is a moderated system where I approve a comment before it goes live, because I get so few comments. That wouldn't work for larger sites unless the moderation was distributed over many people.

There's a mechanism on the web designed to keep automated trawlers at bay, called the Robots Exclusion Protocol. Email address harvesters obey it because several sites run a "honeypot" script behind a little "robots keep out" sign on the door, presenting any harvester that disobeys the sign with as many fake email addresses as it can handle. A harvester that disobeys the sign ends up with a worthless harvest.
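To make that concrete, here's a rough sketch in Python of how such a honeypot might hang together: robots.txt politely says "keep out", and anything that barges in anyway gets fed made-up addresses. The paths and the address format are my own invention, not taken from any actual honeypot script.

import random
import string
from http.server import BaseHTTPRequestHandler, HTTPServer

ROBOTS_TXT = b"User-agent: *\nDisallow: /honeypot/\n"

def fake_address():
    # A plausible-looking but completely worthless address.
    user = "".join(random.choices(string.ascii_lowercase, k=8))
    host = "".join(random.choices(string.ascii_lowercase, k=6))
    return f"{user}@{host}.example.com"

class HoneypotHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == "/robots.txt":
            # The "robots keep out" sign on the door.
            content_type, body = "text/plain", ROBOTS_TXT
        elif self.path.startswith("/honeypot/"):
            # Harvesters that ignore the sign get a page of fake addresses.
            page = "<br>".join(fake_address() for _ in range(100))
            content_type = "text/html"
            body = f"<html><body>{page}</body></html>".encode()
        else:
            # The normal site just quietly links into the honeypot.
            content_type = "text/html"
            body = b'<html><body><a href="/honeypot/">contact list</a></body></html>'
        self.send_response(200)
        self.send_header("Content-Type", content_type)
        self.end_headers()
        self.wfile.write(body)

if __name__ == "__main__":
    HTTPServer(("", 8000), HoneypotHandler).serve_forever()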

Blogger already disallows robots from using the comment-making page, which means that all my blamments are coming from misbehaving robots. In other words, it is still worth the spammers' while to ignore the Robots Exclusion Protocol. That's what has to change. I suggest meaningless depths of empty, functionless comment pages that lead to further comment pages and more and more, deeper and deeper, all the while achieving absolutely nothing.
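As a sketch of what one of those bottomless pages might look like (the URL scheme is made up purely for illustration), each disallowed page would contain nothing but links to more equally empty pages:

import random

def tarpit_page(path, links=3):
    # An empty "comment page" whose only content is links to more of the same.
    anchors = "".join(
        f'<p><a href="{path}/{random.randint(0, 99999)}">more comments</a></p>'
        for _ in range(links)
    )
    return f"<html><body>{anchors}</body></html>"

# Each page leads only to more pages, deeper and deeper, achieving nothing.
print(tarpit_page("/comments/trap"))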

When the spammers find their bots blindly following links and trawling around completely pointlessly, costing bandwidth and achieving nothing, suddenly the bots are a liability, not an asset, because bandwidth costs money (and even if you're not paying for it directly, wasted bandwidth is lost income). Then it becomes worthwhile to obey all the "robots keep out" signs on the web.

Mokalus of Borg

PS - I've been thinking about this a lot over the past few days.
PPS - Perhaps too much.

5 comments:

Pstonie said...

The fake address pages struck me as a good idea when I first read about them, but I'm sure that, just as e-mail clients can find spam mails, spambots can tell the difference between a page full of (fake) e-mail addresses and a normal page.

I think there could be some other ways to check for them as well, such as cookies or user agents, though those can be faked. But what if you created a little script that tracks site access and blocks comment pages for sessions that are blatantly spiders? I think that would work. If something loaded 14 pages in 2 seconds, don't let it access forms.
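Roughly what I mean, as a sketch (the class and method names are made up, and the 14-pages-in-2-seconds threshold is just the example above):

import time
from collections import defaultdict, deque

class SpiderDetector:
    def __init__(self, max_pages=14, window_seconds=2.0):
        self.max_pages = max_pages
        self.window = window_seconds
        self.history = defaultdict(deque)  # client id -> recent request times

    def record_hit(self, client_id):
        # Note a page load and drop timestamps older than the window.
        now = time.monotonic()
        hits = self.history[client_id]
        hits.append(now)
        while hits and now - hits[0] > self.window:
            hits.popleft()

    def may_use_forms(self, client_id):
        # Blatant spiders (too many hits inside the window) lose form access.
        return len(self.history[client_id]) < self.max_pages

detector = SpiderDetector()
detector.record_hit("203.0.113.7")
print(detector.may_use_forms("203.0.113.7"))  # True for a normal visitor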

John said...

The thing is, when we start trying to check whether something is bad and excluding it, we fall into the arms race. They change their tactics, we change our screens, they change again. Let's start by assuming that the bulk of the comments I'll be receiving are junk, and try to tell which ones are good.

At the very least, I'd really like to have a way to prevent hyperlinks in comments.
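Something as blunt as this would probably do for a start (a sketch only; the pattern is my own guess at what a spam link looks like, not an exhaustive filter):

import re

LINK_PATTERN = re.compile(r"(<a\s|https?://|www\.)", re.IGNORECASE)

def comment_allowed(text):
    # A comment is allowed only if it contains nothing resembling a link.
    return not LINK_PATTERN.search(text)

print(comment_allowed("Nice post, thanks!"))                  # True
print(comment_allowed("Cheap pills at http://spam.example"))  # False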

Pstonie said...

I'd say the bad guys are already in an arms race and the good guys are getting left behind by not participating.

Since Blogger is one service serving many copies of an identical service, and since they now have the 'report this blog' button, which has worked well, they should consider a 'report comment spam' button. Say one blogger sees comment spam and reports it; Blogger confirms it and deletes all similar comments from the same IP on other blogs. That seems like a lot of database work, but it shouldn't be a problem for Google.
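In database terms it might boil down to something like this sketch (the schema is invented for illustration; Blogger's real storage is obviously nothing like a little SQLite table):

import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE comments (blog TEXT, ip TEXT, body TEXT)")
conn.executemany("INSERT INTO comments VALUES (?, ?, ?)", [
    ("blog-a", "198.51.100.9", "Buy cheap watches"),
    ("blog-b", "198.51.100.9", "Buy cheap watches"),
    ("blog-b", "203.0.113.5", "Great post, thanks"),
])

def purge_confirmed_spam(ip, body):
    # Delete every comment matching a confirmed spam report; return the count.
    cur = conn.execute("DELETE FROM comments WHERE ip = ? AND body = ?", (ip, body))
    conn.commit()
    return cur.rowcount

print(purge_confirmed_spam("198.51.100.9", "Buy cheap watches"))  # 2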

These are just ideas, though.

John said...

An arms race of enumerating what's bad is a losing game, because there is now far more bad on the internet than good, and more popping up every day. Instead of trying to keep up with every bad-guy tactic and stay one step ahead, we assume that anything new is bad and disallow it, until we find that it is, in fact, good. We can report spammers and prosecute them all we want, but there will always be more. That's why we have to get things in perspective.

If you haven't read The Six Dumbest Ideas In Computer Security, you should.

Pstonie said...

The guy makes some good points in the article, though I don't agree with some others. I'd like to think that the rules are a bit different when looking at comment spam, but some basic principles still apply.

Personally, I think the internet should be as open as possible to people while being as secure as possible. I don't like the idea of the reader of a site being forced to prove that they're human.

I suppose it's like living in a fortress with a moat separating it from everyone, versus keeping the gates open and burying the murdered thieves in the back yard. I'd rather find a good way to screen out the bulk of the unwanted and deal with the ones that get through.