Spambots attacking my forum, need information on building a more secure system.

Welp, rather than making a gigantic explanation, here’s what’s basically happening:

Earlier this week I learned that Kemoy translates into “garbage” in Japanese, and I found a bunch of annoying little bots roaming around the forums.
Any help on creating a more secure system? I’m trying to avoid activation quizzes as much as possible, but if I have to do so, then I will. And yes, I am using SMF.

Make a unique register form.
Since java-gaming is popular, an activation quiz is the last resort against these bots.
On a lesser-known site, you could make the register form as simple as “type the following word in the textbox” and still stop these bots.

From most effective to least effective:

  1. Tell Google not to index your site (Easiest. Incredibly, incredibly effective)
  2. Manually activate each account and put them in different moderation groups if suspicious. (Very effective)
  3. Change your registration page name and make sure things like copyright notices won’t appear in search engines.
  4. Use a topic specific quiz. (Second easiest)
  5. Make website and signature inputs on registration hidden. Ban anyone that POSTs something to those inputs. (A honeypot)
  6. Refer to Stop Forum Spam.
  7. Use nofollow links.
  8. CAPTCHA (including generic quizzes)

Number 2 or 3 combined with another technique is much better than just 2 or 3.
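Item 5 in the list can be sketched in a few lines. This is a minimal, framework-agnostic sketch in Python, not SMF code; the field names are just the ones suggested above:

```python
# Hidden-field honeypot from item 5: render "website" and "signature"
# inputs on the registration form but hide them with CSS. A human never
# fills them in; a bot that auto-fills every field reveals itself.

HONEYPOT_FIELDS = ("website", "signature")

def is_honeypot_hit(form_data: dict) -> bool:
    """Return True if any hidden honeypot field was filled in."""
    return any(form_data.get(field) for field in HONEYPOT_FIELDS)

# A human leaves the hidden fields empty; a bot fills them in.
human = {"username": "alice", "website": "", "signature": ""}
bot = {"username": "x1", "website": "http://spam.example", "signature": "cheap pills"}
```

Anyone whose submission trips `is_honeypot_hit` can be rejected or banned at registration time.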

Alright, I’ve turned off search engine indexing, and I’ve also turned on e-mail activation.

I forgot how god-awful the SMF defaults were.

There are also useful plugins that block logins/posts/registrations from clients with no user agent or referer header, from proxy IPs, and from a few other obvious spam-bot tells.
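The kind of check those plugins do boils down to something like this (illustrative Python only; real plugins hook into SMF itself, and the false-positive caveats raised later in this thread apply):

```python
# Flag requests that are missing headers a normal browser always sends.
# Dumb spam scripts often send no User-Agent or Referer at all.

def missing_browser_headers(headers: dict) -> bool:
    """Return True if the request lacks a User-Agent or Referer header."""
    user_agent = headers.get("User-Agent", "").strip()
    referer = headers.get("Referer", "").strip()
    return not user_agent or not referer

browser_request = {"User-Agent": "Mozilla/5.0", "Referer": "https://forum.example/index.php"}
bot_request = {}  # a crude script that sends no headers
```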

But remember, nothing will stop someone legitimately signing up on your site, to then use an auto-posting spambot.
Even if you include CAPTCHAs, you can’t stop everything.

Generally it is better to reduce the value of spam on your site and put in a few obstacles to trip dumb spam bots than to try to profile bots. Anything you profile can be masked, in the most extreme case by running the bot in a real browser or using a real person. No matter how good the profiling is, people will still find a way to spam your site if the benefits outweigh the cost. Since spamming can be highly distributed (to the point of using zombie computers), you can still get frequent spam even if you slow down individual users.

You also run the risk of cutting out legitimate users with profiling: referer headers can be shut off in standard browsers, there are good reasons to use proxies, and you could have users on a client you do not recognize. If you have other methods to regulate spam, then adding profiling is more likely to end up blocking Richard Stallman from your site than an extra spammer.

Human registration with automated posting is not much of a problem. Someone has to look at your site and judge that it is worth spamming first, which it probably won’t be if it takes work to register and it seems well moderated, since spam can be removed in bulk when you have user ids to match it.

Honeypotting is the only profiling method that can be made reliable for automatic banning, since certain actions won’t give you a false positive. (Someone POSTing to login.php instead of signin.html - Someone visiting a forbidden directory only listed in robots.txt - Someone filling in fields that don’t exist - etc.)
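One of the traps listed above, the robots.txt-only directory, can be sketched like this. The directory name and the ban store are made up for illustration; the point is just that the path is listed in robots.txt and linked nowhere, so only a client deliberately ignoring robots.txt ever requests it:

```python
# robots.txt honeypot: a directory that appears ONLY as a Disallow rule.
# Any client requesting it is crawling paths it was told to avoid.

TRAP_PREFIX = "/staff-only/"   # robots.txt carries: Disallow: /staff-only/

banned_ips: set = set()

def check_trap(ip: str, path: str) -> bool:
    """Ban and report clients that request the forbidden directory."""
    if path.startswith(TRAP_PREFIX):
        banned_ips.add(ip)
        return True
    return False
```

Because no legitimate user can stumble into the trap, banning on a single hit is safe, which is exactly the low-false-positive property described above.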

Adding an e-mail confirmation helped a ton. Thanks a lot guys!

It’s a trap!

Didn’t know that term. Highly interesting technique, thanks for making me aware of it.

I don’t understand this point.

Also, your usage of the term ‘honeypot’ seems a little different to mine. A honeypot (for me) is something masquerading as a normal server/site but is actually a facade for the owner(s) to monitor/follow/log user activity… kinda like watching lab rats in a maze. I don’t see how this applies to securing forum software?

The goal of a honeypot is to profile connections with a low false-positive rate by catching someone in the act. Observation beyond that is of diminishing utility unless you use it as a source for a classification algorithm. The idea is to put your honeypot and your real service on the same site and ban bots caught by the honeypot.

If I have a phpBB forum and I see clients submitting requests to index.php?action=register, I know they came from an automated source. And if I see requests to ucp.php?mode=register when I use SMF, I also know I have an automated attack. If I create a page called ucp.php for the sole purpose of detecting forum spam bots on an SMF site and generate temporary bans of those IP addresses, then that is a honeypot on a server which also contains a real service. You might also create a honeypot by telling robots to avoid a certain directory and logging IPs of clients deliberately disobeying robots.txt, or by giving spam bots a spot to include a URL (for their user profile) that will never be seen by people and logging attempts to jump the gun and post links on your site.
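The ucp.php decoy could look roughly like this (a Python sketch, not real SMF integration; the 24-hour ban length is an arbitrary assumption):

```python
# Decoy registration page: an SMF site has no real ucp.php, so any client
# requesting it is a phpBB spam script probing the wrong software.

BAN_SECONDS = 24 * 60 * 60          # assumed ban length: one day
temp_bans: dict = {}                # ip -> timestamp when the ban expires

def hit_decoy(ip: str, now: float) -> None:
    """A request to the decoy registration page earns a temporary ban."""
    temp_bans[ip] = now + BAN_SECONDS

def is_banned(ip: str, now: float) -> bool:
    """Check whether an IP is currently serving a decoy-triggered ban."""
    expiry = temp_bans.get(ip)
    return expiry is not None and now < expiry
```

Temporary rather than permanent bans keep a shared or recycled IP from being locked out forever.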

Since spam comes from two sources (automated drive-by attacks which involve dumb robots, and targeted attacks that involve human judgement), you need something to protect you from the dumb robots and something to signal your site is not worth the effort. Dumb robots use profile URLs as part of their strategy because it gets their URL on your page without them needing to trick spam filters that look at links in posts. If your site does not have a website profile field, or you change its name, and a robot posts with the old parameter name anyway, you know it was a bot and can ban it at the time of registration. The honeypot is a single profile field that bots POST to by default, even though you may still provide legitimate users a different way to designate their website.
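The renamed-field trick above amounts to the following sketch (both field names are hypothetical; the stock name becomes the trap and real users get the renamed one):

```python
# Renamed-field honeypot: bots blindly POST to the stock parameter name,
# while the live form asks real users for a differently-named field.

STOCK_FIELD = "website"       # the default name dumb bots POST to
RENAMED_FIELD = "homepage"    # what the real registration form now uses

def registered_by_bot(post_data: dict) -> bool:
    """A registration that fills the stale default field came from a bot."""
    return bool(post_data.get(STOCK_FIELD))

legit_user = {"username": "alice", RENAMED_FIELD: "https://alice.example"}
dumb_bot = {"username": "x1", STOCK_FIELD: "http://spam.example"}
```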