Referrer and Comment spammers are a PITA.

This shouldn't be news to anyone - but Referrer and Comment spammers are a real pain in the a*se, polluting my web logs and making any meaningful log analysis problematic.

So, I now have an itch to scratch and I'm going to do something about it. I would encourage you, the reader, to do something about it too.

Firstly, get yourself over to Project Honey Pot and read up on the project. If you can, set up a Honey Pot or two yourself. Also be sure to read about the http:BL - this works along similar lines to the DNS blacklists used against email spammers.
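To make the DNS blacklist comparison concrete, here is a minimal Python sketch of how an http:BL lookup is formed and decoded. It assumes the published http:BL format (access key, then the reversed IPv4 octets, then `dnsbl.httpbl.org`, answered with `127.<days>.<threat>.<type>`). The access key and function names are hypothetical placeholders, and this is not the mod_perl module described in this post.

```python
import ipaddress

# Hypothetical placeholder: a real key is issued by Project Honey Pot.
ACCESS_KEY = "abcdefghijkl"

# Visitor-type bits in the last octet of the answer (0 means search engine).
VISITOR_TYPES = {1: "suspicious", 2: "harvester", 4: "comment spammer"}


def build_query(ip: str, key: str = ACCESS_KEY) -> str:
    """Return the DNS name to resolve for an IPv4 address."""
    octets = str(ipaddress.IPv4Address(ip)).split(".")  # validates the input
    return ".".join([key] + octets[::-1] + ["dnsbl.httpbl.org"])


def decode_response(answer: str) -> dict:
    """Split a 127.x.y.z answer into days, threat score and visitor types."""
    first, days, threat, vtype = (int(part) for part in answer.split("."))
    if first != 127:
        raise ValueError("not an http:BL answer: " + answer)
    if vtype == 0:
        types = ["search engine"]
    else:
        types = [name for bit, name in VISITOR_TYPES.items() if vtype & bit]
    return {"days": days, "threat": threat, "types": types}
```

The name from `build_query` would be handed to an ordinary resolver (for example `socket.gethostbyname`). A failed lookup means the address is not listed.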

Next, I'm going to write a general Apache mod_perl module which will integrate with the http:BL (performing lookups against it) and allow the user to "action"* the abusers. At a minimum, it will prevent the normal Apache log files from being polluted by diverting the log entries to a separate httpbl logfile.

* "action" - To provide flexibility, I'm thinking of running an external script with the IP of the abuser.  The script can then perform any action you wish. The one I'm going for is an iptables firewall block.
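As a rough illustration of such an "action" script, the Python sketch below validates the IP it is given and builds an iptables rule that drops traffic from it. The function names, the choice of the `INPUT` chain and the dry-run default are assumptions for illustration only. Actually running the command needs root privileges.

```python
import ipaddress
import subprocess


def block_command(ip: str, chain: str = "INPUT") -> list:
    """Build the iptables argument list that drops traffic from one IPv4 address."""
    addr = ipaddress.IPv4Address(ip)  # raises ValueError on anything that is not an IPv4 address
    return ["iptables", "-I", chain, "-s", str(addr), "-j", "DROP"]


def block(ip: str, dry_run: bool = True) -> list:
    """Return the command; only execute it when dry_run is False (needs root)."""
    cmd = block_command(ip)
    if not dry_run:
        # Passing a list (no shell) avoids shell injection via the IP argument.
        subprocess.run(cmd, check=True)
    return cmd
```

Validating the address before it reaches the command line matters here, because the IP comes from a process that is handling hostile traffic.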

Comments and suggestions welcome.

Project Honey Pot has implementations for several languages, including PHP and Perl (the languages that mean most to me).  There may be an implementation for your Web application so you might not be interested in what I'm doing at all :)


3 Comments

Hey, I like this post. Keep them coming. thanks

Oh the irony.... I approved this "spam" comment to prove the point (after editing "jade's" link to not work).

Posting IP: 122.36.165.202

My module log shows:
[Tue Nov 30 05:50:13 2010] HTTPBL: 122.36.165.202 (8) "127.22.8.1" "www.pgregg.com" "/mt/mt-comments.cgi" "http://pgregg.com/blog/2010/11/referrer-and-comment-spammers-are-a-pita.html" [HTTPBL:8]

This would have been caught, and when I get the firewalling going, stopped.
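For anyone wanting to analyse a log in this format, a small Python sketch of a parser follows. The field names (`host`, `path`, `referrer` and so on) are inferred from the sample line above and are not taken from the module itself.

```python
import re

# Field names are inferred from the sample log line, not from the module.
LINE_RE = re.compile(
    r'^\[(?P<timestamp>[^\]]+)\] httpbl: (?P<ip>\S+) \((?P<score>\d+)\) '
    r'"(?P<response>[^"]*)" "(?P<host>[^"]*)" "(?P<path>[^"]*)" '
    r'"(?P<referrer>[^"]*)" \[HTTPBL:(?P<tag>\d+)\]$',
    re.IGNORECASE,  # the samples use both "HTTPBL:" and "httpBL:"
)


def parse_line(line: str):
    """Return a dict of fields for one httpbl log line, or None if it does not match."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    fields = match.groupdict()
    fields["score"] = int(fields["score"])
    fields["tag"] = int(fields["tag"])
    return fields
```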

I have victory in phase 1 and 2.

Phase 1: Implement an http:BL module to identify the "bad guy" and divert the log entry into httpbl.log instead of access_log.

Phase 2: I wanted to automatically firewall the malicious IP address, but the Apache UID is unable to use iptables (it would seem prudent for iptables to allow a specific CHAIN to be created and permit another UID to control it - shame no one has done it yet). My options then became some form of message passing to a root daemon, or a suid-root helper. I was happy with neither, so I implemented an Apache-level firewall via another mod_perl plugin :)

Logs:
==> httpbl.log
[Tue Nov 30 23:36:22 2010] httpBL: 113.22.131.111 (13) "127.48.15.1" "www.pgregg.com" "/projects/php/preg_find/preg_find.phps" "http://www.dslreports.com/forum/r19430990-PHP-link-generator" [HTTPBL:13]
[Tue Nov 30 23:36:23 2010] httpBL: 113.22.131.111 (13) "127.48.15.1" "www.pgregg.com" "/favicon.ico" "" [HTTPBL:13]
[Tue Nov 30 23:36:24 2010] httpBL: 113.22.131.111 (13) "127.48.15.1" "-" "-" "-" [HTTPBL:13]

==> error_log
IP 113.22.131.111 is blocked

==> httpbl.log
[Tue Nov 30 23:36:26 2010] httpBL: 113.22.131.111 (13) "127.48.15.1" "-" "-" "-" [HTTPBL:13]

Notice that none of these made it to the normal Apache access_log. Clients also tend to open 3-4 simultaneous connections, so the firewall may not be in place in time to stop all of them: the other connections are already running in parallel, and the firewall plugin acts right at the connection handling stage. Here, however, we can see the firewall kick in and catch the last one. This IP will now be firewalled for 13 days (the score), after which the block will be removed (and can be recreated by the logging plugin if necessary).
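The "block for as many days as the score" rule can be sketched in a few lines of Python. This is an illustrative in-memory table with a hypothetical class name, not the mod_perl plugin itself. A real multi-process Apache setup would need the table in shared storage (a file or database) so that every child process sees the same blocks.

```python
import time

SECONDS_PER_DAY = 86400


class BlockTable:
    """Remember blocked IPs, each with an expiry derived from its score in days."""

    def __init__(self):
        self._expiry = {}  # ip -> UNIX time at which the block lapses

    def block(self, ip, score, now=None):
        """Block ip for `score` days, counted from `now` (defaults to the current time)."""
        now = time.time() if now is None else now
        self._expiry[ip] = now + score * SECONDS_PER_DAY

    def is_blocked(self, ip, now=None):
        """Return True while the block is active; drop the entry once it has expired."""
        now = time.time() if now is None else now
        expires = self._expiry.get(ip)
        if expires is None:
            return False
        if now >= expires:
            del self._expiry[ip]  # expired; a later detection can recreate it
            return False
        return True
```

A connection-stage check would call `is_blocked` for each incoming client, while the logging stage would call `block` whenever a lookup flags an address.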


About this Entry

This page contains a single entry by Paul Gregg published on November 28, 2010 11:18 PM.
