Web Hosting Talk







my 10 cents with Referer Spam


scaturan
09-30-2005, 09:31 PM
hi folks,

Referer Spam has always been a problem on my server. for those familiar with the PixelPost publishing platform, versions prior to 1.4.2 have been cursed with the Referer Spam plague. once the bots get hold of your URL, you'd better gear up or your Apache logs will fill with garbage faster than you can blink. worst of all, the problem is no longer limited to PixelPost sites. :)

basic requirements:

Apache 1.3.33
mod_security
UNIX-like OS (freebsd, linux, etc..)
full access to "httpd.conf" (the Apache configuration file)
access to invoke "apachectl"
basic understanding of Apache's SetEnvIf & CustomLog directives.
basic understanding of mod_security directives.

the goal:

to sanitize access_log per <virtualhost> block and pipe offending Referer output to a single log file for real-time or later analysis.

1st step:

the following directories must be writeable by the user Apache runs as.

/usr/local/etc/apache/logs/global/
/usr/local/etc/apache/logs/cheese/

in the configuration below, substitute 67.81.25.74 with a valid IP address specific to your setup.

2nd step:

set these globally, meaning outside of any <virtualhost> or <directory> blocks in httpd.conf

ErrorLog "/usr/local/etc/apache/logs/global/error_log"
CustomLog "/usr/local/etc/apache/logs/global/access_log" common env=!do_not_log
CustomLog "/usr/local/etc/apache/logs/global/412_log" lamerbouncer env=do_not_log

SetEnvIf Referer ".offendingword" do_not_log
SetEnvIf Referer "offendingword." do_not_log
SetEnvIf Referer "offendingword-" do_not_log
SetEnvIf Referer "-offendingword" do_not_log

SecFilterSelective "HTTP_REFERER" "offendingword."
SecFilterSelective "HTTP_REFERER" ".offendingword"
SecFilterSelective "HTTP_REFERER" "offendingword-"
SecFilterSelective "HTTP_REFERER" "-offendingword"

SetEnvIf Referer ".offendingdomain" do_not_log
SetEnvIf Referer "offendingdomain." do_not_log
SetEnvIf Referer "offendingdomain-" do_not_log
SetEnvIf Referer "-offendingdomain" do_not_log

SecFilterSelective "HTTP_REFERER" "offendingdomain."
SecFilterSelective "HTTP_REFERER" ".offendingdomain"
SecFilterSelective "HTTP_REFERER" "offendingdomain-"
SecFilterSelective "HTTP_REFERER" "-offendingdomain"

SecFilterDefaultAction "deny,status:412"
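
The CustomLog lines above refer to a log format nickname, lamerbouncer, that the post never defines, and the mod_security filters only take effect when the filter engine is switched on. A minimal sketch of the two missing pieces, to sit in the same global section. The field list is an assumption (it mirrors Apache's stock "combined" format so that the Referer is actually recorded), not the author's original definition:

```apache
# hypothetical definition of the "lamerbouncer" nickname; the fields are an
# assumption and simply copy the stock "combined" format so that the Referer
# and User-Agent headers end up in 412_log
LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\"" lamerbouncer

# mod_security 1.x does not apply SecFilter/SecFilterSelective rules
# unless the engine is enabled
SecFilterEngine On
```

If the nickname is left undefined, Apache should treat the word as a literal format string, so every line in 412_log would just read "lamerbouncer".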

3rd step:

have a subdomain set up like this:

UseCanonicalName On
NameVirtualHost 67.81.25.74:80
Listen 67.81.25.74:80

<VirtualHost 67.81.25.74:80>
ServerName cheese.doodles.tld
DocumentRoot "/usr/home/cheese/htdocs/"
ErrorLog "/usr/local/etc/apache/logs/cheese/error_log"
CustomLog "/usr/local/etc/apache/logs/cheese/access_log" common env=!do_not_log
CustomLog "/usr/local/etc/apache/logs/global/412_log" lamerbouncer env=do_not_log
</VirtualHost>


4th & final steps:

as root, type: apachectl configtest
if you get: Syntax OK
then as root, type: apachectl graceful

spawn a window for each tail session:

tail -f /usr/local/etc/apache/logs/global/412_log
tail -f /usr/local/etc/apache/logs/global/access_log

analysis:

without the SetEnvIf directives, by default, the Referer spam is piped into access_log. have you ever seen 25 different IP addresses trying to use the same Referer at the same time? it's not pretty. :)

without the SecFilterSelective & SecFilterDefaultAction directives, the spam is only kept out of access_log; the request is still served normally instead of being rejected with a 412 response.

with all those components working together, you sanitize your access_log of Referer spam and pipe it into 412_log for later analysis.

if there's a particular filename that gets pounded a lot, let's say cheese.doodles.tld/culprit.php, you can also use:

SetEnvIf Request_URI "culprit.php" do_not_log
SecFilterSelective "THE_REQUEST" "culprit.php"
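
Both patterns above are regular expressions, so the unescaped dot in "culprit.php" matches any character and the pattern can match anywhere in the string. A slightly tighter variant, offered as a sketch rather than a drop-in replacement (the leading slash assumes the file sits at the top of the site, as in the example URL):

```apache
# escape the dot so it only matches a literal "." and require the leading
# slash so that names like "notculprit.php" are not caught by accident
SetEnvIf Request_URI "/culprit\.php" do_not_log
SecFilterSelective THE_REQUEST "/culprit\.php"
```

Keep in mind that, exactly like the original pair, this denies every request for that file, not only the ones carrying a spammy Referer.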

i hope some of you might find it useful. feel free to post corrections if need be. =)

shashankw
10-03-2005, 02:22 PM
Looks good. proper application of mod_security ;)

scaturan
10-04-2005, 06:42 AM
Originally posted by shashankw
Looks good. proper application of mod_security ;)

thanks. new fancy hostnaming schemes pop up daily. more filters to create while more Referer Spam slips through, more time wasted. :/

i get paid barely over the minimum wage to create filters to deal with the problem; meanwhile the spammers get thousands and millions. heh, i think i'm going to retire soon from web hosting and just work at micky d's. flipping burgers, anyone? :)

PhilG
10-06-2005, 07:34 PM
It's a good tutorial, scaturan. However, can I suggest that instead of sending them to a domain on your server, you redirect them back to their own site, causing them to suffer the pains of wasted CPU usage......

They will soon take you off their list ;-)

scaturan
10-06-2005, 09:19 PM
Originally posted by PhilG
It's a good tutorial, scaturan. However, can I suggest that instead of sending them to a domain on your server, you redirect them back to their own site, causing them to suffer the pains of wasted CPU usage......

They will soon take you off their list ;-)

i wonder if they actually follow the URL redirect. after all, all they're after is to have the referring URL written to your log files.

it would be good to see mod_rewrite, mod_security and conditional logging tackle this problem in one shot, making it easy to create rulesets without being cryptic. basically, the goal for every match is to send a "412 precondition failed" http response, redirect them to the Referer URL they provided and pipe the Referer spam to a separate log file.

lamp
11-26-2005, 04:36 PM
instead of sending them to a domain on your server, redirect them back to their own site

How would you do that?

Thanks.
Lamp

PhilG
11-26-2005, 05:16 PM
I'm not sure of the exact code, but it can be done easily with mod_rewrite.
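
For reference, a minimal sketch of what such a mod_rewrite rule could look like. It assumes mod_rewrite is loaded and that the lines go inside the <VirtualHost> block; "offendingword" is the same placeholder used in the tutorial, and none of this was posted in the thread:

```apache
RewriteEngine On

# only act when the Referer looks like a full URL whose host contains the word
RewriteCond %{HTTP_REFERER} ^https?://[^/]*offendingword [NC]

# send the client back to whatever URL it claimed as its Referer
RewriteRule .* %{HTTP_REFERER} [R,L]
```

A request only gets one response, so for any given word this would be used instead of the 412 deny, not together with it.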

lamp
11-28-2005, 11:12 AM
I'm not sure of the exact code, but it can be done easily with mod_rewrite.

Thanks. I actually figured out how to do the redirect with mod_security.

Lamp
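
Lamp did not post the rule, so the following is only a guess at the general shape. mod_security 1.x accepts a per-rule action list as a third argument, and one of the available actions is redirect. The target URL here is a made-up placeholder:

```apache
# on a match, answer with a redirect to a fixed URL instead of applying
# the default action; the address below is a hypothetical placeholder
SecFilterSelective HTTP_REFERER "offendingword" "redirect:http://www.offendingdomain.tld/"
```

The redirect target in this form is a fixed string, so bouncing each client back to the exact Referer it supplied would need something like mod_rewrite instead.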

linux-tech
12-07-2005, 08:33 PM
Great tutorial here;)
Keep in mind one thing though.
When you do something like this, you're slowing apache down considerably. If the word list / ip list is too long, you will end up adding a good bit of load to the server itself, which isn't good.

Not that this isn't a good idea or tutorial, but it's definitely something to keep in mind when using this.
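
One way to keep the rule count down is to fold the four variants per word into a single pattern. This is a sketch under the assumption that the intent of the original four lines was "the word with a dot or dash on either side"; inside a bracket expression the dot is literal, whereas the unescaped dot in the original patterns matches any character:

```apache
# one case-insensitive SetEnvIf per word instead of four
SetEnvIfNoCase Referer "([.-]offendingword|offendingword[.-])" do_not_log

# the matching mod_security rule, again one line instead of four
SecFilterSelective HTTP_REFERER "([.-]offendingword|offendingword[.-])"
```

Fewer directives means fewer regex evaluations per request, which should help with the load concern, though the actual saving would have to be measured on the server in question.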

CartHost
12-09-2005, 01:54 PM
heh, why don't you guys just use robots.txt :)

linux-tech
12-09-2005, 03:06 PM
heh, why don't you guys just use robots.txt :)
robots.txt has absolutely nothing to do with referrers. It has everything to do with robots and search engines, but not referrers and blocking those that cause problems.

CartHost
12-09-2005, 03:16 PM
hrmph, maybe I am thinking of some different kind of referral spam than what you are talking about.