9/09/2007

SpamBot Killers on the Rampage

"Am I afraid of high notes? Of course I am afraid! What sane man is not?" -- Pavarotti

If you run a blog or website and you are carefully watching your hits, more often than not you should be able to identify bots attempting to do all kinds of nasty, evil things at your site.

Many of these absolutely will not obey the robots.txt file (although they may request it just to see whether there is any material for nastiness there). They try all kinds of schemes: SQL injection attacks, blog comment spam, trackback spam, and the like. Some of the stuff they try (and I have real-time database logging, so it's easy to amuse myself) is either hilarious or pitiful - depending on your frame of reference!

Fortunately, it is relatively easy to head these nasties off at the pass -- almost all this junk traffic emanates from IP addresses that don't change. So I keep a banned-IP list that's easy for me to edit (a semicolon-separated appSettings entry), and in Global, in Application_PreRequestHandlerExecute, an "IsBannedIP" method looks up the requesting address in that list and kills the request before it ever gets to a Page handler. Of course, if you wanted to get really nasty in return you could initiate the Ping of Death on them or some such foolishness, but I would not recommend it. Blocking won't stop the requests from arriving, because these bots are trolling Weblogs.com and other RPC services for "new stuff" to do their nasties on all the time. But at least they don't get in.

Here is some sample code that does the trick -- all this is in global.asax:




// global.asax -- needs System, System.Configuration and System.Web

public static string[] BannedIPs = new string[0];

// Reads the semicolon-separated "bannedIPs" appSettings entry.
protected void LoadBannedIPs()
{
    // A missing key means nothing is banned, rather than a NullReferenceException.
    string strBanned = ConfigurationManager.AppSettings["bannedIPs"] ?? "";
    BannedIPs = strBanned.Split(new char[] { ';' }, StringSplitOptions.RemoveEmptyEntries);
}


protected void Application_Start(object sender, EventArgs e)
{
    LoadBannedIPs();
}


public static bool IsBannedIP(string strIP)
{
    foreach (string s in BannedIPs)
    {
        // Trim so "1.2.3.4; 5.6.7.8" in the config still matches; stop at the first hit.
        if (strIP == s.Trim()) return true;
    }

    return false;
}

protected void Application_PreRequestHandlerExecute(object sender, EventArgs e)
{
    if (IsBannedIP(Request.UserHostAddress))
    {
        // Log the hit (IP, URL and user agent) before getting rid of it.
        PAB.ExceptionHandler.ExceptionLogger.HandleException(
            new Exception("Banned IP:" + Request.UserHostAddress + ": " +
                Request.RawUrl + ":UA=" + Request.UserAgent));
        // this ought to fix them...
        HttpContext.Current.Response.Redirect("http://www.yahoo.com");
    }
}


The "PAB.ExceptionHandler" code you see there is just my custom logging facility into the database. Here's one of the baddies that's on my banned list: 170.224.8.126. If you do a Google search on the IP address you will see it show up in a lot of logs that have identified it as a nastybot. You do want to be careful with stuff like this, though: I once put Googlebot into my baddies list by mistake. If you log your hits with user-agent, IP address, referer and other key information, and keep it all in a database table that's easy to view with an admin page, that makes things much easier. Above all, you want to be able to look at the entire request URL including the querystring, since that's what will really tell you whether the visitor has good intentions or not.
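If the banned list grows, or you are worried about repeating the Googlebot mistake, the lookup can be made a little more defensive. What follows is only a minimal sketch, assuming .NET 3.5's HashSet<T> is available. The BanList class and the idea of a separate allowlist are hypothetical and are not part of the global.asax code above, and 192.0.2.10 is a documentation-range placeholder, not a real crawler address.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of the global.asax code above.
public class BanList
{
    private readonly HashSet<string> banned = new HashSet<string>(StringComparer.Ordinal);
    private readonly HashSet<string> allowed = new HashSet<string>(StringComparer.Ordinal);

    // Both arguments use the same semicolon-separated format
    // as the "bannedIPs" setting.
    public BanList(string bannedList, string allowedList)
    {
        Fill(banned, bannedList);
        Fill(allowed, allowedList);
    }

    private static void Fill(HashSet<string> target, string list)
    {
        if (list == null) return;
        foreach (string entry in list.Split(';'))
        {
            string ip = entry.Trim();
            if (ip.Length > 0) target.Add(ip);
        }
    }

    // An allowlisted address is never treated as banned, even if it
    // was added to the banned list by mistake.
    public bool IsBanned(string ip)
    {
        if (string.IsNullOrEmpty(ip)) return false;
        if (allowed.Contains(ip)) return false;
        return banned.Contains(ip);
    }
}

public static class Program
{
    public static void Main()
    {
        // 170.224.8.126 is the address mentioned above; 192.0.2.10 stands in
        // for a crawler you trust that was banned by accident.
        BanList list = new BanList("170.224.8.126; 192.0.2.10", "192.0.2.10");
        Console.WriteLine(list.IsBanned("170.224.8.126")); // True
        Console.WriteLine(list.IsBanned("192.0.2.10"));    // False
        Console.WriteLine(list.IsBanned("203.0.113.5"));   // False
    }
}
```

In global.asax you could build one BanList in Application_Start and call IsBanned(Request.UserHostAddress) where IsBannedIP is called now. The set lookup avoids scanning the whole array on every request, and the allowlist gives you a safety net for crawlers you actually want.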

This sort of thing is about the next worst thing to outright plagiarism of your content. Just last week I emailed somebody who'd copied a blog post of mine and republished it verbatim -- minus any trace of attribution to the source. I told the guy, "look - if you are going to lift other people's content, at least take the time to massage it around and add some value, OK?"

Such is life on the Internet.