SpamBot Killers on the Rampage

"Am I afraid of high notes? Of course I am afraid! What sane man is not?” -- Pavarotti

If you run a blog or website and keep a careful eye on your traffic, sooner or later you will spot bots attempting to do all kinds of nasty, evil things at your site.

Many of these absolutely will not obey the robots.txt file (although they may request it, just to see if there is any material for nastiness there). They perpetrate all kinds of schemes: trying out SQL injection attacks, attempting to post blog spam and trackbacks, and the like. Some of the stuff they try (and I have real-time database logging, so it's easy to amuse myself) is either hilarious or pitiful, depending on your frame of reference!

Fortunately, it is relatively easy to head these nasties off at the pass -- almost all of this junk traffic emanates from IP addresses that don't change. So I keep a BADIP list that's easy for me to edit, and in Global, in Application_PreRequestHandlerExecute, I call an "IsBannedIP" method that looks the requester up in that list and kills the request before it ever reaches a Page handler. Of course, if you wanted to get really nasty in return, you could initiate the Ping of Death on them or some such foolishness, but I would not recommend it; heading them off at the pass is enough. This won't stop the requests, because these bots are trolling Weblogs.com and other RPC services for "new stuff" to work their nasties on all the time. But at least they don't get in.

Here is some sample code that does the trick -- all of this goes in global.asax:

// In global.asax.cs -- requires using System.Configuration; for ConfigurationManager

public static string[] BannedIPs;

// Load the semicolon-delimited banned-IP list from web.config appSettings.
protected void LoadBannedIPs()
{
    string strBanned = ConfigurationManager.AppSettings["bannedIPs"];
    BannedIPs = String.IsNullOrEmpty(strBanned) ? new string[0] : strBanned.Split(';');
}

protected void Application_Start(object sender, EventArgs e)
{
    LoadBannedIPs();
}

// Returns true if the requesting address is on the banned list.
public static bool IsBannedIP(string strIP)
{
    foreach (string s in BannedIPs)
    {
        if (strIP == s) return true;   // exact match -- bail out early
    }
    return false;
}

protected void Application_PreRequestHandlerExecute(object sender, EventArgs e)
{
    if (IsBannedIP(Request.UserHostAddress))
    {
        // Log the offending request via the custom database logger.
        PAB.ExceptionHandler.ExceptionLogger.HandleException(
            new Exception("Banned IP:" + Request.UserHostAddress + ": " +
                Request.RawUrl + ":UA=" + Request.UserAgent));
        // this ought to fix them...
        HttpContext.Current.Response.Redirect("http://www.yahoo.com");
    }
}
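The code above expects the list as a semicolon-delimited appSettings value in web.config, along these lines (the first address is the real offender mentioned below; the second is just a documentation placeholder):

<appSettings>
  <add key="bannedIPs" value="170.224.8.126;192.0.2.1" />
</appSettings>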


The "PAB.ExceptionHandler" code you see there is just my custom logging facility into the database. Here's one of the baddies that's on my banned list: 170.224.8.126. If you do a google search on the IP address you will see it show up in a lot of logs that have identified it as a nastybot. You do want to be careful with stuff like this though. I once put the googlebot into my baddies list by mistake. If you log your hits with user-agent, IP Address, referer and other key information and keep it in a database table that's easy to view with an admin page, that makes things much easier. Especially, you want to be able to look at the entire request URL including querystring, since that's what will really tell you whether it's got good intentions or not.

This sort of thing is the next worst thing to outright plagiarism of your content. Just last week I emailed somebody who'd copied a blog post of mine and republished it verbatim -- minus any trace of attribution to the source. I told the guy, "Look - if you are going to lift other people's content, at least take the time to massage it around and add some value, OK?"

Such is life on the Internet.

Comments

  1. Anonymous 4:24 PM

    Good idea.

    One of my biggest concerns with a routine like that is performance, since every page will hit it.

    With that in mind, I would get rid of the array manipulations and searching.

    Load the string from the web.config file, and keep it as a string. Then test it using:

    if (strBanned.Contains(Request.UserHostAddress)) {...}

    Better on memory, better performance, simpler coding. No?

  2. You could do that, or you could load it into a Hashtable, which might be even faster. It's just sample code. In the big scheme of things, unless you have a very large list, it's not a big deal.

  3. Anonymous 9:02 PM

    ...or a lot of traffic! (Which I do.)

    I understand about the sample code thing, no biggie. Some people just tend to cut & paste these things, so they might want to know about a potential issue before they do that.

    I try to avoid for/each loops in high-traffic places, although it is more of a VB issue than a C# issue currently.

  4. Probably, then, a generic List of type string with its Contains method might be the fastest; I'd be interested in seeing some timing reports (a set-based sketch follows below the comments). Bottom line, though, is that you have to process each request if you are going to do this, whether you do so at the IIS level or when the request reaches ASP.NET.

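Following up on that exchange, here is a minimal sketch of a set-based lookup. HashSet<string> requires .NET 3.5; on .NET 2.0, a Dictionary<string, bool> keyed on the address gives the same constant-time behavior. This is an illustration, not necessarily what runs on this site:

// requires using System; using System.Collections.Generic; using System.Configuration;
public static HashSet<string> BannedIPSet;

protected void LoadBannedIPs()
{
    string strBanned = ConfigurationManager.AppSettings["bannedIPs"] ?? "";
    // HashSet.Contains is O(1); scanning an array or List<string> is O(n).
    BannedIPSet = new HashSet<string>(
        strBanned.Split(new char[] { ';' }, StringSplitOptions.RemoveEmptyEntries));
}

public static bool IsBannedIP(string strIP)
{
    return BannedIPSet.Contains(strIP);
}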
