APLawrence.com -  Resources for Unix and Linux Systems, Bloggers and the self-employed

Script to block DOS attacks

People steal content. If you run a website, you almost certainly know that: if you have a good page about almost any subject, someone is sure to copy it to their own site.

Sometimes people just ignorantly think that's acceptable. It isn't: when you create a webpage, you automatically have the right to prevent unauthorized use of that work. Unfortunately, it's not all that easy to enforce those rights unless that particular page is producing significant income. If the thief uses Google Adsense, you can usually get Google to convince them they should remove the content, and even without that you can ask Google to remove their page from search results, but doing anything else usually isn't worth the trouble.

That doesn't mean you shouldn't try. Often an honest site owner is happy to take down unauthorized pages that they have obtained from the person who actually did the theft. There is certainly no harm in writing and asking; it really can be that easy.

It isn't always that easy, though, and one such case led me to write this DOS (Denial of Service) blocking script.

I recently had a call from a customer running an old CentOS system. He had experienced a software RAID failure and asked me for advice.

It's been almost a decade since I last touched a CentOS system of that vintage. I knew that I had an article about that at "Rebuilding failed Linux software RAID", but I wanted to see what else I could find in case my old article wasn't enough. So, I did a bit of Googling.

Imagine my surprise when one of the first things that turned up was a near mirror copy of my article. Oh, the thief had left out a few paragraphs, but there is no doubt he copied it from me. I made a mental note to come back to that later and continued trying to find resources for my customer.

After that crisis had passed, I returned to this, found the contact info for the website owner and sent a polite letter requesting removal:


Your http://xyz .. xx/knowledgebase.php?action=displayarticle&id=11
is copied from my http://aplawrence.com/Linux/rebuildraid.html without permission. Please remove it ASAP.

I received a very quick reply:


The only similarities that are there is the same commands, So unless you own "Liinux" Witch is impossible because its open source then this isn't a valid Copyright violation, If you insist that this is copied from your website please send a court ordered takedown notice of this information.

Funny guy. Good speller, too. Without giving it another thought, I submitted the proper forms asking Google to remove his page from their index. If that goes through as it should, I'd then contact whoever owns the site he uses for hosting and present a DMCA.

I warned him that I'd be searching his site for other violations and advising everyone I know that he is a likely content thief and that they should check for their own pages. I expected to hear nothing more from him, but he wasn't done (again, presented as written):


Infact your probally copied our knowledge base, Let me send you a DMCA, Also Ill get our Lawyer to send you a court order through to remove from your site as I see multiple copyright issues. with you and your failed attempt.

I wrote back wishing him luck.

Minutes later, I noticed a DOS attack on my website:

xx.xx.183.83 - - [01/Jun/2013:11:14:40 -0400] "-" 408 0 "-" "-"
xx.xx.183.83 - - [01/Jun/2013:11:14:40 -0400] "-" 408 0 "-" "-"
xx.xx.183.83 - - [01/Jun/2013:11:14:40 -0400] "-" 408 0 "-" "-"
xx.xx.183.83 - - [01/Jun/2013:11:14:40 -0400] "-" 408 0 "-" "-"
 

There were almost 7,000 lines like that in my web server log within a span of 20 minutes, and it WAS slowing my server down. I checked the IP and was not surprised to find that it came from the same site that had stolen the content. I quickly added a DROP rule to iptables for that IP and my site returned to normal health.
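For anyone wanting to spot this pattern in their own logs, here is a minimal sketch that tallies, per client IP, the lines with an empty request ("-") and a 408 status, as in the excerpt above. It is not what I ran at the time; the function name tally_empty_408 and the sample addresses (taken from the documentation ranges) are made up for illustration, and it assumes the usual Apache common or combined log layout.

```perl
use strict;
use warnings;

# Count lines per client IP where the request line is empty ("-")
# and the status is 408. Assumes Apache common/combined log format:
#   ip ident user [timestamp zone] "request" status bytes ...
sub tally_empty_408 {
    my (@lines) = @_;
    my %hits;
    for my $line (@lines) {
        next unless $line =~ /^(\S+) \S+ \S+ \[[^\]]+\] "-" 408 /;
        $hits{$1}++;
    }
    return %hits;
}

# Hypothetical sample data; a real run would read the access log instead.
my @sample = (
    '192.0.2.83 - - [01/Jun/2013:11:14:40 -0400] "-" 408 0 "-" "-"',
    '192.0.2.83 - - [01/Jun/2013:11:14:40 -0400] "-" 408 0 "-" "-"',
    '198.51.100.7 - - [01/Jun/2013:11:14:41 -0400] "GET / HTTP/1.1" 200 512 "-" "-"',
    '192.0.2.83 - - [01/Jun/2013:11:14:41 -0400] "-" 408 0 "-" "-"',
);

my %hits = tally_empty_408(@sample);

# Busiest offenders first.
for my $ip (sort { $hits{$b} <=> $hits{$a} } keys %hits) {
    print "$ip $hits{$ip}\n";
}
```

An IP that shows up thousands of times in such a tally is a reasonable candidate for a manual DROP rule.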

I also looked up the owner of the netblock and sent a complaint to their "abuse" address. They investigated, but he had stopped trying by then and of course denied ever having done such a thing!

I realized that this jerk might very well try again from some other IP, so I wrote a simple script that looks for that sort of activity. After I wrote it, I realized that it will also stop some careless people from doing massive and fast parallel website downloads, which is also activity I wish to discourage.

I do run Fail2ban here, but it didn't catch it. I don't know why, and it was quicker for me to write this script than to figure out why!

Here is the script:

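#!/usr/bin/perl
# Scan the access log and DROP any IP that makes too many requests in one second.
open(I, "<", "/root/logs/access.log") or die "Cannot open access log: $!";

# Collect the IPs that iptables already drops (source address is the 4th column).
@f = `/sbin/iptables --list -n | grep DROP`;
foreach (@f) {
  @s = split /\s+/;
  $dropped{$s[3]} = $s[3];
}

$last = "";
while (<I>) {
  @s = split /\s+/;
  # we've already dropped this ip
  next if $dropped{$s[0]};
  # anything other than GET or POST counts as "not normal"
  $normal = 0;
  $normal = 1 if /GET/;
  $normal = 1 if /POST/;
  # key is the IP plus the timestamp, which has one-second resolution
  $m = "$s[0] $s[3]";
  if ($m eq $last) {
    $count{$m}++;
    $count{$m} += .3 if not $normal;
    $count{$m} += .33 if /POST/;
    if ($count{$m} > 30) {
      print "$m $count{$m}\n";
      # list form avoids passing the log-derived address through a shell
      system("/sbin/iptables", "-A", "INPUT", "-s", $s[0], "-j", "DROP");
      $dropped{$s[0]} = $s[0];
    }
  }
  $last = $m;
}
close(I);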

Notice that "$normal" flag? It adds a bit extra to the score when people are doing HEAD requests or anything else that isn't GET or POST. POSTs get an extra bump also, because mine is not a website that should get a lot of repeated POST activity.

Some people don't like to allow anything but GET and POST (Google for "apache disable head request" to see discussions). However, HEAD, which once was used mostly for checking links, has taken on new uses in the age of social media. See "What happens when you post a link on Twitter?", for example. That reference to "unwindfetchor" has to do with Gnip.com, one of several companies apt to be issuing HEAD requests for social media tracking.

So, by weighting my scoring, I avoid shunning this sort of activity entirely, but can still control abuse.
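To make the weighting concrete, here is a small sketch that pulls the scoring out of the script into a function and works out how many repeated hits within the same second it takes to cross the threshold. The names hit_weight and hits_to_block are invented for this illustration and the log lines are made up; the weights (1, plus .3 for anything that isn't GET or POST, plus .33 for POST) are the ones from the script.

```perl
use strict;
use warnings;

# Score added for one repeated hit, using the same weights as the script.
sub hit_weight {
    my ($line) = @_;
    my $w = 1;
    $w += 0.3  unless $line =~ /GET/ || $line =~ /POST/;
    $w += 0.33 if $line =~ /POST/;
    return $w;
}

# Number of repeated hits needed before the score exceeds $threshold.
sub hits_to_block {
    my ($line, $threshold) = @_;
    my ($score, $hits) = (0, 0);
    while ($score <= $threshold) {
        $score += hit_weight($line);
        $hits++;
    }
    return $hits;
}

# Hypothetical log lines, one per request method.
my %sample = (
    GET  => '192.0.2.1 - - [01/Jun/2013:11:14:40 -0400] "GET / HTTP/1.1" 200 512 "-" "-"',
    POST => '192.0.2.1 - - [01/Jun/2013:11:14:40 -0400] "POST /form HTTP/1.1" 200 512 "-" "-"',
    HEAD => '192.0.2.1 - - [01/Jun/2013:11:14:40 -0400] "HEAD / HTTP/1.1" 200 0 "-" "-"',
);

for my $method (sort keys %sample) {
    printf "%-4s blocked after %d repeated hits\n",
        $method, hits_to_block($sample{$method}, 30);
}
```

With a threshold of 30 that works out to 31 repeats for GET, 24 for HEAD and 23 for POST. The first line of each second only sets the comparison value in the script, so the actual number of requests is one more than that.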

The choice of 30 quick accesses to block is arbitrary. I think that represents fairly aggressive activity. I'll run that from cron and see if it needs tweaking, but it seems to be about right. Reasonable activity is unaffected, unreasonable gets blocked.

Though I do have to note that Microsoft Bing spiders far too aggressively. Google does far less; Bing comes at me very quickly and takes far more. They are an annoyance, frankly, but some fools do use them for search, so I can't block them.

The question arises as to when to unblock. You don't want to fill up iptables with hundreds of DROP rules, and the offenders often change IP addresses, which means an innocent person may acquire their IP later. This code ignores that. I examine blocked IPs anyway because I also block particularly persistent comment spammers; when they age a bit, I unblock them.
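If you wanted to automate that aging, one possible approach is sketched below. It assumes you record the time each IP was blocked, which the script above does not do; the expired_blocks function, the %blocked_at record and the 30-day limit are all hypothetical. The iptables -D form deletes a rule matching the given specification, and the sketch only prints the commands rather than running them.

```perl
use strict;
use warnings;

# Return the IPs whose block is older than $max_age seconds, oldest first.
sub expired_blocks {
    my ($blocked_at, $now, $max_age) = @_;
    return sort { $blocked_at->{$a} <=> $blocked_at->{$b} }
           grep { $now - $blocked_at->{$_} > $max_age }
           keys %$blocked_at;
}

# Hypothetical record of when each IP was blocked (epoch seconds).
my $now = time;
my %blocked_at = (
    '192.0.2.10'   => $now - 45 * 86400,
    '198.51.100.7' => $now -  2 * 86400,
    '203.0.113.99' => $now - 31 * 86400,
);

for my $ip (expired_blocks(\%blocked_at, $now, 30 * 86400)) {
    # Printed rather than executed, so the sketch is safe to run anywhere.
    print "/sbin/iptables -D INPUT -s $ip -j DROP\n";
}
```

Reviewing the printed commands by hand before running them keeps a human in the loop, which is close to how I handle it now.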

Isn't it annoying that we even have to think about this kind of junk?




© Anthony Lawrence







Mon Jun 3 02:58:56 2013: 12101   georgelarson



Very wonderful post; thanks for sharing! Here's hoping that jerkface is done with being a jerkface.





Mon Jun 3 20:27:17 2013: 12104   TonyLawrence



Google investigated and took him down:

Hello,

Thanks for reaching out to us.

In accordance with the Digital Millennium Copyright Act, we have completed processing your infringement notice. The following URLs will be removed from Google’s search results in a few hours:

ht..tp://xyzx...xyz/client/knowledgebase.php?action=displayarticle&id=11

Please let us know if we can assist you further.

If you would like to file additional requests, we ask that you contact us by using the online forms at: www.google.com/support/go/legal as we do not accept add-on requests.

Regards,

The Google Team
