RE: [users@httpd] Block wget attempts from my site

> -----Original Message-----
> From: Norman Khine [mailto:norman@xxxxxxxxx] 
> Sent: Tuesday, October 03, 2006 3:17 PM
> To: users@xxxxxxxxxxxxxxxx
> Subject: [users@httpd] Block wget attempts from my site
> 
> Hello,
> 
> What is the best way to block someone from ripping/mirroring stuff
> from my site with wget? Is there an Apache way to do this? I have seen
> it done with .htaccess, but perhaps there is a way to do this from
> Apache.
> 
> mod-security, snort perhaps? How does this fit with VirtualHosts and
> can these be specific per host?
> 
> Any comments and advice much appreciated.

As Nick points out, it would be nice if people didn't need these things,
but sometimes you get some idiot who downloads a 10MB page of reference
data every minute in order to screen-scrape one number that he thinks
might change some time in the future. So you need to protect yourself...


Start with the User-agent header (see
http://httpd.apache.org/docs/2.2/mod/mod_setenvif.html#browsermatch
etc.)

e.g.,

BrowserMatchNoCase ^wget restrictRobot
Order Allow,Deny
Allow from all
Deny from env=restrictRobot

(You can do pretty much the same thing in mod_rewrite)
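
As a rough sketch of the mod_rewrite equivalent (untested here, and
assuming mod_rewrite is loaded), something like this should return
403 Forbidden to any client whose User-Agent starts with "wget",
case-insensitively:

RewriteEngine On
# [NC] makes the User-Agent match case-insensitive
RewriteCond %{HTTP_USER_AGENT} ^wget [NC]
# "-" leaves the URL untouched; [F] answers with 403 Forbidden
RewriteRule .* - [F]

The catch-all pattern .* is used so the rule behaves the same in a
<VirtualHost> or in .htaccess. To restrict it to a single host, put
it inside that host's <VirtualHost> block.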

Of course, this can be easily spoofed, so then you're into trapping
client IPs and blocking based on that. But then they're on dial-up or
ADSL and keep changing the IP, so you need to use heuristics...

A good trap is a hidden URL (nothing visible to click on, but the href
is in the HTML) that only a robot sees and hits. It calls a server-side
program that writes the client IP to a file. Then, for each request, you
check this file (RewriteCond and RewriteMap) and drop the request if it
comes from a bad IP (RewriteRule ^/(.*) - [F]).
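
The checking side of that trap might look roughly like the following
in the server or <VirtualHost> config (RewriteMap is not allowed in
.htaccess). The map name "badips" and the path
/var/www/trap/bad-ips.txt are made up for illustration. The trap
program itself is not shown and is assumed to append one line per
offender, in the form "IP -", to that file:

RewriteEngine On
# Plain-text map: each line is "<client-ip> -"
RewriteMap badips txt:/var/www/trap/bad-ips.txt
# Look up the client IP; unknown IPs yield the default value OK
RewriteCond ${badips:%{REMOTE_ADDR}|OK} !=OK
# Listed IPs get 403 Forbidden for every URL
RewriteRule ^/(.*) - [F]

Apache re-reads a txt: map when the file's modification time changes,
so newly trapped IPs should take effect without a restart.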

This can become quite a sport...

Rgds,
Owen Boyle
Disclaimer: Any disclaimer attached to this message may be ignored. 
> 
> Cheers
> 
> Norman
> 
> 
> 
> ---------------------------------------------------------------------
> The official User-To-User support forum of the Apache HTTP 
> Server Project.
> See <URL:http://httpd.apache.org/userslist.html> for more info.
> To unsubscribe, e-mail: users-unsubscribe@xxxxxxxxxxxxxxxx
>    "   from the digest: users-digest-unsubscribe@xxxxxxxxxxxxxxxx
> For additional commands, e-mail: users-help@xxxxxxxxxxxxxxxx
>
 
 
