> -----Original Message-----
> From: Norman Khine [mailto:norman@xxxxxxxxx]
> Sent: Tuesday, October 03, 2006 3:17 PM
> To: users@xxxxxxxxxxxxxxxx
> Subject: [users@httpd] Block wget attempts from my site
>
> Hello,
>
> What is the best way to block someone from ripping/mirroring
> stuff from my site
> with wget? Is there an Apache way to do this, have seen it done with
> .htaccess but perhaps there is a way to do this from Apache.
>
> mod-security, snort perhaps? How does this fit with
> VirtualHosts and can these be specific per host?
>
> Any comments and advice much appreciated.

As Nick points out, it would be nice if people didn't need these things, but sometimes you get some idiot who downloads a 10MB page of reference data every minute in order to screen-scrape one number that he thinks might change some time in the future. So you need to protect yourself...

Start with the User-Agent header (see http://httpd.apache.org/docs/2.2/mod/mod_setenvif.html#browsermatch etc.), e.g.,

    BrowserMatchNoCase ^wget restrictRobot
    Deny from env=restrictRobot

(You can do pretty much the same thing in mod_rewrite.)

Of course, this can be easily spoofed, so then you're into trapping client IPs and blocking based on that. But then they're on dial-up or ADSL and keep changing the IP, so you need to use heuristics...

A good trap is a hidden URL (nothing visible to click on, but the href is in the HTML) that only a robot sees and hits. It calls a server-side program that writes the client IP to a file. Then, for each request, you check this file (RewriteCond and RewriteMap) and drop the request if it comes from a bad IP (RewriteRule ^/(.*) - [F]).

This can become quite a sport...

Rgds,
Owen Boyle
Disclaimer: Any disclaimer attached to this message may be ignored.

>
> Cheers
>
> Norma
>
> ---------------------------------------------------------------------
> The official User-To-User support forum of the Apache HTTP
> Server Project.
> See <URL:http://httpd.apache.org/userslist.html> for more info.
> To unsubscribe, e-mail: users-unsubscribe@xxxxxxxxxxxxxxxx
>    "   from the digest: users-digest-unsubscribe@xxxxxxxxxxxxxxxx
> For additional commands, e-mail: users-help@xxxxxxxxxxxxxxxx
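
For anyone wanting to try the two techniques from the reply, here is a minimal sketch of how they might look together. It assumes Apache 2.2 syntax (Order/Allow/Deny) and sits in the server config or inside the <VirtualHost> you want to protect, since RewriteMap is not permitted in .htaccess. The map name "badips", the path /etc/apache2/bad-ips.txt and the value "blocked" are placeholders invented for illustration, not anything from the original message. The trap program that appends IPs to the file is not shown.

```apache
# --- 1. Block by User-Agent (trivially spoofed, but stops the lazy ones) ---
# Sets the env variable restrictRobot when the User-Agent starts with "wget",
# case-insensitively.
BrowserMatchNoCase ^wget restrictRobot

<Location />
    # With "Order Allow,Deny", a request matching both Allow and Deny
    # is denied, so flagged clients are refused and everyone else passes.
    Order Allow,Deny
    Allow from all
    Deny from env=restrictRobot
</Location>

# --- 2. Block IPs recorded by the hidden-URL trap ---
RewriteEngine On

# Hypothetical plain-text map, one "key value" pair per line, e.g.
#   192.0.2.10 blocked
# The trap program is assumed to append lines in this format.
RewriteMap badips txt:/etc/apache2/bad-ips.txt

# Look up the client IP; unknown addresses fall back to "ok".
RewriteCond ${badips:%{REMOTE_ADDR}|ok} =blocked
# Refuse the request with 403 Forbidden.
RewriteRule ^/(.*) - [F]
```

Because both parts can live inside a single <VirtualHost>, the blocking can be specific per host. A txt: map is re-read when the file's modification time changes, so newly trapped IPs should take effect without a restart. For a long list, a dbm: map would probably be a better fit.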