Re: Blocking crawling of CGIs

There's no guarantee that crawlers will be polite and honor robots.txt directives. The search-engine ones probably do, but the spammers' ones definitely don't, and in fact they probably pay special attention to whatever is excluded. (I have a honeypot entry in my robots.txt designed to catch and then block the malicious robots.)

On the other hand, the User-Agent string is only as reliable as the intent of whoever set the crawler up, so filtering on it may not be much help either.

I seem to recall reading somewhere that it's possible to configure Apache to recognize "executables" independently of the OS and of file extensions and associations. If that's true, it might lead to a solution to your problem.
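
For what it's worth, the honeypot idea can be sketched roughly as below. This is only an illustration, not my actual setup: the /trap/ path, the trapped_robot variable, and the log file name are all made up for the example. It assumes mod_setenvif and mod_log_config are loaded, and that robots.txt carries a matching "Disallow: /trap/" line that no polite crawler would ever follow.

```apache
# Hypothetical honeypot: /trap/ is listed as "Disallow: /trap/" in robots.txt
# and is linked from nowhere a human visitor would click.

# Tag any request that goes into the forbidden path anyway.
SetEnvIf Request_URI "^/trap/" trapped_robot

# Write only the tagged requests to a separate log for later review.
# (Assumes the "combined" LogFormat nickname from the stock httpd.conf.)
CustomLog logs/trap_log combined env=trapped_robot
```

The addresses that show up in that log are the ones ignoring robots.txt. Turning them into an actual block (a deny list, a firewall rule, and so on) is a separate step that this sketch leaves out.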

Mark

-------- Original Message  --------
Subject:  Blocking crawling of CGIs
From: Tony Rice (trice) <trice@xxxxxxxxx>
To: users@xxxxxxxxxxxxxxxx
Date: Tuesday, September 18, 2007 11:24:20 AM

We've had some instances where crawlers have stumbled onto a CGI script
that refers to itself and then started pounding the server with requests
to that CGI.

There are so many CGI scripts on this server that I don't want to
maintain a huge robots.txt file.  Any suggestions on other techniques to
keep crawlers away from cgi scripts?  Check the browser with
BrowserMatch and then do something creative with "deny from env="?
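
A minimal sketch of what that BrowserMatch idea might look like, assuming Apache 2.2-style access control (on 2.4 these directives need mod_access_compat). The user-agent pattern, the is_robot variable name, and the directory path are placeholders, and the approach only stops crawlers that identify themselves honestly:

```apache
# Set is_robot when the User-Agent looks like a crawler (illustrative pattern).
BrowserMatchNoCase "(bot|crawler|spider)" is_robot

# Placeholder path: substitute the directory that actually holds the CGI scripts.
<Directory "/usr/local/apache2/cgi-bin">
    Order Allow,Deny
    Allow from all
    # With "Order Allow,Deny", a matching Deny overrides the Allow above.
    Deny from env=is_robot
</Directory>
```

This covers every script under the directory with a single rule, so there is no per-script list to maintain.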



---------------------------------------------------------------------
The official User-To-User support forum of the Apache HTTP Server Project.
See <URL:http://httpd.apache.org/userslist.html> for more info.
To unsubscribe, e-mail: users-unsubscribe@xxxxxxxxxxxxxxxx
  "   from the digest: users-digest-unsubscribe@xxxxxxxxxxxxxxxx
For additional commands, e-mail: users-help@xxxxxxxxxxxxxxxx

