Re: Help recognizing bots?

On Wed, June 22, 2005 3:57 pm, Brian Dunning said:
> I'm using the following code in an effort to identify bots:
>
> $client = $_SERVER['HTTP_USER_AGENT'];
> if(!strpos($client, 'ooglebot') && !strpos($client, 'ahoo') &&
>    !strpos($client, 'lurp') && !strpos($client, 'msnbot'))
> {
>      (Stuff that I do if it's not a bot)
> }
>
> But it doesn't seem to be catching a lot of bot action. Anyone have a
> better list of user agents? (I left off the first letter of some to
> avoid case conflicts.)
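
One thing worth checking in the quoted test, separate from the list of names: strpos() returns the integer 0 when the match is at the very start of the string, and false only when there is no match at all. So !strpos() treats a user agent that begins with "msnbot" as not a bot. Below is a minimal sketch of a stricter check, assuming PHP 5 or later for stripos() (which is case-insensitive, so the full names can be spelled out). The function name is_known_bot and the signature list are only illustrative, not a complete list of crawlers:

```php
<?php
// Hypothetical helper: true if the user agent contains any known bot signature.
function is_known_bot($userAgent, array $signatures)
{
    foreach ($signatures as $signature) {
        // stripos() is case-insensitive; compare against false with !==
        // because a match at position 0 returns 0, not false.
        if (stripos($userAgent, $signature) !== false) {
            return true;
        }
    }
    return false;
}

// Illustrative list only, built from the names in the quoted code.
$signatures = array('googlebot', 'yahoo', 'slurp', 'msnbot');
$client = isset($_SERVER['HTTP_USER_AGENT']) ? $_SERVER['HTTP_USER_AGENT'] : '';

if (!is_known_bot($client, $signatures)) {
    // (Stuff that you do if it's not a bot)
}
```

With the strict comparison there is no need to drop the first letter of each name.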

Check your logfiles and/or web stats.

The most common bots should be pretty apparent.
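
If you don't have a stats package handy, a rough tally can be pulled straight from the access log. This is only a sketch: it assumes Apache's "combined" log format (where the user agent is the last double-quoted field on each line), the log path is hypothetical, and tally_user_agents is just a name made up for the example. It also reads the whole file into memory, so it is not meant for very large logs:

```php
<?php
// Count user agents in log lines, assuming Apache "combined" format,
// where the User-Agent is the last double-quoted field on each line.
function tally_user_agents(array $lines)
{
    $counts = array();
    foreach ($lines as $line) {
        if (preg_match('/"([^"]*)"\s*$/', $line, $m)) {
            $agent = $m[1];
            $counts[$agent] = isset($counts[$agent]) ? $counts[$agent] + 1 : 1;
        }
    }
    arsort($counts); // most frequent first
    return $counts;
}

// Hypothetical path; adjust to wherever your server writes its access log.
$logFile = '/var/log/apache2/access.log';
if (is_readable($logFile)) {
    $lines = file($logFile, FILE_IGNORE_NEW_LINES);
    // Print the 20 most frequent agents as "hits<TAB>agent".
    foreach (array_slice(tally_user_agents($lines), 0, 20, true) as $agent => $hits) {
        echo "$hits\t$agent\n";
    }
}
```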

Here's a hack that might be useful to you:

1. Change .htaccess thusly:
<Files robots.txt>
  ForceType application/x-httpd-php
</Files>

2. Edit robots.txt:
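<?php
  // robots.txt should be served as text/plain; PHP defaults to text/html.
  header('Content-Type: text/plain');
  // Some clients send no User-Agent header at all, so guard the lookup.
  $ua = isset($_SERVER['HTTP_USER_AGENT']) ? $_SERVER['HTTP_USER_AGENT'] : '(none)';
  error_log("robot_detected: $ua");
?>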

Since it's mostly the well-behaved robots that bother to read robots.txt,
that should quickly generate a list of legitimate bots visiting your site.

You could even insert it into a database with a unique key on the value,
ignoring the errors of duplicates, and then you'd have the data already
filtered down to uniques.  It'd probably be a bit slower than error_log,
I should think... Maybe.
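
A minimal sketch of the database idea, assuming PDO with the SQLite driver is available. The table name robot_agents, the column name agent, and both function names are made up for the example. Rather than catching the duplicate-key error, it uses SQLite's INSERT OR IGNORE so that duplicates are skipped silently:

```php
<?php
// Sketch using PDO with SQLite; the table and column names are hypothetical.
function open_agent_store($dsn)
{
    $db = new PDO($dsn);
    $db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
    // PRIMARY KEY gives the unique constraint on the user-agent value.
    $db->exec('CREATE TABLE IF NOT EXISTS robot_agents (agent TEXT PRIMARY KEY)');
    return $db;
}

function record_agent(PDO $db, $agent)
{
    // INSERT OR IGNORE is SQLite syntax: a duplicate key is skipped silently
    // (MySQL spells this INSERT IGNORE).
    $stmt = $db->prepare('INSERT OR IGNORE INTO robot_agents (agent) VALUES (?)');
    $stmt->execute(array($agent));
}

// In robots.txt this would be called with $_SERVER['HTTP_USER_AGENT'];
// an in-memory database is used here only to keep the example self-contained.
$db = open_agent_store('sqlite::memory:');
record_agent($db, 'Googlebot/2.1');
record_agent($db, 'Googlebot/2.1'); // duplicate, ignored
record_agent($db, 'msnbot/1.0');
echo $db->query('SELECT COUNT(*) FROM robot_agents')->fetchColumn(), "\n"; // prints 2
```

For real use the DSN would point at a file or a database server instead of sqlite::memory:, which forgets everything when the request ends.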

Course, it won't help at all with the idiot illegitimate bots...

And this could be a bit too much for a real busy site...

Though you'd hope that the good bots (which read robots.txt) aren't
pounding you THAT hard...

-- 
Like Music?
http://l-i-e.com/artists.htm


-- 
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php

