Re: User-agent/search engine spider class

Hello,

On 12/05/2003 03:25 AM, Derek Scruggs wrote:
> I'm looking for a class to help me log data about user agents and search
> engine spiders. I came across PHPClientSniffer
> (http://www.phpclasses.org/browse.html/package/81.html), which looks good
> for things like detecting Javascript support. I'll probably incorporate some
> of this, but my primary concern at the moment is detecting whether a user
> agent is probably a spider. 
> 
> There are dozens of known spiders. Ideally the class works against a CSV or,
> even better, an XML file of known spiders that I can update as new ones
> become known. In a perfect world, someone (me?) hosts this XML file on a
> public web server so users of this class can refresh it once a week or so.
> 
> On a related note, I'm looking for a class that parses the referer string
> for common search engines. For example, Google referer strings usually look
> something like this (in English):
> 
> http://www.google.com/search?hl=en&ie=UTF-8&oe=UTF-8&q=some+search+term
> 
> It would be awesome if there were a class that allowed me to do something
> like this:
> 
> $referer = new Referer($_SERVER['HTTP_REFERER']);
> 
> if ($referer->is_search_engine()) {
> 	// return array of search words
> 	$searchPhrase = $referer->get_searchTerms();
> 
> 	// return language of search engine
> 	$lang = $referer->get_language();
> }
> 
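
As a rough illustration of the referer side, here is a minimal sketch. It is
not the PHP Classes site code. The class and method names are taken from the
wish list above; the engine table, the assumption that the search terms live
in a single query parameter, and the use of Google's "hl" parameter for the
language are all assumptions made for the example.

```php
<?php
// Hypothetical sketch: the engine table below is illustrative, not complete.
class Referer
{
    // host fragment => query parameter that carries the search terms
    private static $engines = array(
        'google.'       => 'q',
        'search.yahoo.' => 'p',
    );

    private $host = '';
    private $query = array();

    public function __construct($url)
    {
        $parts = parse_url((string) $url);
        if (is_array($parts)) {
            $this->host = isset($parts['host']) ? strtolower($parts['host']) : '';
            if (isset($parts['query'])) {
                // parse_str() also URL-decodes values, so "+" becomes a space
                parse_str($parts['query'], $this->query);
            }
        }
    }

    // Name of the search-term parameter for a known engine, or null
    private function searchParam()
    {
        foreach (self::$engines as $fragment => $param) {
            if (strpos($this->host, $fragment) !== false
                && isset($this->query[$param])
                && is_string($this->query[$param])) {
                return $param;
            }
        }
        return null;
    }

    public function is_search_engine()
    {
        return $this->searchParam() !== null;
    }

    // Array of search words; empty if the referer is not a known engine
    public function get_searchTerms()
    {
        $param = $this->searchParam();
        if ($param === null) {
            return array();
        }
        return preg_split('/\s+/', trim($this->query[$param]), -1, PREG_SPLIT_NO_EMPTY);
    }

    // Interface language as sent by Google in "hl"; null if absent
    public function get_language()
    {
        return isset($this->query['hl']) ? $this->query['hl'] : null;
    }
}

// Usage with the sample Google referer quoted above
$referer = new Referer('http://www.google.com/search?hl=en&ie=UTF-8&oe=UTF-8&q=some+search+term');
if ($referer->is_search_engine()) {
    $searchPhrase = $referer->get_searchTerms();
    $lang = $referer->get_language();
}
```

A plain substring match on the host is crude (it would also match a host such
as "notgoogle.com"), so a real table would probably want stricter patterns.
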
> And on the spider side:
> 
> $spider = new Spider($_SERVER['HTTP_USER_AGENT']);
> if ($spider->is_known_spider()) {
> 	// return common name of spider (e.g. "Google" or "Yahoo")
> 	$name = $spider->get_commonName();
> 	// return whether this spider is known to be a spambot
> 	$evil = $spider->is_spambot();
> }
> 
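
The spider side could be sketched along the same lines, working against a
CSV list as suggested above. Again, this is only a sketch under stated
assumptions and not the PHP Classes site code: the line format "user agent
fragment,common name,spambot flag", the sample entries, and the second
constructor argument carrying the CSV text are all made up for illustration.
Fields are split on plain commas, so quoted CSV fields are not handled.

```php
<?php
// Hypothetical sketch: signature format and sample entries are assumptions.
class Spider
{
    // Fields of the first matching signature line, or null if none matched
    private $match = null;

    // $userAgent: the User-Agent header value
    // $csv: newline-separated lines of "fragment,common name,spambot flag"
    public function __construct($userAgent, $csv)
    {
        foreach (preg_split('/\R/', trim($csv)) as $line) {
            $fields = array_map('trim', explode(',', $line));
            if (count($fields) < 3 || $fields[0] === '') {
                continue; // skip blank or malformed lines
            }
            // case-insensitive substring match against the user agent
            if (stripos((string) $userAgent, $fields[0]) !== false) {
                $this->match = $fields;
                break;
            }
        }
    }

    public function is_known_spider()
    {
        return $this->match !== null;
    }

    // Common name of the spider, or null if it is not in the list
    public function get_commonName()
    {
        return $this->match !== null ? $this->match[1] : null;
    }

    // True only if the matching line is flagged with "1"
    public function is_spambot()
    {
        return $this->match !== null && $this->match[2] === '1';
    }
}

// Usage with an inline list; in practice the text could come from a file
// that is refreshed periodically.
$signatures = "Googlebot,Google,0\nSlurp,Yahoo,0\nEmailSiphon,EmailSiphon,1\n";
$spider = new Spider('Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)', $signatures);
if ($spider->is_known_spider()) {
    $name = $spider->get_commonName();
    $evil = $spider->is_spambot();
}
```
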
> I'd write this myself, but the need isn't critical at this point. A "nice to
> have" instead of a "need to have."

I have this partly implemented in a class that is part of the PHP Classes
site code itself. Once I have more time, I will isolate the relevant parts
to make them useful to others.


-- 

Regards,
Manuel Lemos

Free ready to use OOP components written in PHP
http://www.phpclasses.org/

