Re: User-agent/search engine spider class

Hi Derek,

have a look:

  http://fantomaster.com/fasvsspy01.html

Perhaps it fits your needs.

Greetinx,
  Mike

__________________________________________

 Suchtreffer AG
 Bleicherstr. 20
 D-78467 Konstanz
 Germany

 fon:       +49-(0)7133-205551
 fax:       +49-(0)7531-89207-13

 e-mail:   mru@suchtreffer.de
 internet: http://www.suchtreffer.de/
__________________________________________

----- Original Message ----- 
From: "Derek Scruggs" <derek@creative-mail.com>
To: <php-objects@yahoogroups.com>
Sent: Friday, December 05, 2003 6:25 AM
Subject:  User-agent/search engine spider class


> Hi All,
> 
> I'm looking for a class to help me log data about user agents and search
> engine spiders. I came across PHPClientSniffer
> (http://www.phpclasses.org/browse.html/package/81.html), which looks good
> for things like detecting Javascript support. I'll probably incorporate some
> of this, but my primary concern at the moment is detecting whether a user
> agent is probably a spider. 
> 
> There are dozens of known spiders. Ideally the class works against a CSV or,
> even better, an XML file of known spiders that I can update as new ones
> become known. In a perfect world, someone (me?) hosts this XML file on a
> public web server so users of this class can refresh it once a week or so.
> 
> On a related note, I'm looking for a class that parses the referer string
> for common search engines. For example, Google referer strings usually look
> something like this (in English):
> 
> http://www.google.com/search?hl=en&ie=UTF-8&oe=UTF-8&q=some+search+term
> 
> It would be awesome if there were a class that allowed me to do something
> like this:
> 
> $referer = new Referer($_SERVER['HTTP_REFERER']);
> 
> if ($referer->is_search_engine()) {
>     // return array of search words
>     $searchPhrase = $referer->get_searchTerms();
> 
>     // return language of search engine
>     $lang = $referer->get_language();
> }
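
A rough sketch of how such a Referer class might look, in case it helps. This is not an existing package: the class body, the engine table (which host fragments map to which query parameter) and the use of "hl" as the language parameter are assumptions for illustration, based only on the Google example URL quoted above.

```php
<?php
// Hypothetical sketch of the Referer class described above.
// The engine table is an illustrative assumption and far from complete.
class Referer
{
    // host fragment => query parameter that carries the search phrase
    private static $engines = array(
        'google.'       => 'q',
        'search.yahoo.' => 'p',
    );

    private $query = array();
    private $termParam = null;

    public function __construct($url)
    {
        $parts = parse_url((string) $url);
        if (!is_array($parts) || !isset($parts['host'])) {
            return; // empty or malformed referer: not a search engine
        }
        if (isset($parts['query'])) {
            parse_str($parts['query'], $this->query); // also decodes "+" and %xx
        }
        $host = strtolower($parts['host']);
        foreach (self::$engines as $fragment => $param) {
            if (strpos($host, $fragment) !== false
                && isset($this->query[$param])
                && is_string($this->query[$param])) {
                $this->termParam = $param;
                break;
            }
        }
    }

    public function is_search_engine()
    {
        return $this->termParam !== null;
    }

    // Returns the search words as an array (empty if not a search engine).
    public function get_searchTerms()
    {
        if ($this->termParam === null) {
            return array();
        }
        $phrase = trim($this->query[$this->termParam]);
        return preg_split('/\s+/', $phrase, -1, PREG_SPLIT_NO_EMPTY);
    }

    // Returns the interface language if the engine passes one, else null.
    public function get_language()
    {
        return isset($this->query['hl']) && is_string($this->query['hl'])
            ? $this->query['hl']
            : null;
    }
}

// Usage with the example URL from the message above:
$referer = new Referer('http://www.google.com/search?hl=en&ie=UTF-8&oe=UTF-8&q=some+search+term');
if ($referer->is_search_engine()) {
    $searchPhrase = $referer->get_searchTerms();
    $lang = $referer->get_language();
}
```

In a real page you would pass $_SERVER['HTTP_REFERER'] instead of the literal URL, after checking that it is set.
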
> 
> And on the spider side:
> 
> $spider = new Spider($_SERVER['HTTP_USER_AGENT']);
> if ($spider->is_known_spider()) {
>     // return common name of spider (e.g. "Google" or "Yahoo")
>     $name = $spider->get_commonName();
>     // return whether this spider is known to be a spambot
>     $evil = $spider->is_spambot();
> }
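
The spider side could be sketched along the same lines, driven by an updatable CSV list as suggested earlier in the message. Everything here is an assumption made for illustration: the CSV layout (signature, common name, spambot flag), the three sample rows, the load_csv() helper, and the second constructor argument, which the original wish list does not have.

```php
<?php
// Hypothetical sketch of the Spider class described above.
// CSV layout (signature,common name,spambot flag) is an assumed format.
class Spider
{
    private $match = null;

    public function __construct($userAgent, array $knownSpiders)
    {
        foreach ($knownSpiders as $row) {
            // case-insensitive substring match against the user-agent string
            if (stripos((string) $userAgent, $row['signature']) !== false) {
                $this->match = $row;
                break;
            }
        }
    }

    // Parses CSV text into the row format the constructor expects.
    public static function load_csv($csv)
    {
        $rows = array();
        foreach (preg_split('/\R/', trim($csv)) as $line) {
            $fields = str_getcsv($line, ',', '"', '\\');
            if (count($fields) < 3 || $fields[0] === '') {
                continue; // skip blank or incomplete lines
            }
            $rows[] = array(
                'signature' => $fields[0],
                'name'      => $fields[1],
                'spambot'   => $fields[2] === '1',
            );
        }
        return $rows;
    }

    public function is_known_spider()
    {
        return $this->match !== null;
    }

    public function get_commonName()
    {
        return $this->match !== null ? $this->match['name'] : null;
    }

    public function is_spambot()
    {
        return $this->match !== null && $this->match['spambot'];
    }
}

// Usage with sample data; a real list would be read from a file.
$known = Spider::load_csv("Googlebot,Google,0\nSlurp,Yahoo,0\nEmailSiphon,EmailSiphon,1");
$spider = new Spider('Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)', $known);
if ($spider->is_known_spider()) {
    $name = $spider->get_commonName();
    $evil = $spider->is_spambot();
}
```

Keeping the list outside the class means it can be refreshed from a shared file without touching the code.
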
> 
> I'd write this myself, but the need isn't critical at this point. It's a
> "nice to have" rather than a "need to have."
> 
> Thanks!
> 
> -Derek
> 
> 
> 
> Look here for Free PHP Classes of objects:
> http://phpclasses.UpperDesign.com/
> To unsubscribe from this group, send an email to:
> php-objects-unsubscribe@egroups.com
> 
>  
> 
> Your use of Yahoo! Groups is subject to http://docs.yahoo.com/info/terms/ 
> 
> 

