Re: favicon.ico and robots.txt

On Fri, 28 Aug 2009 08:54:02 -0700
Taproot <webmaster@xxxxxxxxxxxxxxxxxx> wrote:

> Robots.txt is a file that allows or denies robots from indexing or 
> crawling the site if they behave as they should.

That's a common misconception. Robots.txt does NOT allow or deny anything.
It only SUGGESTS what robots should or should not crawl. It's up to
the crawler to respect the robots.txt file.

The big ones like Google, Yahoo, and Microsoft do follow the instructions
in the robots.txt file, but many others, especially the ones harvesting
emails, photos, etc., ignore them.
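
To make the point concrete, here is a rough sketch of what a well-behaved
crawler does, using Python's standard urllib.robotparser module. The
robots.txt content, the "ExampleBot" name, and the example.com URLs are
made up for illustration. Note that the check runs entirely on the
crawler's side: nothing on the server enforces it, and a harvester simply
skips this step.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, for illustration only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def polite_may_fetch(user_agent: str, url: str) -> bool:
    """Return True if the robots.txt rules permit this agent to fetch url."""
    parser = RobotFileParser()
    # Parse the rules from memory instead of fetching them over HTTP.
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


# A polite crawler asks before each request and honors the answer.
allowed = polite_may_fetch("ExampleBot", "http://www.example.com/index.html")
blocked = polite_may_fetch("ExampleBot", "http://www.example.com/private/list.html")

# A badly behaved crawler never calls polite_may_fetch() at all; it just
# requests /private/list.html, and the web server will serve it unless
# access is restricted by some other means.
```

If you really need to keep robots out of a directory, restrict it on the
server itself (authentication or access rules in the web server config),
not in robots.txt.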


-- 
Thanks
http://www.911networks.com
When the network has to work
_______________________________________________
CentOS mailing list
CentOS@xxxxxxxxxx
http://lists.centos.org/mailman/listinfo/centos
