Web bots can ignore the robots.txt file, and most scrapers would.

On Mar 13, 2013 4:59 PM, "Jen Rasmussen" <jen@xxxxxxxxxxxxxxxx> wrote:

> -----Original Message-----
> From: Dale H. Cook [mailto:radiotest@xxxxxxxxxxxxxxxxxx]
> Sent: Wednesday, March 13, 2013 3:38 PM
> To: php-general@xxxxxxxxxxxxx
> Subject: Accessing Files Outside the Web Root
>
> Let me preface my question by noting that I am virtually a PHP novice.
> Although I am a long-time webmaster and have used PHP for some years to
> give visitors access to information in my SQL database, this is my first
> attempt to use it for another purpose. I have browsed the mailing list
> archives and searched online but have not yet succeeded in teaching
> myself how to do what I want to do. This need not provoke a lengthy
> discussion or involve extensive hand-holding - if someone can point to
> an appropriate code sample or online tutorial, that might do the trick.
>
> I am the author of a number of PDF files that serve as genealogical
> reference works. My problem is that a number of sites posing as search
> engines display my PDF files in their entirety on their own sites.
> These pirate sites are not simply opening a window that displays my
> files as they appear on my site. They are using Google Docs to display
> copies of my files that are cached or stored elsewhere online. The
> proof is that I can modify one of my files and upload it to my site.
> The file, as seen on my site, immediately displays the modification.
> The same file, as displayed on the pirate sites, is unmodified and may
> remain unmodified for weeks.
>
> It is obvious that my files, which are stored under public_html, are
> being spidered and then stored or cached. This displeases me greatly.
> I want my files, some of which have cost an enormous amount of work
> over many years, to be available only on my site. Legitimate search
> engines, such as Google, may display a snippet, but they do not display
> the entire file - they link to my site so the visitor can get the file
> from me.
>
> A little study has indicated to me that if I store those files in a
> folder outside the web root and use PHP to provide access, they will
> not be spidered. Writing a PHP script to provide access to the files in
> that folder is what I need help with. I have experimented with a number
> of code samples but have not been able to make things work. Could any
> of you point to code samples or tutorials that might help me? Remember
> that, aside from the code I have written to handle my SQL database, I
> am a PHP novice.
>
> Dale H. Cook, Member, NEHGS and MA Society of Mayflower Descendants;
> Plymouth Co. MA Coordinator for the USGenWeb Project; Administrator of
> http://plymouthcolony.net
>
> Have you tried keeping all of your documents in one directory and
> blocking that directory via a robots.txt file?
>
> Jen
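For reference, Jen's suggestion amounts to a robots.txt entry along these
lines, assuming the PDFs currently sit in a directory named /pdfs/ under
the web root (the directory name is only an example). As the reply at the
top of the thread points out, this is advisory: it only deters crawlers
that choose to honor robots.txt.

User-agent: *
Disallow: /pdfs/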
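Since the thread never got to actual code, here is a minimal sketch of the
kind of pass-through script Dale is asking about. Everything in it is an
assumption to be adapted: the directory /home/username/private_docs, the
script name getfile.php, and the file-name pattern are hypothetical, and
the sketch only handles PDFs.

<?php
// getfile.php - minimal pass-through script for files stored outside
// the web root. The path below is a hypothetical example; point it at
// wherever the PDFs actually live on your server.
$docDir = '/home/username/private_docs';

// Take only the bare file name; basename() strips any directory parts,
// so a request like ?file=../../etc/passwd cannot escape $docDir.
$name = isset($_GET['file']) ? basename($_GET['file']) : '';

// Allow only simple names ending in .pdf.
if (!preg_match('/^[A-Za-z0-9._-]+\.pdf$/', $name)) {
    header('HTTP/1.1 400 Bad Request');
    exit('Invalid file name.');
}

$path = $docDir . '/' . $name;

if (!is_file($path)) {
    header('HTTP/1.1 404 Not Found');
    exit('File not found.');
}

// Send the PDF to the visitor. "inline" displays it in the browser;
// change it to "attachment" to force a download instead.
header('Content-Type: application/pdf');
header('Content-Disposition: inline; filename="' . $name . '"');
header('Content-Length: ' . filesize($path));
header('X-Robots-Tag: noindex, nofollow'); // hint for well-behaved crawlers
readfile($path);

Pages on the site would then link to, say, getfile.php?file=some-work.pdf
(an example name) instead of linking to the PDF directly. Because the
files no longer live under public_html, there is no URL a spider can
fetch them from directly; anything it grabs has to come through the
script, which can be further restricted (referer checks, sessions, rate
limits) if the scraping continues.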