Re: Denying write access to the cache

Amos Jeffries <squid3@xxxxxxxxxxxxx> · Sat, 24 Mar 2007 12:23:30 +1200

Guillaume Smet wrote:
On 3/23/07, Amos Jeffries <squid3@xxxxxxxxxxxxx> wrote:
Looks like a case for something like this that prevents the group
'robots' from retrieving data not already in the cache:

acl robots <....>
miss_access deny robots

No, that's not what I want. It's not a problem for us that robots
index all the content of our website. I just want them to not put
garbage into our cache.
So they should be able to access every page of the site, using cache
or not, but they shouldn't be able to put the generated pages in the
cache so that they don't pollute the cache.

Still, I would pose you a question:
   if people find and visit your pages by going to a search engine, how
can they find useful pages that nobody else has visited recently?

I agree. That's why it's not what I'm asking for :).

Thanks for your help.

--
Guillaume

Ah, now I understand.

This is a problem for your web server configuration, then. Your cache, and
others around the world, can be expected to cache any content that they
are allowed to.
The best way to prevent this content from being cached is for the originating
web server to mark it as non-cacheable using "Pragma: no-cache" and
"Cache-Control: no-cache".

You can find info on them here:
http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.9
http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.32
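
As a rough sketch of how the origin server might apply those headers only
to robot requests (Python purely for illustration; ROBOT_MARKERS, is_robot
and cache_headers_for are names made up for this example, and matching on
the User-Agent string is an assumption about how you identify robots):

```python
# Hypothetical substrings for recognising crawler User-Agent strings.
# This list is an assumption for illustration; adjust it to your own traffic.
ROBOT_MARKERS = ("googlebot", "slurp", "msnbot")


def is_robot(user_agent):
    """Return True if the User-Agent contains one of the listed markers."""
    ua = (user_agent or "").lower()
    return any(marker in ua for marker in ROBOT_MARKERS)


def cache_headers_for(user_agent):
    """Return the extra response headers for a generated page.

    Robot requests get the two headers named above, so a cache in front
    of the server is told not to reuse the generated page. Other requests
    get no extra headers and stay cacheable as before.
    """
    if is_robot(user_agent):
        return {"Pragma": "no-cache", "Cache-Control": "no-cache"}
    return {}
```

The web application would merge the returned dictionary into its response
headers. One caveat from section 14.9 of the RFC: on a response, "no-cache"
lets a cache store the page but forces revalidation before reuse, whereas
"no-store" is the directive that forbids storing it at all, so that may be
worth considering too.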

Amos