
Re: Save squid cache?

On Sun, 14 Nov 2010 23:34:22 +0100, Luigi Monaco <monaco@xxxxxxxxxx> wrote:
> Hi to all,
> 
> I have a question concerning the usage of squid to dump the content of
> the cache, keep a copy of the cache, or block this site (and its bound
> content) from being deleted from squid's cache.

"bound content"?

> 
> There are some sites on the net that I would like to be sure I can
> still visit in the future, even if the site goes offline or gets
> deleted/modified. wget is not really useful for this since it does not
> interpret js and may produce a different result than surfing with the
> browser - robots.txt and similar nuisances.
> 
> So, I would like to have a secured copy of the websites I surfed.
> squid does this, but how do I secure the cached content? Am I missing
> something in the manuals?

Squid is a proxy. Its purpose is to supply a good, up-to-date working copy
of each and every page. Objects are already stored for as long as possible
to achieve this. The refresh_pattern directive can be set to extend storage
times for some things. However, there are groups of objects which are not
storable, and objects which change frequently. Playing around with these
will more often than not break the pages.
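
For illustration, a refresh_pattern tweak in squid.conf could look roughly
like the sketch below. The URL regex and timings here are made up, and the
override/ignore options deliberately violate HTTP caching rules, which is
exactly what tends to break dynamic pages:

  # Keep objects from one site longer: min 1 week (10080 min),
  # max 30 days (43200 min); force caching past server expiry hints.
  refresh_pattern -i ^http://example\.com/ 10080 100% 43200 override-expire override-lastmod ignore-reload ignore-no-store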

To do this type of offline browsing properly you need to use your web
browser's "save for offline" settings (or whatever they call it).

Alternatively, you could try viewing these sites in the archive.org
Wayback Machine, available online.

Amos

