
Re: Caching identical items from a dynamic URL

On 11/12/10 10:59, Volker-Yoblick, Adam wrote:
Greetings,

I've got a fairly unique problem that maybe someone can assist with.

I'm sending files to a machine through my cache, but part of the URL
is dynamic, even if the file is exactly the same. For example, the
lines in my access.log all look like this:

GET http://1.2.3.4/foo/<GUID>/bar/abc.txt

Where GUID is different for every single deploy, even if the file is
exactly the same. This is done by creating a virtual directory that
points to a fixed location, but the name of the virtual directory is
a GUID, and changes on every run. This system is already in place,
and cannot be changed.

I have found that the files are NEVER served from the cache when the
GUID is different, even if the file MD5 is exactly the same. Every
single fill is a cache miss, every time. (I've verified that I DO get
cache hits across multiple deploys when the GUID is the same)

I imagine this is because squid is using the full URL to determine
whether or not the file is cached, either by including it in the MD5
hash, or using it as the lookup, or something similar.

It is. That is how HTTP works.

You can work around such broken server software internally with storeurl_rewrite, but this does nothing to reduce the external bandwidth costs added unnecessarily by your nasty backend.
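
As a rough illustration of the storeurl_rewrite approach, here is a minimal sketch of a helper script, not a tested production helper. It assumes the Squid 2.7 storeurl_rewrite_program interface hands the helper one request per line with the URL as the first whitespace-separated field, and reads the store URL back as a single line. The function name canonical_store_url and the fixed "GUID" token are hypothetical choices made for this example.

```python
#!/usr/bin/env python3
"""Sketch of a store-URL rewrite helper that ignores the GUID path segment."""
import re
import sys

# Matches http://<host>/foo/<one path segment>/bar/<rest>.
# The single segment between /foo/ and /bar/ is assumed to be the GUID.
GUID_PATH = re.compile(r"^(http://[^/]+/foo/)[^/]+(/bar/.+)$")


def canonical_store_url(url):
    """Replace the GUID segment with a fixed token so all deploys share one cache key.

    URLs that do not match the pattern are returned unchanged.
    """
    match = GUID_PATH.match(url)
    if match is None:
        return url
    return match.group(1) + "GUID" + match.group(2)


def main():
    # Assumption: one request per line, URL in the first field.
    for line in sys.stdin:
        fields = line.split()
        url = fields[0] if fields else ""
        sys.stdout.write(canonical_store_url(url) + "\n")
        # Flush after every reply so Squid is not left waiting on a buffer.
        sys.stdout.flush()


if __name__ == "__main__":
    main()
```

With a helper like this, http://1.2.3.4/foo/<GUID>/bar/abc.txt would be stored under the same key regardless of the GUID, while the request sent to the origin server keeps its original URL.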

If the client software is capable of handling 30x redirects, I recommend redirecting all of those GUID paths back to the actual data URI:

  acl guidBounce urlpath_regex ^/foo/[^/]+/bar/abc\.txt$
  deny_info http://1.2.3.4/foo/bar/abc.txt guidBounce
  http_access deny guidBounce

Amos
--
Please be using
  Current Stable Squid 2.7.STABLE9 or 3.1.9
  Beta testers wanted for 3.2.0.3

