RE: Caching identical items from a dynamic URL


On Mon, 13 Dec 2010 11:17:16 -0800, "Volker-Yoblick, Adam" wrote:
> I'm having some issues moving down to squid 2.7 (which are detailed in
> another thread). What are the chances of the storeurl stuff making it into
> squid 3.x?

Very good. (Just don't ask which 3.x series).
Someone is working on it now, so if that continues, expect it sometime next year.

Amos

> 
> -----Original Message-----
> From: Volker-Yoblick, Adam
> 
> Ahhh nevermind, just saw that it's only available in squid 2.7. =(
> 
> -----Original Message-----
> From: Volker-Yoblick, Adam
> 
> Also,
> 
> I'm trying to use storeurl_rewrite_program to parse values from the URL on
> the fly, but when I run squid -k parse on my config file, I get this:
> 
> 2010/12/13 10:44:44| cache_cf.cc(363) parseOneConfigFile: squid.conf:53 unrecognized: 'storeurl_rewrite_program'
> 2010/12/13 10:44:44| cache_cf.cc(363) parseOneConfigFile: squid.conf:65 unrecognized: 'storeurl_access'
> 2010/12/13 10:44:44| cache_cf.cc(363) parseOneConfigFile: squid.conf:66 unrecognized: 'storeurl_access'
> 
> 
> Here's the relevant part of my config:
> 
> storeurl_rewrite_program /opt/ActivePython-2.6/bin/python /usr/local/squid/etc/GetPathFromUrl.py
> acl store_rewrite_list url_regex ^http://(.*?)/Foo/(.*?)
> storeurl_access allow store_rewrite_list
> storeurl_access deny all
> 
> 
> Do I need to do something special to be able to use these options in the
> config file?
> 
> -----Original Message-----
> From: Volker-Yoblick, Adam
> 
> Thanks for the info.
> 
> One more question about this:
> 
> If I use store_rewrite to trim the GUID from the path and only use the
> relative path to the file (data/foo.txt, for example), and I do another
> deployment of the same file path (but a different MD5), will squid actually
> store two copies of this file, or will it dirty the cache copy right away
> because it uses the URL in the lookup?
> 
> 
> -----Original Message-----
> From: Amos Jeffries
> 
> On 11/12/10 10:59, Volker-Yoblick, Adam wrote:
>> Greetings,
>>
>> I've got a fairly unique problem that maybe someone can assist with.
>>
>> I'm sending files to a machine through my cache, but part of the URL 
>> is dynamic, even if the file is exactly the same. For example, the 
>> lines in my access.log all look like this:
>>
>> GET http://1.2.3.4/foo/<GUID>/bar/abc.txt
>>
>> Where GUID is different for every single deploy, even if the file is 
>> exactly the same. This is done by creating a virtual directory that 
>> points to a fixed location, but the name of the virtual directory is a 
>> GUID, and changes on every run. This system is already in place, and 
>> cannot be changed.
>>
>> I have found that the files are NEVER served from the cache when the 
>> GUID is different, even if the file MD5 is exactly the same. Every 
>> single fill is a cache miss, every time. (I've verified that I DO get 
>> cache hits across multiple deploys when the GUID is the same)
>>
>> I imagine this is because squid is using the full URL to determine 
>> whether or not the file is cached, either by including it in the MD5 
>> hash, or using it as the lookup, or something similar.
> 
> It is. That is how HTTP works.
> 
> You can work around such broken server software internally with
> storeurl_rewrite, but this does nothing to reduce the external bandwidth
> costs added unnecessarily by your nasty backend.
> 
> If the client software is capable of handling 30x redirects I recommend
> performing one from all those GUID paths back to the actual data URI:
> 
>    acl guidBounce urlpath_regex ^/foo/[^/]+/bar/abc\.txt$
>    deny_info http://1.2.3.4/foo/bar/abc.txt guidBounce
>    http_access deny guidBounce
> 
> Amos
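
For anyone staying on squid 2.7, the storeurl_rewrite workaround mentioned above could look roughly like the helper below. This is only a sketch under assumptions, not the GetPathFromUrl.py script from this thread: it assumes squid sends the helper one request per line with the URL as the first whitespace-separated field, and that the helper answers each request with one line, where an empty line means "leave the store URL unchanged". The /foo/<GUID>/bar/ layout is taken from the example URL in the original post; the pattern and the function name store_url are hypothetical.

```python
#!/usr/bin/env python
"""Sketch of a storeurl_rewrite_program helper (squid 2.7 only)."""
import re
import sys

# Assumed layout, taken from the example in this thread:
#   http://<host>/foo/<GUID>/bar/<file>
# The GUID is treated as "any single path segment between /foo/ and /bar/".
GUID_SEGMENT = re.compile(r"^(http://[^/]+/foo)/[^/]+(/bar/.*)$")


def store_url(url):
    """Return the URL without the GUID segment, or "" if it does not match."""
    match = GUID_SEGMENT.match(url)
    if match is None:
        return ""  # empty reply: assumed to mean "no change"
    return match.group(1) + match.group(2)


def main():
    # Use readline() rather than iterating over stdin so each request is
    # answered as soon as it arrives, without read-ahead buffering.
    while True:
        line = sys.stdin.readline()
        if not line:
            break  # squid closed the pipe
        fields = line.split()
        url = fields[0] if fields else ""
        sys.stdout.write(store_url(url) + "\n")
        sys.stdout.flush()  # squid waits for one reply per request


if __name__ == "__main__":
    main()
```

With this in place, two deploys that differ only in the GUID segment would map to the same store URL, so the second one can be served from the cache. As noted above, this only changes the key squid uses internally; clients still request the GUID URLs.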
