On 01/04/11 12:09, Ed W wrote:
Hi
My thought was to investigate having the internet side proxy add etag
headers to all content based on some quality hash function. Then have
the (expensive) remote side proxy rewrite the request headers to always
use If-None-Match? The idea is that the bandwidth is cheap on internet
connected side, so it can refresh its cache of the whole page, generate
a new hash, but still return a "not modified" response if the end result
is the same string of bytes. How much of that can I implement in Squid
3.x today..?
3.1.10+ will validate If-None-Match and ETag, but will not add them to
requests itself.
Thanks - can you expand on what it means to "validate" in this case?
I think you mean that if the content is cached with a given ETag then
requests for that content will be returned from cache if the request has
an appropriate If-None-Match - is this the case?
I mean Squid will produce 304 or 412 replies based on those headers,
using the HTTP/1.1 If-Modified-Since and If-None-Match checking
algorithms to reduce bandwidth.
So you will not have to alter the receiving Squid to meet your needs.
Only the sending one. With modern browsers adding those headers on their
own you may not even have to add them.
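For illustration, the If-None-Match check described above can be sketched
roughly as follows. This is a minimal Python sketch of the HTTP/1.1 rule,
not Squid's actual code; the function names are hypothetical, and it
assumes the weak comparison normally used for If-None-Match.

```python
import re

# Matches one entity tag ("abc" or W/"abc") or the wildcard "*".
_ETAG_RE = re.compile(r'(?:W/)?"[^"]*"|\*')


def weak_match(a, b):
    """Weak comparison: ignore any W/ prefix and compare the quoted tags."""
    def strip(tag):
        return tag[2:] if tag.startswith("W/") else tag
    return strip(a) == strip(b)


def check_if_none_match(method, if_none_match, current_etag):
    """Return 304, 412, or None (None means: send the full response).

    method        -- request method, e.g. "GET"
    if_none_match -- raw If-None-Match header value, or None if absent
    current_etag  -- ETag of the object we hold, or None if it has none
    """
    if if_none_match is None or current_etag is None:
        return None
    candidates = _ETAG_RE.findall(if_none_match)
    matched = "*" in candidates or any(
        weak_match(c, current_etag) for c in candidates
    )
    if not matched:
        return None
    # GET and HEAD get "not modified"; other methods get "precondition failed".
    return 304 if method in ("GET", "HEAD") else 412
```

A GET carrying a matching tag yields 304, so only headers cross the link;
a non-matching tag yields None and the full object is sent as usual.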
Note, I realise this could lead to some side effects where the action of
visiting the web page itself causes some other side effect, however, I
think this is a manageable problem for this requirement?
Thanks for any pointers to ideas or other products that might help?
ICAP or eCAP would be the way to go here for quick results. Making a
plugin to do the ETag generation and alterations before sending off.
Understood.
So the remote (client) side proxy would need an eCAP plugin that would
modify the initial request to include an ETag. This would require some
ability to interrogate what we have in cache and generate/request the
ETag associated with what we have already - do you have a pointer to any
API/code that I would need to look at to do this?
I'm unsure sorry. Alex at The Measurement Factory has better info on
specific details of what the eCAP API can do.
Then on the internet side proxy we would do whatever we need to retrieve
the content, say fetch the asset. Then our eCAP on that side would
generate a consistent ETag using our favourite hash function?
Yes. I'd start with that side of the link and see if the modern browsers'
ETag support plays well and eliminates the need for the client-side eCAP
trouble.
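As a rough idea of the Internet-side ETag generation discussed above,
here is a minimal Python sketch. It is not eCAP code (a real adapter
would be written against the eCAP C++ API); the name make_etag and the
choice of SHA-256 are assumptions made only for this example.

```python
import hashlib


def make_etag(body: bytes) -> str:
    """Build a strong ETag from the exact bytes of the response body.

    Identical bytes always give the same tag, so a page that was
    re-fetched but did not change keeps its ETag.
    """
    digest = hashlib.sha256(body).hexdigest()
    # Entity tags are quoted strings in HTTP.
    return '"' + digest + '"'
```

The hash should be taken over the bytes actually sent to the client. If
the body is recompressed or otherwise rewritten on the way, it needs to
be hashed after that step, since a strong ETag identifies one specific
byte sequence.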
The part I'm unsure how to implement would be examining what's in
squid's cache in order to generate an ETag based on what we have got (ie
for remote side)?
Me too.
You could also look at cutting bodies off 304 replies at the Internet
side to avoid the bandwidth expensive TCP_REFRESH_UNMODIFIED responses.
Hmm, yes that would be very sensible. Apart from via eCAP are there
other ways I might do that?
Not currently.
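To make the "cut the body off" idea concrete, here is a small Python
sketch of what such an adapter would do to a reply before it crosses the
expensive link. The function names and the plain-dict header model are
hypothetical, header names are assumed to be spelled canonically, and the
tag comparison is a simple exact string match rather than the full
If-None-Match algorithm.

```python
# Headers a 304 reply should still carry under the HTTP/1.1 rules.
KEEP_ON_304 = {
    "date", "etag", "cache-control", "expires", "vary", "content-location",
}


def to_not_modified(headers: dict) -> tuple:
    """Turn a full reply into a body-less 304, keeping only cache metadata."""
    kept = {k: v for k, v in headers.items() if k.lower() in KEEP_ON_304}
    return 304, kept, b""


def reply_for(request_headers: dict, status: int, headers: dict, body: bytes):
    """Return (status, headers, body) to send over the expensive link."""
    client_tag = request_headers.get("If-None-Match")
    if (
        status == 200
        and client_tag is not None
        and client_tag == headers.get("ETag")
    ):
        # The client already holds these exact bytes: send headers only.
        return to_not_modified(headers)
    return status, headers, body
```

With a matching tag the reply shrinks to a status line plus a few
headers; anything else passes through untouched.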
NP: if you want to go ahead and alter Squid code, adding If-None-Match
on outbound requests is an open bug. As is proper ETag variant caching
support.
I don't know if I have the time/ability to hack on Squid code. Is there
someone who might be interested in working on this for an affordable fee?
IIRC we have Dimitry with The Measurement Factory assisting with HTTP
compliance fixes. I'm sure sponsorship towards a specific fix will be
welcomed.
Thanks for the very helpful feedback. Note if there are any existing
eCAP/ICAP modules I should look at then please educate me? (I'm
currently using "Ziproxy" and looking at moving the interesting bits to
a Squid eCAP module. I have also used "Rabbit" proxy which is somewhat
similar)
The one public eCAP adapter we have been notified about happens to be for
doing gzip. http://code.google.com/p/squid-ecap-gzip/
Amos
--
Please be using
Current Stable Squid 2.7.STABLE9 or 3.1.11
Beta testers wanted for 3.2.0.5