
Re: How to filter response in squid-3.1.x?

On 20/10/11 20:11, Kaiwang Chen wrote:
2011/10/20 Amos Jeffries<squid3@xxxxxxxxxxxxx>:
On Thu, 20 Oct 2011 00:39:32 +0800, Kaiwang Chen wrote:

2011/10/19 Amos Jeffries:

On Wed, 19 Oct 2011 05:15:22 +0800, Kaiwang Chen wrote:

<snip>

To only change the HTTP headers, there are some tricks you can do with the
"must-revalidate" and/or "proxy-revalidate" cache controls. These controls
cause the surrogate to contact the origin web server on every request. The
origin can send back new headers on a 304 Not Modified response, meaning the
headers get changed per-response but the cached body gets sent only when it
has actually changed. That retains most of the bandwidth and performance
benefits of caching.
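
As an illustration of the 304 trick described above, here is a minimal Python sketch of the origin side. It is not taken from this thread: the function name origin_reply, its parameters, and the X-Request-Tag header are hypothetical, and a real origin would do this inside its web framework.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def origin_reply(request_headers, last_modified, body, per_request_headers):
    """Return (status, headers, body) for a possibly conditional GET.

    Hypothetical helper: last_modified must be a timezone-aware UTC datetime.
    """
    headers = {
        # Ask caches to check back with the origin before reusing the entry.
        "Cache-Control": "max-age=0, must-revalidate",
        "Last-Modified": format_datetime(last_modified, usegmt=True),
    }
    # Headers computed per request go out on both 200 and 304 replies.
    headers.update(per_request_headers)

    ims = request_headers.get("If-Modified-Since")
    if ims is not None:
        try:
            ims_dt = parsedate_to_datetime(ims)
        except (TypeError, ValueError):
            ims_dt = None  # unparseable date: fall through to a full reply
        if ims_dt is not None:
            if ims_dt.tzinfo is None:
                ims_dt = ims_dt.replace(tzinfo=timezone.utc)
            if last_modified <= ims_dt:
                # Body unchanged: send only the (new) headers.
                return 304, headers, b""
    return 200, headers, body
```

Called with an If-Modified-Since equal to the stored Last-Modified, this returns a 304 with an empty body but still carries the per-request headers, which is the effect the paragraph above relies on.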

So, the possible solution could be injecting a "Cache-Control:
must-revalidate" header by some eCap reqmod_precache service, then
Squid will revalidate the response on every request carrying new
request headers, then the origin server has its chance to set new
response headers? A little counter-intuitive workaround for class 4
adaptation. Not perfect, since revalidation occurs only when the
response is stale,

That would be 'normal' revalidation operation. Which is why the control
exists and is called must-revalidate. To override the normal operation and
force revalidation on every request.

You could set it in a filter module altering the headers, and repeat the
setup on every proxy surrogate as you expand the CDN. It is far easier to
send it from the origin, which is designed to set these controls very
efficiently and scales perfectly.


So which header forces revalidation on every request? Is it the cache
response directive "Cache-Control: max-age=0, must-revalidate"?

Referring to this section of rfc2616,
http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.9,
'Cache-Control: must-revalidate' is a cache-response-directive, and as
cited:

     When the must-revalidate directive is present in a response
received by a cache, that cache MUST NOT use the entry after it
becomes stale to respond to a subsequent request without first
revalidating it with the origin server. ... In all circumstances an
HTTP/1.1 cache MUST obey the must-revalidate directive; in particular,
if the cache cannot reach the origin server for any reason, it MUST
generate a 504 (Gateway Timeout) response.

Well, I have some trouble understanding the following transaction,
where the cached response was stale from the client's perspective.
Squid really did revalidate and got a 304 (success, isn't it?);
however, the client still got a "Revalidation failed" warning. Squid
was configured with

refresh_pattern .               0       20%     4320

which estimated a relatively long freshness period. And when the origin
server specifies "Cache-Control: max-age=0, must-revalidate", Squid
revalidates on each request and warns the client with "Revalidation
failed".
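
For readers unfamiliar with how a line like "refresh_pattern . 0 20% 4320" estimates a freshness period, the following Python sketch shows a simplified reading of the heuristic: take a percentage of the object's age at fetch time (Date minus Last-Modified) and clamp it between the minimum and maximum, both given in minutes. This is an approximation for illustration only, not Squid's code, and the function name heuristic_lifetime is made up.

```python
from datetime import datetime, timedelta, timezone


def heuristic_lifetime(date, last_modified, min_minutes=0, percent=20, max_minutes=4320):
    """Estimate a freshness lifetime in seconds (simplified, hypothetical)."""
    # How old the resource already was when the response was generated.
    resource_age = max((date - last_modified).total_seconds(), 0)
    # The percentage of that age is treated as the expected freshness period.
    estimate = resource_age * percent / 100
    # Clamp between the configured minimum and maximum (minutes -> seconds).
    return min(max(estimate, min_minutes * 60), max_minutes * 60)
```

With these defaults, an object last modified one hour before it was fetched would be treated as fresh for 720 seconds, and an object modified long ago is capped at 4320 minutes (three days). The estimate only matters when the response carries no explicit expiry information.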

//============= client ->  surrogate
GET /cgi-bin/index.php HTTP/1.0
User-Agent: Wget/1.10.2 (Red Hat modified)
Accept: */*
Host: my.example.com
Connection: Keep-Alive
Cache-Control: max-age=10

//============= surrogate ->  origin server
GET /cgi-bin/index.php HTTP/1.1
If-Modified-Since: Wed, 19 Oct 2011 11:00:00 GMT
User-Agent: Wget/1.10.2 (Red Hat modified)
Accept: */*
Host: my.example.com
Via: 1.0 s0.example.com (squid/3.1.16)
X-Forwarded-For: x.x.x.x
Cache-Control: max-age=10
Connection: keep-alive

//============= origin server ->  surrogate
HTTP/1.1 304 Not Modified
Date: Thu, 20 Oct 2011 05:26:28 GMT
Server: Apache/2.2.3 (CentOS)
Connection: close
Cache-Control: must-revalidate

//============= Surrogate ->  Client
HTTP/1.0 200 OK
X-Powered-By: PHP/5.1.6
Last-Modified: Wed, 19 Oct 2011 11:00:00 GMT
Content-Length: 66
Content-Type: text/html; charset=UTF-8
Date: Thu, 20 Oct 2011 05:26:28 GMT
Server: Apache/2.2.3 (CentOS)
Cache-Control: must-revalidate
Warning: 110 squid/3.1.16 "Response is stale"
Warning: 111 squid/3.1.16 "Revalidation failed"

These warnings being present is a bug. The rest of the result is correct.

The max-age=10 requirement ("nothing more than 10 seconds stale") forces it to revalidate since the object it has is around 24hrs old.

The origin's must-revalidate also forces revalidation.

The reply to the client should not have the warnings, since the origin has indicated that the object is currently valid (304).

refresh_pattern is not relevant, since there is a Cache-Control header present. No estimations need to be made.
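
The decision described above can be sketched in Python as follows. This is a simplified model, not Squid's implementation: it covers only the client's max-age limit and the rule that an explicit response max-age is used instead of a heuristic estimate. All function names are hypothetical, and falling back to a heuristic value when the response has no max-age is an assumption of the sketch.

```python
def parse_cache_control(value):
    """Parse a Cache-Control value into {directive: argument-or-None}."""
    directives = {}
    for part in (value or "").split(","):
        name, _, arg = part.strip().partition("=")
        if name:
            directives[name.strip().lower()] = arg.strip().strip('"') or None
    return directives


def _seconds(directives, name):
    """Return the integer argument of a directive, or None if absent/invalid."""
    arg = directives.get(name)
    return int(arg) if arg is not None and arg.isdigit() else None


def needs_revalidation(age, request_cc, response_cc, heuristic_lifetime):
    """Decide whether a cached entry of the given age (seconds) must be revalidated."""
    req = parse_cache_control(request_cc)
    resp = parse_cache_control(response_cc)

    # The client refuses anything older than its max-age.
    client_limit = _seconds(req, "max-age")
    if client_limit is not None and age > client_limit:
        return True

    # An explicit lifetime from the origin wins over any estimate.
    explicit = _seconds(resp, "max-age")
    lifetime = explicit if explicit is not None else heuristic_lifetime
    return age >= lifetime
```

For the transaction in this thread, an entry roughly a day old requested with "Cache-Control: max-age=10" needs revalidation regardless of what lifetime was estimated for it.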


while what I am looking for is adapting every
response before it leaves Squid for the client. 'Cache-Control:
max-age=0' will force revalidation every response, though.

Otherwise known as "force reload".
It forces full erasure and a full new fetch on every request, not
revalidation.

Let's make it clear. Is it 'Cache-Control: max-age=0' as a request
header that forces full erasure,

No. From the client it simply means revalidate immediately. AND pass on the max-age=0 to origin.

Erasure is a side effect of Squid receiving a 200 reply from the revalidation check. Nothing more. It is very likely to change when multiple variants are cached.
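
A small Python sketch of the cache side may make the distinction concrete: a 304 keeps the stored body and refreshes headers, while a 200 replaces the stored object, which is where the "erasure" comes from. This is illustrative only and not how Squid stores objects. The function name, the dict layout, and the assumption that header names use consistent capitalization are all made up for the example.

```python
# Headers that describe a single connection and must not be stored or merged.
HOP_BY_HOP = {
    "connection", "keep-alive", "proxy-authenticate", "proxy-authorization",
    "te", "trailers", "transfer-encoding", "upgrade",
}


def apply_revalidation(cached, reply_status, reply_headers, reply_body=b""):
    """Return the cache entry to keep after a revalidation reply (hypothetical)."""
    end_to_end = {
        name: value for name, value in reply_headers.items()
        if name.lower() not in HOP_BY_HOP
    }
    if reply_status == 304:
        # Still valid: keep the cached body, let the new headers override old ones.
        merged = dict(cached["headers"])
        merged.update(end_to_end)
        return {"headers": merged, "body": cached["body"]}
    if reply_status == 200:
        # Changed: the old object is replaced wholesale by the new one.
        return {"headers": end_to_end, "body": reply_body}
    raise ValueError(f"unhandled revalidation status: {reply_status}")
```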


while 'Cache-Control: max-age=0' as a
response header simply marks pre-expiration, and Squid feels free to
store a pre-expired response and validate it later when serving the next
request?

That is correct.
You will just need to check that the Squid release is one of the recent ones which cache pre-expired content. Some earlier ones did not.



Looks like a "Surrogate-Control: max-age=0, revalidate" header
eliminates the need for a filter module in this case? Not sure about
"Surrogate-Control: revalidate", since it is not listed in the Edge
Architecture Specification, http://www.w3.org/TR/edge-arch, referred to
by http://wiki.squid-cache.org/Features/Surrogate.

Squid ignores unknown ones presently. If you need, it can be extended.
Although, if you go with max-age=0, revalidate is redundant.


I also happened to read about ESI, which really resembles class 4
adaptation with limited capability, in that it only modifies the response
body. It looks incapable of doing custom complex calculations. So Squid
does not support class 4 adaptation in general? Any other alternative?

ESI, yes, is good for personalization of the body. It does not exactly do
calculations. It does widget insertion into pages for personalization at
the gateway machine, allowing caching of the page template and widgets
separately within a CDN.

You were talking about personalizing Cookies etc., which are not part of the
body content.

Sure. A side question: when a surrogate fetches an ESI widget, will it
carry request headers from the client (assuming the widget is in the same
domain as the page) and inject response headers before the page is
served to the client?


I don't think so. It is just a form of body/object macro-expansion. With some fancy bits for determining which widget to insert.
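
To show what "macro-expansion" means here, this is a toy Python sketch that replaces self-closing esi:include tags in a page template with fetched fragments. It is not Squid's ESI processor and handles only the simplest form of the tag. The names expand_esi and fetch_fragment are hypothetical, and fetch_fragment stands in for whatever would retrieve the widget body.

```python
import re

# Matches only the simplest form: <esi:include src="..."/>
ESI_INCLUDE = re.compile(r'<esi:include\s+src="([^"]*)"\s*/>')


def expand_esi(template, fetch_fragment):
    """Replace each include tag with the text returned by fetch_fragment(src)."""
    return ESI_INCLUDE.sub(lambda match: fetch_fragment(match.group(1)), template)


# Usage: the widget store is a plain dict standing in for cached fragments.
widgets = {"/widgets/greeting": "Hello, Kaiwang"}
page = '<p><esi:include src="/widgets/greeting"/>!</p>'
print(expand_esi(page, lambda src: widgets[src]))  # prints <p>Hello, Kaiwang!</p>
```

Only the body text is substituted. Nothing in this model touches the headers of the page response, which matches the answer above.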


So Squid without the adapter will cache one copy of responses in only
one encoding.

Yes.

> Will "Vary:Accept-Encoding" request header enable
multiple copies?

No. It tells Squid there are multiple variants with the same URL, and to check the Accept-Encoding header against the one stored already when deciding if it is a HIT.
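
The check described above can be sketched in Python like this: compare, for every header named in the stored response's Vary, the value in the new request against the value in the request that produced the stored copy. It is a simplified illustration with a hypothetical function name, not Squid's variant handling, and it compares header values as plain strings after whitespace normalization.

```python
def variant_matches(vary, stored_request_headers, new_request_headers):
    """Return True if the stored response may be a HIT for the new request."""
    if vary.strip() == "*":
        return False  # "Vary: *" can never be matched from the request alone

    def lowered(headers):
        # Header names are case-insensitive; collapse whitespace in values.
        return {name.lower(): " ".join(value.split()) for name, value in headers.items()}

    stored = lowered(stored_request_headers)
    new = lowered(new_request_headers)
    for name in vary.split(","):
        name = name.strip().lower()
        if name and stored.get(name) != new.get(name):
            return False
    return True
```

With "Vary: Accept-Encoding", a copy stored for a request that sent "Accept-Encoding: gzip" matches another gzip request but not one that sent "identity" or no Accept-Encoding at all.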


Amos
--
Please be using
  Current Stable Squid 2.7.STABLE9 or 3.1.16
  Beta testers wanted for 3.2.0.13

