2011/10/20 Amos Jeffries <squid3@xxxxxxxxxxxxx>:
> On 20/10/11 20:11, Kaiwang Chen wrote:
>>
>> 2011/10/20 Amos Jeffries <squid3@xxxxxxxxxxxxx>:
>>>
>>> On Thu, 20 Oct 2011 00:39:32 +0800, Kaiwang Chen wrote:
>>>>
>>>> 2011/10/19 Amos Jeffries:
>>>>>
>>>>> On Wed, 19 Oct 2011 05:15:22 +0800, Kaiwang Chen wrote:
>>>
>>> <snip>
>>>>>
>>>>> To only change the HTTP headers, there are some tricks you can do
>>>>> with the "must-revalidate" and/or "proxy-revalidate" cache
>>>>> controls. These controls cause the surrogate to contact the
>>>>> origin web server on every request. The origin can send back new
>>>>> headers on a 304 not-modified response. Meaning the headers get
>>>>> changed per-response, but the cached body gets sent only when
>>>>> actually changed. Retaining most of the bandwidth and performance
>>>>> benefits of caching.
>>>>
>>>> So, the possible solution could be injecting a "Cache-Control:
>>>> must-revalidate" header by some eCap reqmod_precache service, then
>>>> Squid will revalidate the response on every request carrying new
>>>> request headers, then the origin server has its chance to set new
>>>> response headers? A little counter-intuitive workaround for class 4
>>>> adaptation. Not perfect, since revalidation occurs only when the
>>>> response is stale,
>>>
>>> That would be 'normal' revalidation operation. Which is why the
>>> control exists and is called must-revalidate. To override the normal
>>> operation and force revalidation on every request.
>>>
>>> You could set it in a filter module altering the headers. And repeat
>>> the setup on every proxy surrogate as you expand the CDN. It is far
>>> easier to send it from the origin, which is designed to set these
>>> controls very efficiently and scales perfectly.
>>>
>>
>> So which header forces revalidation on every request? Is it the cache
>> response directive "Cache-Control: max-age=0, must-revalidate"?
>>
>> Referring to this section of RFC 2616,
>> http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.9,
>> 'Cache-Control: must-revalidate' is a cache-response-directive, and as
>> cited:
>>
>>    When the must-revalidate directive is present in a response
>>    received by a cache, that cache MUST NOT use the entry after it
>>    becomes stale to respond to a subsequent request without first
>>    revalidating it with the origin server. ... In all circumstances an
>>    HTTP/1.1 cache MUST obey the must-revalidate directive; in
>>    particular, if the cache cannot reach the origin server for any
>>    reason, it MUST generate a 504 (Gateway Timeout) response.
>>
>> Well, I have some trouble understanding the following transaction,
>> where the cached response was stale from the client's perspective.
>> Squid really did revalidate and got a 304 (success, isn't it?);
>> however, the client still got a "Revalidation failed" warning... Squid
>> was configured with
>>
>>    refresh_pattern . 0 20% 4320
>>
>> which estimated a relatively long fresh period. And when the origin
>> server specifies "Cache-Control: max-age=0, must-revalidate", Squid
>> revalidates on each request and warns the client with "Revalidation
>> failed".
>>
>> //============= client -> surrogate
>> GET /cgi-bin/index.php HTTP/1.0
>> User-Agent: Wget/1.10.2 (Red Hat modified)
>> Accept: */*
>> Host: my.example.com
>> Connection: Keep-Alive
>> Cache-Control: max-age=10
>>
>> //============= surrogate -> origin server
>> GET /cgi-bin/index.php HTTP/1.1
>> If-Modified-Since: Wed, 19 Oct 2011 11:00:00 GMT
>> User-Agent: Wget/1.10.2 (Red Hat modified)
>> Accept: */*
>> Host: my.example.com
>> Via: 1.0 s0.example.com (squid/3.1.16)
>> X-Forwarded-For: x.x.x.x
>> Cache-Control: max-age=10
>> Connection: keep-alive
>>
>> //============= origin server -> surrogate
>> HTTP/1.1 304 Not Modified
>> Date: Thu, 20 Oct 2011 05:26:28 GMT
>> Server: Apache/2.2.3 (CentOS)
>> Connection: close
>> Cache-Control: must-revalidate
>>
>> //============= surrogate -> client
>> HTTP/1.0 200 OK
>> X-Powered-By: PHP/5.1.6
>> Last-Modified: Wed, 19 Oct 2011 11:00:00 GMT
>> Content-Length: 66
>> Content-Type: text/html; charset=UTF-8
>> Date: Thu, 20 Oct 2011 05:26:28 GMT
>> Server: Apache/2.2.3 (CentOS)
>> Cache-Control: must-revalidate
>> Warning: 110 squid/3.1.16 "Response is stale"
>> Warning: 111 squid/3.1.16 "Revalidation failed"
>
> These warnings being present is a bug. The rest of the result is correct.
>
> The max-age=10 requirement ("nothing more than 10 seconds stale") forces
> it to revalidate since the object it has is around 24hrs old.

No. max-age has nothing to do with the roughly 24-hour "age" (the
resource age: the amount of time since the resource was created or
modified, that is, since Wed, 19 Oct 2011 11:00:00 GMT). Instead, it is
compared to the response age (the amount of time since the origin server
served the response, that is, since Thu, 20 Oct 2011 05:26:28 GMT).
Although the object (resource) was around 24 hours old, the response
could be fresh as long as it had left the origin server within the last
10 seconds.

>
> The origins must-revalidate also forces revalidation.
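
To make the distinction concrete, here is a rough Python sketch of how I read the request-side max-age check. It is only an illustration under simplifying assumptions: the function names are mine, it uses only the Date header as the "apparent age", and it ignores the Age header and transit-time corrections that RFC 2616 section 13.2.3 describes. It is not taken from Squid.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def age_seconds(http_date: str, now: datetime) -> float:
    """Seconds elapsed between an HTTP date header value and `now`."""
    return max(0.0, (now - parsedate_to_datetime(http_date)).total_seconds())


def satisfies_request_max_age(date_header: str, max_age: int, now: datetime) -> bool:
    """Hypothetical helper: compare the *response* age (from Date) to max-age.

    Last-Modified (the resource age) plays no part in this comparison.
    """
    return age_seconds(date_header, now) <= max_age


# Values taken from the trace above; `now` is an assumed instant 5 s later.
now = datetime(2011, 10, 20, 5, 26, 33, tzinfo=timezone.utc)
date_header = "Thu, 20 Oct 2011 05:26:28 GMT"
last_modified = "Wed, 19 Oct 2011 11:00:00 GMT"

response_is_acceptable = satisfies_request_max_age(date_header, 10, now)
resource_age = age_seconds(last_modified, now)
```

With these inputs the response age is 5 seconds, so a request carrying max-age=10 would accept it, even though the resource itself was last modified many hours earlier.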
>
> The reply to the client should not have the warnings, since the origin
> has indicated that the object is currently valid (304).
>
> refresh_pattern is not relevant. Since there is a Cache-Control header
> present. No estimations need to be made.

Clear.

>
>>>
>>>> while what I am looking for is adapting every
>>>> response before it leaves Squid for the client. 'Cache-Control:
>>>> max-age=0' will force revalidation of every response, though.
>>>
>>> Otherwise known as "force reload".
>>> Forces full erasure and a full new fetch on every request. Not
>>> revalidation.
>>
>> Let's make it clear.. Is it 'Cache-Control: max-age=0' as a request
>> header that forces full erasure,
>
> No. From the client it simply means revalidate immediately. AND pass on
> the max-age=0 to origin.
>
> Erasure is a side effect of Squid receiving a 200 reply from the
> revalidation check. Nothing more. It is very likely to change when
> multiple variants are cached.

Great! It's clear now! Cited from
http://www.w3.org/Protocols/rfc2616/rfc2616-sec13.html#sec13.2.6

   We would like the client to use the most recently generated
   response, even if older responses are still apparently fresh.

>
>
>> while 'Cache-Control: max-age=0' as a
>> response header simply marks pre-expiration, and Squid feels free to
>> store a pre-expired response and validate it later when serving the
>> next request?
>
> That is correct.
> You will just need to check the Squid release is one of the recent ones
> which cache pre-expired content. Some earlier ones did not.

So the latest stable should cache pre-expired content? Do you have any
idea in which release that behavior was introduced? I have yet to verify
this thoroughly, but it looks like some version of Resin will carry
"Cache-Control: no-cache" with each response.
It is said that "Cache-Control: no-cache" is equal to "Cache-Control:
max-age=0"; I guess it had better carry "Cache-Control: no-store"
instead, to avoid polluting the disk cache and to eliminate that I/O,
assuming the response is dynamic.

>
>
>>
>> Looks like a "Surrogate-Control: max-age=0, revalidate" header
>> eliminates the need for a filter module in this case? Not sure about
>> 'Surrogate-Control: revalidate', since it is not listed in the Edge
>> Architecture Specification, http://www.w3.org/TR/edge-arch, referred
>> to by http://wiki.squid-cache.org/Features/Surrogate.
>
> Squid ignores unknown ones presently. If you need, it can be extended.
> Although, if you go with max-age=0, revalidate is redundant.

How do I configure Squid-3.1.16 to behave as a surrogate conforming to
the Edge Architecture Specification, in particular with
"Surrogate-Control" overriding "Cache-Control"? I believe only the
following directives in squid.conf are related:

   http_port 80 vhost
   httpd_accel_surrogate_id proxy123.example.com

I made the response leave the origin server carrying both
"Surrogate-Control: max-age=61" and "Cache-Control: max-age=100", and
found that Squid revalidated only when the response was already 100
seconds old, rather than 61 seconds old. A packet capture shows that it
was not acting as a surrogate, because "Surrogate-Control: max-age=61"
leaked to the client. What am I missing?
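
To spell out the behaviour I expected, here is a rough Python sketch of my reading of the precedence rule: the surrogate takes its lifetime from Surrogate-Control when present, falls back to Cache-Control otherwise, and does not pass Surrogate-Control on to the client. The function names are hypothetical and the parsing is deliberately naive; this models my expectation only, not what Squid actually implements.

```python
import re
from typing import Dict, Optional


def parse_max_age(header_value: Optional[str]) -> Optional[int]:
    """Return the integer after max-age= in a control header, if any."""
    match = re.search(r"(?:^|[,\s])max-age\s*=\s*(\d+)", header_value or "")
    return int(match.group(1)) if match else None


def surrogate_lifetime(headers: Dict[str, str]) -> Optional[int]:
    """Expected rule: Surrogate-Control wins over Cache-Control."""
    lifetime = parse_max_age(headers.get("Surrogate-Control"))
    if lifetime is not None:
        return lifetime
    return parse_max_age(headers.get("Cache-Control"))


def headers_for_client(headers: Dict[str, str]) -> Dict[str, str]:
    """Expected rule: Surrogate-Control is consumed, not forwarded."""
    return {k: v for k, v in headers.items() if k.lower() != "surrogate-control"}


# Header values from my test response.
origin_headers = {
    "Surrogate-Control": "max-age=61",
    "Cache-Control": "max-age=100",
}
expected_lifetime = surrogate_lifetime(origin_headers)
forwarded = headers_for_client(origin_headers)
```

Under this reading the surrogate would revalidate after 61 seconds and the client would see only "Cache-Control: max-age=100", which is not what the capture shows.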
//============== client -> surrogate
GET /cgi-bin/index.php HTTP/1.0
User-Agent: Wget/1.10.2 (Red Hat modified)
Accept: */*
Host: www.zongheng.com
Connection: Keep-Alive
Cache-Control: no-cache

//=============== surrogate -> origin server
GET /cgi-bin/index.php HTTP/1.1
User-Agent: Wget/1.10.2 (Red Hat modified)
Accept: */*
Host: www.example.com
Via: 1.0 s0.example.com (squid/3.1.16)
Surrogate-Capability: proxy123.example.com="Surrogate/1.0 ESI/1.0"
X-Forwarded-For: x.x.x.x
Cache-Control: no-cache
Connection: keep-alive

//================ origin server -> surrogate
HTTP/1.1 200 OK
Date: Thu, 20 Oct 2011 19:29:55 GMT
Server: Apache/2.2.3 (CentOS)
X-Powered-By: PHP/5.1.6
Surrogate-Control: max-age=61
Cache-Control: max-age=100
Last-Modified: Wed, 19 Oct 2011 11:00:00 GMT
Content-Length: 66
Connection: close
Content-Type: text/html; charset=UTF-8

<h1>It works!</h1><pre>Last-Modified: Wed, 19 Oct 2011 11:00:00GMT

//=============== surrogate -> client
HTTP/1.0 200 OK
Date: Thu, 20 Oct 2011 19:29:55 GMT
Server: Apache/2.2.3 (CentOS)
X-Powered-By: PHP/5.1.6
Surrogate-Control: max-age=61
Cache-Control: max-age=100
Last-Modified: Wed, 19 Oct 2011 11:00:00 GMT
Content-Length: 66
Content-Type: text/html; charset=UTF-8
X-Cache: MISS from s0.example.com
X-Cache-Lookup: HIT from s0.example.com:80
Via: 1.0 s0.example.com (squid/3.1.16)
Connection: keep-alive

<h1>It works!</h1><pre>Last-Modified: Wed, 19 Oct 2011 11:00:00GMT

>
>>>>
>>>> I also happened to read about ESI, which really resembles class 4
>>>> adaptation with limited capability, in that it only modifies the
>>>> response body. It looks incapable of doing custom complex
>>>> calculation. So Squid does not support class 4 adaptation in
>>>> general? Any other alternative?
>>>
>>> ESI, yes is good for personalization of the body. It does not exactly
>>> do calculations. It does widget insertion into pages for
>>> personalization at the gateway machine. Allowing caching of the page
>>> template and widgets separately within a CDN.
>>>
>>> You were talking about personalizing Cookies etc, which are not part
>>> of the body content.
>>
>> Sure. A side question: when a surrogate fetches an ESI widget, will it
>> carry request headers from the client (assuming the widget is in the
>> same domain as the page) and inject response headers before the page
>> is served to the client?
>>
>
> I don't think so. It is just a form of body/object macro-expansion. With
> some fancy bits for determining which widget to insert.

Clear.

>
>>
>> So Squid without the adapter will cache one copy of responses in only
>> one encoding.
>
> Yes.
>
>> Will "Vary:Accept-Encoding" request header enable
>> multiple copies?
>
> No. It tells Squid there are multiple variants with the same URL, and to
> check the Accept-Encoding header against the one stored already when
> deciding if it is a HIT.

Clear.

>
>
> Amos
> --
> Please be using
>   Current Stable Squid 2.7.STABLE9 or 3.1.16
>   Beta testers wanted for 3.2.0.13
>

Thanks,
Kaiwang
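
P.S. For the archive, a rough Python sketch of how I understand the Vary check described above: the cache remembers the request header values named by Vary for the stored copy and compares them with the new request before declaring a HIT. The function names are my own, the comparison is a plain string match, and nothing here is taken from Squid's source.

```python
from typing import Dict, Tuple


def variant_key(vary: str, request_headers: Dict[str, str]) -> Tuple[Tuple[str, str], ...]:
    """Build a comparable key from the request headers that Vary names."""
    names = sorted(n.strip().lower() for n in vary.split(",") if n.strip())
    lowered = {k.lower(): v.strip() for k, v in request_headers.items()}
    return tuple((name, lowered.get(name, "")) for name in names)


def is_hit(vary: str, stored_request: Dict[str, str], new_request: Dict[str, str]) -> bool:
    """HIT only if the new request matches the stored one on every Vary header."""
    return variant_key(vary, stored_request) == variant_key(vary, new_request)


stored = {"Accept-Encoding": "gzip"}
same_encoding = is_hit("Accept-Encoding", stored, {"accept-encoding": "gzip"})
other_encoding = is_hit("Accept-Encoding", stored, {"Accept-Encoding": "identity"})
```

Here a second gzip request matches the stored copy, while a request asking for identity does not and would have to go back to the origin.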