2011/10/20 Amos Jeffries <squid3@xxxxxxxxxxxxx>:
> On Thu, 20 Oct 2011 00:39:32 +0800, Kaiwang Chen wrote:
>>
>> 2011/10/19 Amos Jeffries:
>>>
>>> On Wed, 19 Oct 2011 05:15:22 +0800, Kaiwang Chen wrote:
>
> <snip>
>>>
>>> To only change the HTTP headers, there are some tricks you can do with
>>> the "must-revalidate" and/or "proxy-revalidate" cache control. These
>>> controls causes the surrogate to contact the origin web server on every
>>> request. The origin can send back new headers on a 304 not-modified
>>> response. Meaning the headers get changed per-response, but the cached
>>> body gets sent only when actually changed. Retaining most of the
>>> bandwidth and performance benefits of caching.
>>
>> So, the possible solution could be injecting a "Cache-Control:
>> must-revalidate" header by some eCap reqmod_precache service, then
>> Squid will revalidate the response on every request carrying new
>> request headers, then the origin server has its chance to set new
>> response headers? A little counter-intuitive workaround for class 4
>> adaption. Not perfect, since revalidate only occurs only when the
>> response is stale,
>
> That would be 'normal' revalidation operation. Which is why the control
> exists and is called must-revalidate. To override the normal operation and
> force revalidation on every request.
>
> You could set it in a filter module altering the headers. And repeat the
> setup on every proxy surrogate as your expand the CDN. It is far easier to
> send it from the origin which is designed to do set these controls very
> efficiently and scales perfectly.
>

So which header forces revalidation on every request? Is it the cache
response directive "Cache-Control: max-age=0, must-revalidate"?
Referring to this section of RFC 2616,
http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.9,
'Cache-Control: must-revalidate' is a cache-response-directive, and as
cited:

    When the must-revalidate directive is present in a response received
    by a cache, that cache MUST NOT use the entry after it becomes stale
    to respond to a subsequent request without first revalidating it with
    the origin server.
    ...
    In all circumstances an HTTP/1.1 cache MUST obey the must-revalidate
    directive; in particular, if the cache cannot reach the origin server
    for any reason, it MUST generate a 504 (Gateway Timeout) response.

Well, I have some trouble understanding the following transaction, where
the cached response was stale from the client's perspective. Squid really
did revalidate and got a 304 (that counts as success, doesn't it?), yet
the client still got the "Revalidation failed" warning...

Squid was configured with

    refresh_pattern . 0 20% 4320

which estimated a relatively long freshness period. And when the origin
server specifies "Cache-Control: max-age=0, must-revalidate", Squid
revalidates on each request and warns the client with "Revalidation
failed".
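
For reference, the freshness decision discussed above can be sketched
roughly as follows. This is a simplified model under stated assumptions,
not Squid's actual implementation: the helper names are hypothetical, the
fetch time is made up for illustration, and the heuristic is reduced to
"percent of (Date - Last-Modified), clamped to the refresh_pattern min and
max". An explicit max-age from the origin is assumed to override the
heuristic.

```python
from datetime import datetime, timedelta


def freshness_lifetime(max_age, date, last_modified,
                       min_minutes=0, percent=0.20, max_minutes=4320):
    """Simplified freshness model (hypothetical helper, not Squid code).

    An explicit max-age wins; otherwise use `percent` of the object's age
    at fetch time, clamped to the refresh_pattern min/max (in minutes).
    """
    if max_age is not None:
        return timedelta(seconds=max_age)
    heuristic = (date - last_modified) * percent
    low = timedelta(minutes=min_minutes)
    high = timedelta(minutes=max_minutes)
    return max(low, min(heuristic, high))


def must_revalidate_now(current_age, lifetime):
    """RFC 2616 13.2.4: a response is fresh only while lifetime > age."""
    return current_age >= lifetime


last_modified = datetime(2011, 10, 19, 11, 0, 0)
fetched = datetime(2011, 10, 20, 5, 0, 0)  # hypothetical fetch time

# No explicit max-age: "refresh_pattern . 0 20% 4320" style heuristic.
heuristic = freshness_lifetime(None, fetched, last_modified)

# Origin sends "Cache-Control: max-age=0, must-revalidate".
explicit = freshness_lifetime(0, fetched, last_modified)

print("heuristic lifetime:", heuristic)
print("explicit lifetime: ", explicit)
print("revalidate at age 0 with max-age=0?",
      must_revalidate_now(timedelta(0), explicit))
```

Under this model, max-age=0 makes the entry stale the moment it is stored,
so must-revalidate applies to every request, whereas the heuristic alone
leaves the entry fresh for a while.
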
//============= client -> surrogate
GET /cgi-bin/index.php HTTP/1.0
User-Agent: Wget/1.10.2 (Red Hat modified)
Accept: */*
Host: my.example.com
Connection: Keep-Alive
Cache-Control: max-age=10

//============= surrogate -> origin server
GET /cgi-bin/index.php HTTP/1.1
If-Modified-Since: Wed, 19 Oct 2011 11:00:00 GMT
User-Agent: Wget/1.10.2 (Red Hat modified)
Accept: */*
Host: my.example.com
Via: 1.0 s0.example.com (squid/3.1.16)
X-Forwarded-For: x.x.x.x
Cache-Control: max-age=10
Connection: keep-alive

//============= origin server -> surrogate
HTTP/1.1 304 Not Modified
Date: Thu, 20 Oct 2011 05:26:28 GMT
Server: Apache/2.2.3 (CentOS)
Connection: close
Cache-Control: must-revalidate

//============= Surrogate -> Client
HTTP/1.0 200 OK
X-Powered-By: PHP/5.1.6
Last-Modified: Wed, 19 Oct 2011 11:00:00 GMT
Content-Length: 66
Content-Type: text/html; charset=UTF-8
Date: Thu, 20 Oct 2011 05:26:28 GMT
Server: Apache/2.2.3 (CentOS)
Cache-Control: must-revalidate
Warning: 110 squid/3.1.16 "Response is stale"
Warning: 111 squid/3.1.16 "Revalidation failed"
X-Cache: HIT from s0.example.com
X-Cache-Lookup: HIT from s0.example.com:80
Via: 1.0 s0.example.com (squid/3.1.16)
Connection: keep-alive

<h1>It works!</h1><pre>Last-Modified: Wed, 19 Oct 2011 11:00:00GMT

>
>> while what I am looking for is adapting every
>> response before it leaves Squid for the client. 'Cache-Control:
>> max-age=0' will force revalidation every response, though.
>
> Otherwise known as "force reload".
> Forces full erasure and new a full new fetch on every request. Not
> revalidation.

Let's make it clear. Is it 'Cache-Control: max-age=0' as a request header
that forces full erasure, while 'Cache-Control: max-age=0' as a response
header simply marks the response as pre-expired, so that Squid feels free
to store it and validate it later when serving the next request?

Looks like a "Surrogate-Control: max-age=0, revalidate" header eliminates
the need for a filter module in this case?
Not sure about "Surrogate-Control: revalidate", though, since it is not
listed in the Edge Architecture Specification,
http://www.w3.org/TR/edge-arch, referred to by
http://wiki.squid-cache.org/Features/Surrogate.

>
>>
>> I also chance read ESI which really resembles class 4 adaption with
>> limited capability that only modifies response body. Looks like it is
>> incapable of doing custom complex calculation. So Squid does not
>> support class 4 adaption in general? Any other alternative?
>
> ESI, yes is good for personalization of the body. It does not exactly do
> calculations. It does widget insertion in to pages for personalization at
> the gateway machine. Allowing caching of the page template and widgets
> separately within a CDN.
>
> You were taking about personalizing Cookies etc, which are not part of the
> body content.

Sure. A side question: when a surrogate fetches an ESI widget, will it
carry the request headers from the client (assuming the widget is in the
same domain as the page) and inject response headers before the page is
served to the client?

>
>>
>>>
>>> NP: this trick with 304 is only possible for headers which do not update
>>> headers with details about the particular body object. ie you can use it
>>> for altering Cookie values per-request, but not for changing the apparent
>>> Content-Encoding from gzip to deflate. For things affecting the body you
>>> use the normal 200 response and send the updated body as well.
>>
>> Sure.
>>
>> BTW, I tried the gzip compression adapter from
>> http://code.google.com/p/squid-ecap-gzip/, and found that after a
>> request carrying "Accept-Encoding: gzip", Squid always passes back
>> gzip'ed response to the client, even it no longer carries that header,
>> because the object is not modified. A request without gzip support and
>> with 'Cache-Control: no-cache' refreshes the cache to be always
>> returning plain text responses. Does it imply that Squid only caches
>> one copy of response, rather than one per each enconding?
>> How to make it serve other encoding different from the cached one?
>
> Sounds like the adapter is not working. What you describe is normal Squid
> behaviour without the adapter.
>
> IIRC the module was supposed to update the background requests to prefer
> gzipped, and itself do the un-zipping when an identity encoded response was
> required by the client.

So Squid without the adapter will cache only one copy of a response, in
a single encoding. Will a "Vary: Accept-Encoding" response header enable
multiple copies?

>
>
> Amos
>

Thanks,
Kaiwang
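
To illustrate what "one copy per encoding" would mean for the Vary
question above: a cache that honors Vary keys each stored variant on the
URL plus the values of the request headers the response named in Vary.
The sketch below is a minimal model of that idea only; `variant_key` is a
hypothetical helper, not Squid's store-key code, and the header values are
compared as plain strings.

```python
def variant_key(url, request_headers, vary):
    """Build a cache key from the URL and the request headers listed in Vary.

    Hypothetical helper: `vary` is the value of the response's Vary header,
    `request_headers` is a dict of the client's request headers.
    """
    names = sorted(n.strip().lower() for n in vary.split(",") if n.strip())
    lowered = {k.lower(): v.strip() for k, v in request_headers.items()}
    return (url, tuple((n, lowered.get(n, "")) for n in names))


url = "http://my.example.com/cgi-bin/index.php"
gzip_req = {"Accept-Encoding": "gzip"}
plain_req = {}  # client sends no Accept-Encoding at all

# With "Vary: Accept-Encoding" the two clients map to different variants.
with_vary = (variant_key(url, gzip_req, "Accept-Encoding"),
             variant_key(url, plain_req, "Accept-Encoding"))

# Without Vary both clients share a single cached copy.
without_vary = (variant_key(url, gzip_req, ""),
                variant_key(url, plain_req, ""))

print("separate variants with Vary:", with_vary[0] != with_vary[1])
print("shared copy without Vary:   ", without_vary[0] == without_vary[1])
```

In this model the gzip and identity clients only get separate cache
entries when the origin (or an adapter) actually emits the Vary header.
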