On 11/07/2013 11:40 p.m., Eliezer Croitoru wrote:
I have been testing the cachability of some URLs for quite some time.
It seems there are different ways to request the same file, which lead
to different reactions in Squid, and I want to be 100% sure of the cause
of the *problem* before I jump to a conclusion, since I am not 100% sure.
Please take your *free* time to read it and see if there is something I
probably missed, with the hope of understanding the issue at hand.
As you probably noticed from the final diagnosis of the last weird case
you brought up, it can be important to consider both request/reply
pairs: client<->Squid and Squid<->server. Either side of
Squid can affect the overall transaction behaviour...
Can you state exactly what the problem is up front? That is a little
unclear from your text.
Thanks Ahead,
Eliezer
I have tried wget/curl as well as Firefox and Chrome, which give me
different reactions from Squid, and I want to be sure of the cause.
Using two simple wget requests I get:
1373541195.850 743 192.168.10.124 TCP_MISS/200 85865 GET
http://image.slidesharecdn.com/glusterorgwebinarant-120126131226-phpapp01/95/slide-29-728.jpg?132986699
- HIER_DIRECT/88.221.156.163 image/jpeg
1373541220.437 4 192.168.10.124 TCP_MEM_HIT/200 85737 GET
http://image.slidesharecdn.com/glusterorgwebinarant-120126131226-phpapp01/95/slide-29-728.jpg?132986699
- HIER_NONE/- image/jpeg
which is a successful cache HIT.
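For reference, the native-format access.log lines above can be pulled apart with a short script. This is a sketch assuming the default "squid" logformat (timestamp, elapsed ms, client, result/status, bytes, method, URL, user, hierarchy, type); a custom logformat would need different field positions.

```python
# Sketch: split a Squid native-format access.log line into its fields.
# Assumes the default "squid" logformat, as seen in the lines above.

def parse_access_line(line):
    parts = line.split()
    return {
        "timestamp": float(parts[0]),   # seconds since epoch
        "duration_ms": int(parts[1]),   # response time in milliseconds
        "client": parts[2],
        "result_status": parts[3],      # e.g. TCP_MISS/200, TCP_MEM_HIT/200
        "bytes": int(parts[4]),
        "method": parts[5],
        "url": parts[6],
        "user": parts[7],
        "hierarchy": parts[8],          # e.g. HIER_DIRECT/88.221.156.163
        "content_type": parts[9],
    }

line = ("1373541220.437 4 192.168.10.124 TCP_MEM_HIT/200 85737 GET "
        "http://image.slidesharecdn.com/example.jpg - HIER_NONE/- image/jpeg")
entry = parse_access_line(line)
print(entry["result_status"])  # TCP_MEM_HIT/200
```

Checking the `result_status` field this way is a quick method to separate HITs from MISSes when comparing client behaviours.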
in this request the headers are:
---------
GET
http://image.slidesharecdn.com/glusterorgwebinarant-120126131226-phpapp01/95/slide-11-728.jpg?1329866994
HTTP/1.1
User-Agent: Wget/1.14 (linux-gnu)
Accept: */*
Host: image.slidesharecdn.com
Connection: Close
Proxy-Connection: Keep-Alive
Bug #1 (in the client): Proxy-Connection is an obsolete header, but in no
sane system should it ever directly contradict the Connection header
like that.
----------
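The contradiction is easy to see if you assemble the proxy-form request by hand. A minimal sketch, using the hostname and header values from the trace above (the shortened path is a placeholder):

```python
# Sketch: build the proxy-form request that wget sent, highlighting that
# "Connection: Close" and "Proxy-Connection: Keep-Alive" contradict each other.

def build_request(url, host):
    headers = [
        ("User-Agent", "Wget/1.14 (linux-gnu)"),
        ("Accept", "*/*"),
        ("Host", host),
        ("Connection", "Close"),             # asks the proxy to close...
        ("Proxy-Connection", "Keep-Alive"),  # ...while asking it to stay open
    ]
    lines = ["GET %s HTTP/1.1" % url]
    lines += ["%s: %s" % (name, value) for name, value in headers]
    return "\r\n".join(lines) + "\r\n\r\n"

req = build_request("http://image.slidesharecdn.com/example.jpg",
                    "image.slidesharecdn.com")
print(req)
```

Sending both headers with opposite values leaves the proxy to pick which one to honour, which is why this counts as a client-side bug.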
The response is:
---------
HTTP/1.1 200 OK
x-amz-id-2: wQGOvCvBOH4nVmOEbu1UMJ+Kxv4a4v/9oGpyWnIYy8WRtBL6ZAx2yQtZ0T5u3sfr
x-amz-request-id: 2F83F33589002A74
Last-Modified: Wed, 08 Aug 2012 08:30:58 GMT
x-amz-version-id: _9hthq6oqnMYSuZCVxGCF1sN5VJtYebW
ETag: "cd5970b95914bd43a88a021b78d2f67b"
Content-Type: image/jpeg
Server: AmazonS3
Cache-Control: max-age=31536000
Date: Thu, 11 Jul 2013 11:18:48 GMT
X-Cache: MISS from www1.home
X-Cache-Lookup: MISS from www1.home:3128
Transfer-Encoding: chunked
Connection: keep-alive
Is this the Squid response sent to the client after the above request?
If so, that would be Bug #2: the client explicitly sent "Connection: Close"
and Squid should be obeying that.
If it is the server->Squid response there is no bug, as server-connection
persistence is separate.
----------
and the second time:
---------
HTTP/1.1 200 OK
x-amz-id-2: wQGOvCvBOH4nVmOEbu1UMJ+Kxv4a4v/9oGpyWnIYy8WRtBL6ZAx2yQtZ0T5u3sfr
x-amz-request-id: 2F83F33589002A74
Last-Modified: Wed, 08 Aug 2012 08:30:58 GMT
x-amz-version-id: _9hthq6oqnMYSuZCVxGCF1sN5VJtYebW
ETag: "cd5970b95914bd43a88a021b78d2f67b"
Content-Type: image/jpeg
Server: AmazonS3
Cache-Control: max-age=31536000
Date: Thu, 11 Jul 2013 11:18:48 GMT
Age: 347
X-Cache: HIT from www1.home
X-Cache-Lookup: HIT from www1.home:3128
Transfer-Encoding: chunked
Connection: keep-alive
----------
Which makes it a HIT.
So nothing seems wrong with the application server's way of doing
things, nor with basic Squid internals.
With Chrome and Firefox, however, there is something different in the request:
---------
GET
/glusterorgwebinarant-120126131226-phpapp01/95/slide-4-728.jpg?1329866994 HTTP/1.1
Host: image.slidesharecdn.com
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
User-Agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML,
like Gecko) Chrome/28.0.1500.71 Safari/537.36
Accept-Encoding: gzip,deflate,sdch
Accept-Language: en-US,en;q=0.8
Cache-Control: max-age=4794000
Connection: keep-alive
----------
which results in this response:
---------
HTTP/1.1 200 OK
x-amz-id-2: VmcmoZnkiG7I/OEc+VJxJJKS7fnsu+BCqEw4NqVuMC7ckHl+DEYidi4P1d1vflRK
x-amz-request-id: BC59D681FF091B4E
Last-Modified: Wed, 08 Aug 2012 08:30:56 GMT
x-amz-version-id: kCNUG8l6HMz03fgYIbYHlsGJmzD3CplD
ETag: "4a351b56fb96496224d67ae752c75386"
Accept-Ranges: bytes
Content-Type: image/jpeg
Server: AmazonS3
Vary: Accept-Encoding
Bug #3 (in the server): a Vary header has suddenly appeared. It should be
sent on all responses for this URL regardless of whether the variant
headers existed in the client request.
This object will be cached under the store location:
hash(URL)+hash("gzip,deflate,sdch")
Content-Encoding: gzip
Cache-Control: max-age=31536000
Date: Thu, 11 Jul 2013 11:26:53 GMT
Content-Length: 48511
X-Cache: MISS from www1.home
X-Cache-Lookup: MISS from www1.home:3128
Connection: keep-alive
---------
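The store-location note above (hash(URL)+hash("gzip,deflate,sdch")) can be sketched as follows. This is only an illustration of the Vary concept, not Squid's actual key algorithm, and the MD5 choice and "+" join are assumptions for the sketch:

```python
# Sketch of Vary-based store keys, following the hash(URL)+hash(header)
# description above. Illustration only; not Squid's real key computation.
import hashlib

def store_key(url, vary_headers, request_headers):
    key = hashlib.md5(url.encode()).hexdigest()
    for name in vary_headers:                  # names listed in Vary:
        value = request_headers.get(name, "")  # absent header -> empty string
        key += "+" + hashlib.md5(value.encode()).hexdigest()
    return key

url = "http://image.slidesharecdn.com/example/slide-4-728.jpg?1329866994"
chrome = store_key(url, ["Accept-Encoding"],
                   {"Accept-Encoding": "gzip,deflate,sdch"})
wget = store_key(url, ["Accept-Encoding"], {})  # wget sent no Accept-Encoding
print(chrome != wget)  # True
```

The point is that once Vary: Accept-Encoding is in play, a wget fetch and a Chrome fetch land on different variant entries of the same URL, so one client's MISS does not prime a HIT for the other.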
while the next Chrome request is answered with a 304:
---------
HTTP/1.1 304 Not Modified
Content-Type: image/jpeg
Last-Modified: Wed, 08 Aug 2012 08:30:56 GMT
ETag: "4a351b56fb96496224d67ae752c75386"
Cache-Control: max-age=31536000
Date: Thu, 11 Jul 2013 11:29:00 GMT
Connection: keep-alive
Vary: Accept-Encoding
Object is cacheable for a while and Chrome is requesting using
"Accept-Encoding:gzip,deflate,sdch" which allows it to locate the Vary
cached object at hash(URL)+hash("gzip,deflate,sdch").
There is no "Server:" header in the 304 to indicate which service
produced the reply.
----------
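The 304 exchange above follows the usual If-None-Match revalidation logic; a minimal sketch of the decision the responding service makes, with the ETag value taken from the trace:

```python
# Sketch: how a server (or cache) decides between 304 and 200 when a
# client revalidates with If-None-Match.

def revalidate(stored_etag, request_headers):
    inm = request_headers.get("If-None-Match")
    if inm is not None and stored_etag in [t.strip() for t in inm.split(",")]:
        return 304  # entity unchanged: send headers only, no body
    return 200      # send the full entity again

etag = '"4a351b56fb96496224d67ae752c75386"'
print(revalidate(etag, {"If-None-Match": etag}))  # 304
print(revalidate(etag, {}))                       # 200
```

A 304 carries no body, which is why identifying who generated it (origin or cache) matters when reading the trace: the only evidence is in headers such as X-Cache.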
<...>
HTTP Client REPLY:
---------
HTTP/1.1 304 Not Modified
Content-Type: image/jpeg
Last-Modified: Wed, 08 Aug 2012 08:30:56 GMT
ETag: "4a351b56fb96496224d67ae752c75386"
Cache-Control: max-age=31536000
Date: Thu, 11 Jul 2013 11:29:00 GMT
Vary: Accept-Encoding
X-Cache: MISS from www1.home
X-Cache-Lookup: MISS from www1.home:3128
Connection: keep-alive
----------
So the application server produced the 304, and not Squid, since Squid
is obliged to respond with a valid HTTP response.
So Chrome verifies that its local cache is valid, and that is fine.
The next scenario is when Chrome forces no-cache in the Cache-Control
header.
---------
GET
/glusterorgwebinarant-120126131226-phpapp01/95/slide-4-728.jpg?1329866994 HTTP/1.1
Host: image.slidesharecdn.com
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Pragma: no-cache
User-Agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML,
like Gecko) Chrome/28.0.1500.71 Safari/537.36
Accept-Encoding: gzip,deflate,sdch
Accept-Language: en-US,en;q=0.8
Cache-Control: no-cache
Connection: keep-alive
----------
<...>
HTTP Server REPLY:
---------
HTTP/1.1 200 OK
x-amz-id-2: VmcmoZnkiG7I/OEc+VJxJJKS7fnsu+BCqEw4NqVuMC7ckHl+DEYidi4P1d1vflRK
x-amz-request-id: BC59D681FF091B4E
Last-Modified: Wed, 08 Aug 2012 08:30:56 GMT
x-amz-version-id: kCNUG8l6HMz03fgYIbYHlsGJmzD3CplD
ETag: "4a351b56fb96496224d67ae752c75386"
Accept-Ranges: bytes
Content-Type: image/jpeg
Server: AmazonS3
Vary: Accept-Encoding
Content-Encoding: gzip
Content-Length: 48511
Cache-Control: max-age=31536000
Date: Thu, 11 Jul 2013 11:34:03 GMT
Connection: keep-alive
----------
I am not sure there is an issue, but I want to debug it if there is one.
The request should have been served from cache, since the refresh_pattern
is pretty explicit about it.
I explicitly made the "ignore-no-cache" refresh_pattern option obsolete
when upgrading the no-cache support in Squid-3.2 and later, for several
reasons:
1) It never actually applied to a request-header no-cache like that one.
The http_port "ignore-cc" option available to reverse-proxy
installations does that, with varied success or problems
depending on the site behaviour.
2) no-cache in the reply means revalidate with the server before using
the cached copy. Revalidation ensures accurate content is delivered,
without much bandwidth usage when the server has IMS request support.
3) no-cache permits responses fetched by authenticated users to be
cached. Ignoring the no-cache revalidation requirement in that situation
is a bit dangerous, while ignoring no-cache and making those
authenticated responses uncacheable again contradicts the common usage
of "ignore-no-cache" to increase caching of objects.
4) refresh_pattern was widely used on older Squid to enable caching
of responses with "no-cache" in them, since those versions would treat
it as an alternative to "no-store". Now that Squid treats it as an
alternative to "must-revalidate", the benefit to those caches is gone.
The behaviour change seen by them would simply be the enhanced dangerous
side effects of (3).
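Point (2) above, the shift from treating a response "no-cache" like "no-store" to treating it like "must-revalidate", can be sketched like this (an illustration of the semantics only, not Squid source):

```python
# Sketch: the semantic change described above. Older Squid treated a
# response "Cache-Control: no-cache" like no-store (never cache);
# Squid-3.2+ treats it like must-revalidate (cache, but revalidate
# with the server before each reuse).

def cache_handling(response_cc, modern=True):
    directives = [d.strip() for d in response_cc.lower().split(",")]
    if "no-store" in directives:
        return "do not cache"
    if "no-cache" in directives:
        return "cache, revalidate before use" if modern else "do not cache"
    return "cache normally"

print(cache_handling("max-age=31536000, no-cache", modern=False))
# do not cache
print(cache_handling("max-age=31536000, no-cache", modern=True))
# cache, revalidate before use
```

Under the modern interpretation the object stays in the cache and each reuse costs only a cheap conditional request, which is why "ignore-no-cache" no longer buys anything.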
I know how to read the headers and what is supposed to happen, but I am
a bit confused and unable to reach the right conclusion about the root
cause of why Squid treats the wget requests differently from the Chrome
requests.
Any new point of view will help me.
Probably the server's lack of Vary on the wget responses. If not that,
then the max-age requirement sent in by Chrome on its fetches.
It could also be Bug #4, in Squid: the Vary being seen as a MISS in
recent Squid releases, which we have not yet dug out of the sources.
Amos