
Re: An example to squid cache affecting user-agents(Firefox,Chrome,wget\curl)

On 11/07/2013 11:40 p.m., Eliezer Croitoru wrote:
I have been testing the cachability of some URLs for quite some time.
It seems there are different methods of requesting the same file which
lead to different reactions in Squid, and I want to be 100% sure of the
cause of the *problem* before jumping to a conclusion, since I am not
100% certain.
Please take your *free* time to read it and see if there is something I
have probably missed, in the hope of understanding the issue at hand.

As you probably noticed from the final diagnosis of the last weird case you brought up, it can be important to consider both request/reply pairs: client->Squid and Squid->server. Either side of Squid can affect the overall transaction behaviour...

Can you state exactly what the problem is up front? That is a little unclear from your text.

Thanks Ahead,
Eliezer

I have tried using wget/curl versus Firefox and Chrome, which gave me
different reactions from Squid, and I want to be sure of the cause.
Using a simple wget with two requests I get:
1373541195.850    743 192.168.10.124 TCP_MISS/200 85865 GET
http://image.slidesharecdn.com/glusterorgwebinarant-120126131226-phpapp01/95/slide-29-728.jpg?132986699
- HIER_DIRECT/88.221.156.163 image/jpeg
1373541220.437      4 192.168.10.124 TCP_MEM_HIT/200 85737 GET
http://image.slidesharecdn.com/glusterorgwebinarant-120126131226-phpapp01/95/slide-29-728.jpg?132986699
- HIER_NONE/- image/jpeg


which is a successful cache HIT.
In this request the headers are:
---------
GET
http://image.slidesharecdn.com/glusterorgwebinarant-120126131226-phpapp01/95/slide-11-728.jpg?1329866994
HTTP/1.1
User-Agent: Wget/1.14 (linux-gnu)
Accept: */*
Host: image.slidesharecdn.com
Connection: Close
Proxy-Connection: Keep-Alive


Bug #1 (in the client): Proxy-Connection is an obsolete header, but on no sane system should it ever directly contradict the Connection header like that.
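That contradiction is easy to check for mechanically. A minimal sketch in Python (not Squid's actual code; `connection_headers_conflict` is a made-up helper):

```python
# Sketch: flag a request whose Connection and Proxy-Connection headers
# disagree on persistence, as in the wget trace above.
def connection_headers_conflict(headers):
    """Return True when Connection and Proxy-Connection disagree."""
    conn = headers.get("Connection", "").strip().lower()
    pconn = headers.get("Proxy-Connection", "").strip().lower()
    if not conn or not pconn:
        return False  # nothing to contradict
    return conn != pconn

wget_request = {
    "User-Agent": "Wget/1.14 (linux-gnu)",
    "Connection": "Close",
    "Proxy-Connection": "Keep-Alive",
}
print(connection_headers_conflict(wget_request))  # True for the wget trace above
```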

----------
The response is:
---------
HTTP/1.1 200 OK
x-amz-id-2: wQGOvCvBOH4nVmOEbu1UMJ+Kxv4a4v/9oGpyWnIYy8WRtBL6ZAx2yQtZ0T5u3sfr
x-amz-request-id: 2F83F33589002A74
Last-Modified: Wed, 08 Aug 2012 08:30:58 GMT
x-amz-version-id: _9hthq6oqnMYSuZCVxGCF1sN5VJtYebW
ETag: "cd5970b95914bd43a88a021b78d2f67b"
Content-Type: image/jpeg
Server: AmazonS3
Cache-Control: max-age=31536000
Date: Thu, 11 Jul 2013 11:18:48 GMT
X-Cache: MISS from www1.home
X-Cache-Lookup: MISS from www1.home:3128
Transfer-Encoding: chunked
Connection: keep-alive

Is this the Squid response sent to the client after the above request?
If so, that would be Bug #2: the client explicitly sent "Connection: close" and Squid should be obeying it. If it is the server->Squid response there is no bug, as connection persistence is negotiated separately on each hop.

----------
and on the second time
---------
HTTP/1.1 200 OK
x-amz-id-2: wQGOvCvBOH4nVmOEbu1UMJ+Kxv4a4v/9oGpyWnIYy8WRtBL6ZAx2yQtZ0T5u3sfr
x-amz-request-id: 2F83F33589002A74
Last-Modified: Wed, 08 Aug 2012 08:30:58 GMT
x-amz-version-id: _9hthq6oqnMYSuZCVxGCF1sN5VJtYebW
ETag: "cd5970b95914bd43a88a021b78d2f67b"
Content-Type: image/jpeg
Server: AmazonS3
Cache-Control: max-age=31536000
Date: Thu, 11 Jul 2013 11:18:48 GMT
Age: 347
X-Cache: HIT from www1.home
X-Cache-Lookup: HIT from www1.home:3128
Transfer-Encoding: chunked
Connection: keep-alive


----------
Which makes it a HIT.
So nothing seems wrong with the way the application server does things,
or with basic Squid internals.
With Chrome and Firefox, however, there is something different in the request:
---------
GET
/glusterorgwebinarant-120126131226-phpapp01/95/slide-4-728.jpg?1329866994 HTTP/1.1
Host: image.slidesharecdn.com
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
User-Agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML,
like Gecko) Chrome/28.0.1500.71 Safari/537.36
Accept-Encoding: gzip,deflate,sdch
Accept-Language: en-US,en;q=0.8
Cache-Control: max-age=4794000
Connection: keep-alive


----------
which results in the response:
---------
HTTP/1.1 200 OK
x-amz-id-2: VmcmoZnkiG7I/OEc+VJxJJKS7fnsu+BCqEw4NqVuMC7ckHl+DEYidi4P1d1vflRK
x-amz-request-id: BC59D681FF091B4E
Last-Modified: Wed, 08 Aug 2012 08:30:56 GMT
x-amz-version-id: kCNUG8l6HMz03fgYIbYHlsGJmzD3CplD
ETag: "4a351b56fb96496224d67ae752c75386"
Accept-Ranges: bytes
Content-Type: image/jpeg
Server: AmazonS3
Vary: Accept-Encoding

Bug #3 (in the server): a Vary header has suddenly appeared. It should be sent on all responses for this URL, regardless of whether the varied-on headers existed in the client request.

This object will be cached under the store location: hash(URL)+hash("gzip,deflate,sdch")

Content-Encoding: gzip
Cache-Control: max-age=31536000
Date: Thu, 11 Jul 2013 11:26:53 GMT
Content-Length: 48511
X-Cache: MISS from www1.home
X-Cache-Lookup: MISS from www1.home:3128
Connection: keep-alive


---------

while the next Chrome request is answered with a 304:
---------
HTTP/1.1 304 Not Modified
Content-Type: image/jpeg
Last-Modified: Wed, 08 Aug 2012 08:30:56 GMT
ETag: "4a351b56fb96496224d67ae752c75386"
Cache-Control: max-age=31536000
Date: Thu, 11 Jul 2013 11:29:00 GMT
Connection: keep-alive
Vary: Accept-Encoding

Object is cacheable for a while and Chrome is requesting using "Accept-Encoding:gzip,deflate,sdch" which allows it to locate the Vary cached object at hash(URL)+hash("gzip,deflate,sdch").

There is no "Server:" header in the 304 to indicate which service produced the 304 reply.
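The store location described above, hash(URL)+hash("gzip,deflate,sdch"), can be sketched roughly as follows. This is an illustrative scheme only, not Squid's real store-key algorithm; `vary_cache_key` is a hypothetical helper:

```python
# Sketch: a Vary response is keyed by a hash of the URL plus a hash of
# the request header values named in Vary ("Accept-Encoding" here).
import hashlib

def vary_cache_key(url, vary, request_headers):
    key = hashlib.sha1(url.encode()).hexdigest()
    for name in (v.strip() for v in vary.split(",")):
        value = request_headers.get(name, "")
        key += "+" + hashlib.sha1(value.encode()).hexdigest()
    return key

# Chrome's fetch lands under hash(URL)+hash("gzip,deflate,sdch"); a wget
# fetch with no Accept-Encoding would hash to a different store location.
chrome_key = vary_cache_key(
    "http://image.slidesharecdn.com/glusterorgwebinarant-120126131226-phpapp01/95/slide-4-728.jpg?1329866994",
    "Accept-Encoding",
    {"Accept-Encoding": "gzip,deflate,sdch"},
)
```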


----------
<...>
HTTP Client REPLY:
---------
HTTP/1.1 304 Not Modified
Content-Type: image/jpeg
Last-Modified: Wed, 08 Aug 2012 08:30:56 GMT
ETag: "4a351b56fb96496224d67ae752c75386"
Cache-Control: max-age=31536000
Date: Thu, 11 Jul 2013 11:29:00 GMT
Vary: Accept-Encoding
X-Cache: MISS from www1.home
X-Cache-Lookup: MISS from www1.home:3128
Connection: keep-alive


----------
So the application server produced the 304, and not Squid, since Squid
is obliged to relay a valid HTTP response.
So Chrome verified that its local cache is valid, and all is fine.

The next scenario is when Chrome forces no-cache in the Cache-Control
header.
---------
GET
/glusterorgwebinarant-120126131226-phpapp01/95/slide-4-728.jpg?1329866994 HTTP/1.1
Host: image.slidesharecdn.com
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Pragma: no-cache
User-Agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML,
like Gecko) Chrome/28.0.1500.71 Safari/537.36
Accept-Encoding: gzip,deflate,sdch
Accept-Language: en-US,en;q=0.8
Cache-Control: no-cache
Connection: keep-alive


----------
<...>
HTTP Server REPLY:
---------
HTTP/1.1 200 OK
x-amz-id-2: VmcmoZnkiG7I/OEc+VJxJJKS7fnsu+BCqEw4NqVuMC7ckHl+DEYidi4P1d1vflRK
x-amz-request-id: BC59D681FF091B4E
Last-Modified: Wed, 08 Aug 2012 08:30:56 GMT
x-amz-version-id: kCNUG8l6HMz03fgYIbYHlsGJmzD3CplD
ETag: "4a351b56fb96496224d67ae752c75386"
Accept-Ranges: bytes
Content-Type: image/jpeg
Server: AmazonS3
Vary: Accept-Encoding
Content-Encoding: gzip
Content-Length: 48511
Cache-Control: max-age=31536000
Date: Thu, 11 Jul 2013 11:34:03 GMT
Connection: keep-alive

----------
I am not sure, but I want to debug this issue if there is one.
The request should have been served from cache, since the refresh_pattern
is pretty explicit about it.

I explicitly made the "ignore-no-cache" refresh_pattern option obsolete when upgrading the no-cache support in Squid-3.2 and later.

For several reasons:
1) It never actually applied to a request-header no-cache like that one. The http_port "ignore-cc" option available to reverse proxy installations does that, with success or problems varying with the site's behaviour.

2) no-cache in the reply means revalidate with the server before using the cached copy. Revalidation ensures accurate content is delivered, and without much bandwidth usage when the server supports IMS (If-Modified-Since) requests.

3) no-cache permits responses fetched by authenticated users to be cached. Ignoring the no-cache revalidation requirement in that situation is a bit dangerous; ignoring no-cache and making those authenticated responses uncacheable again contradicts the common use of "ignore-no-cache" to increase caching of objects.

4) refresh_pattern was used widely on older Squid to enable caching of responses containing "no-cache", since those versions treated it as an alternative to "no-store". Now that Squid treats it as an alternative to "must-revalidate", the benefit to those caches is gone. The only behaviour change they would see is the enhanced dangerous side effect of (3).
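The revalidation in point (2) amounts to building a conditional request from the stored validators. A rough sketch, with `build_revalidation_headers` as a made-up helper:

```python
# Sketch: before reusing a cached response carrying "no-cache", a cache
# builds a conditional request from the stored ETag / Last-Modified.
# A 304 Not Modified answer means the cached body may be served;
# a 200 answer replaces the stored object.
def build_revalidation_headers(stored):
    headers = {}
    if "ETag" in stored:
        headers["If-None-Match"] = stored["ETag"]
    if "Last-Modified" in stored:
        headers["If-Modified-Since"] = stored["Last-Modified"]
    return headers

stored = {
    "ETag": '"4a351b56fb96496224d67ae752c75386"',
    "Last-Modified": "Wed, 08 Aug 2012 08:30:56 GMT",
}
print(build_revalidation_headers(stored))
```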


I know how to read the headers and what is supposed to happen, but I am
a bit confused and unable to reach the right conclusion about why Squid
treats the wget request differently from the Chrome requests.
Any new point of view would help me.

Probably the server's lack of Vary on the wget responses. If not that, then the max-age requirement Chrome sends on its fetches.

It could also be Bug #4 (in Squid): the Vary object being seen as a MISS in recent Squid releases. We have not yet dug that one out of the sources.

Amos



