On 28/06/2013 2:44 p.m., Reid Watson wrote:
Hi Amos
Answers to your questions
What exactly is this 504 error saying?
- Timeout issue to the backend servers
- The servers will be in an overload state because all the requests
are being sent to the backend
And why are the 200 status when they do occur taking so very long to
generate?
- The load balancer will be switching from PRD to DR because of PRD
overload state
Our Setup
-----> = traffic flow
Apache / Squid ----------> F5 Load balancer ------> Origin servers
(CMS System)
This configuration allows us to load-balance between the PRD and DR
servers (CMS)
- When the purge request is sent to the backend origin servers the CMS
system goes into an overload state
- The health checks on the Load balancer will fail and divert the
traffic from PRD to DR backend (PRD and DR are replicated environments)
Extra Info -- Purge Option
- We provide the University staff a PHP page to clear objects from the
Squid cache
Okay. It is probably time to re-evaluate why that system is being used,
and what for. Squid now has enough HTTP/1.1 protocol support that PURGE
should never be necessary.
Also, PURGE does not work reliably on any URL whose response carries a
Vary: header. The headers sent on the PURGE request are difficult to
match up with the original headers sent by the client who caused the
content to enter the cache in the first place, so the cached variants
(yes, multiple) are not reliably removed. Only the Vary indexing object
will be dropped; one more request for the URL will reinstate that and
make the non-removed copies visible again.
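To illustrate (these are hypothetical requests; the variants actually stored depend on what real clients have sent), a URL served with Vary: Accept-Encoding can hold several cached variants, and a single PURGE will not remove them all:

```text
# Variant 1 was stored for a client that sent:
GET /uoa/ HTTP/1.1
Host: www.test.auckland.ac.nz
Accept-Encoding: gzip, deflate

# Variant 2 was stored for a client that sent no Accept-Encoding at all.

# A PURGE whose headers match neither client only drops the Vary index:
PURGE /uoa/ HTTP/1.1
Host: www.test.auckland.ac.nz
```

The next GET for the URL rebuilds the index, and the un-removed variants become visible again.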
- The tool calls CURL and sends a Purge request
Code
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $value);
curl_setopt($ch, CURLOPT_PROXY, $proxyserver);
curl_setopt($ch, CURLOPT_CUSTOMREQUEST, "PURGE");
curl_setopt($ch, CURLOPT_TIMEOUT, 5);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_FRESH_CONNECT, true);
curl_setopt($ch, CURLOPT_HEADER, true);
curl_setopt($ch, CURLOPT_NOBODY, true);
$ok = curl_exec($ch);
curl_close($ch); // close the handle on both success and failure
if ($ok === false) { return false; }
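For reference, the request this cURL code emits through the proxy looks roughly like the following (a proxy-form request line; the exact extra headers cURL adds vary by version, so treat this as a sketch):

```text
PURGE http://www.test.auckland.ac.nz/uoa/ HTTP/1.1
Host: www.test.auckland.ac.nz
Accept: */*
```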
NOTE: for the below comments I assume your Squid is obeying the HTTP
protocol. If you have configured refresh_pattern to ignore or override
any of the protocol instructions then your proxy is not operating
correctly and any kind of strange behaviour may appear in the traffic.
What are the headers sent from the origin server to Squid?
*Cached Request from Local Workstation to Squid *
reidwatson:script rwat090$ curl -I http://www.test.auckland.ac.nz/uoa/
HTTP/1.1 200 OK
Date: Fri, 28 Jun 2013 02:21:45 GMT
Server: Apache-Coyote/1.1
Cache-Control: no-cache
Pragma: no-cache
no-cache in an HTTP response is just a shorter way of saying
"must-revalidate". Squid 3.2 and later may cache these and pass a
revalidation request to the origin server on every fetch before
considering the cached object usable. If the origin's copy ever
changes or disappears the cached one will be updated ... there is no
reason to ever need PURGE on this URL or any other using "no-cache".
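As a sketch (the header values here are illustrative, not taken from your origin), a response carrying no-cache plus a validator lets Squid keep a copy and cheaply revalidate it instead of refetching the whole body:

```text
HTTP/1.1 200 OK
Cache-Control: no-cache
Last-Modified: Fri, 28 Jun 2013 01:00:00 GMT

# On later requests Squid can revalidate with a conditional request:
GET /uoa/ HTTP/1.1
Host: www.test.auckland.ac.nz
If-Modified-Since: Fri, 28 Jun 2013 01:00:00 GMT

# A small 304 Not Modified reply then lets Squid serve the cached body.
```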
Expires: Thu, 01 Jan 1970 00:00:00 GMT
This object expired before Squid ever got it. Your Squid should not
be generating HITs on it to begin with, so PURGE should have no
effect whatsoever.
Content-Type: text/html;charset=UTF-8
Age: 77
X-Cache: HIT from webroute.test.auckland.ac.nz
X-Cache-Lookup: HIT from webroute.test.auckland.ac.nz:3128
Via: 1.0 webroute-test1.its.auckland.ac.nz (squid)
Set-Cookie: webroute=server.; path=/;
Vary: Accept-Encoding
see above about Vary.
Set-Cookie:
BIGipServer~Devtest~webroute-test-80_pool=310641930.20480.0000; path=/
My theory about why the behaviour has changed so much is that the new
Squid is probably obeying the no-cache directive and sending IMS
(If-Modified-Since) requests to the backend servers. That would raise
their load, and cause issues when the PURGE removed the stale backup
copy from cache.
IMO, you need to consider moving that Expires header to sometime in the
future to permit Squid to cache the response properly. Fixing this up
will also remove any need for config file hacks about ignoring Expires,
and reduce your public bandwidth.
Also, if the origin(s) can be overloaded by work generating or
validating the URL under your normal traffic rates removing that
no-cache would be a good idea. A fixed Expires header or max-age=N
control is a good replacement for frequently updated resources.
You may also find Cache-Control: stale-if-error=60 useful in this
case, to make Squid use its cached copy for 60 seconds after it becomes
stale if the origin returns a 5xx status code.
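Putting those two suggestions together, a replacement for the current no-cache plus 1970-Expires pair might look like this (the numbers are placeholders to tune for your content):

```text
Cache-Control: max-age=300, stale-if-error=60
```

With that, Squid serves the object from cache for 5 minutes without touching the origin, and keeps serving the stale copy for a further 60 seconds when the origin answers with a 5xx.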
*Request after PURGE -- Local Workstation to Squid then Backend server *
1372386101.882 0 10.1.1.1 TCP_MISS/200 295 PURGE http://www.test.auckland.ac.nz/uoa/ - NONE/- -
curl -I http://www.test.auckland.ac.nz/uoa/
HTTP/1.1 200 OK
Date: Fri, 28 Jun 2013 02:21:45 GMT
Server: Apache-Coyote/1.1
Cache-Control: no-cache
Pragma: no-cache
Expires: Thu, 01 Jan 1970 00:00:00 GMT
Content-Type: text/html;charset=UTF-8
X-Cache: MISS from webroute.test.auckland.ac.nz
X-Cache-Lookup: MISS from webroute.test.auckland.ac.nz:3128
Via: 1.0 webroute-test1.its.auckland.ac.nz (squid)
Set-Cookie: JSESSIONID=40F3613FE4AC815A87CAC858EE38E112.jvm1; Path=/
Set-Cookie: Jesi_Template_Username=guest; Path=/
Set-Cookie: webroute=server.; path=/;
Vary: Accept-Encoding
Set-Cookie:
BIGipServer~Devtest~webroute-test-80_pool=310641930.20480.0000; path=/
*I have also included a Wireshark dump of the request after the
purge -- from the backend server *
GET /uoa/ HTTP/1.1
Host: www.test.auckland.ac.nz
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.8; rv:21.0)
Gecko/20100101 Firefox/21.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-gb,en;q=0.5
Accept-Encoding: gzip, deflate
Cookie: __utma=18951208.362154167.1367379679.1372030825.1372114069.43;
__utmz=18951208.1367379679.1.1.utmcsr=(direct)|utmccn=(direct)|utmcmd=(none);
__utma=265832586.969691744.1368401175.1372300411.1372385456.16;
__utmz=265832586.1372193146.14.3.utmcsr=iam.test.auckland.ac.nz|utmccn=(referral)|utmcmd=referral|utmcct=/profile/SAML2/Redirect/SSO;
__utmc=18951208;
BIGipServer~Devtest~webroute-test-443_pool=327419146.47873.0000;
webroute=server.; __utmc=265832586;
_shibsession_63656e7472616c2d74657374687474703a2f2f7777772e746573742e6175636b6c616e642e61632e6e7a=_298760cf8dc4f7d5ae29a97b3febbf5f;
JSESSIONID=F81F7E49D3DE061EE0358E77B9D604EC.jvm1;
Jesi_Template_Username=guest;
BIGipServer~Devtest~webroute-test-80_pool=327419146.20480.0000;
__utmb=265832586.4.10.1372385456; __atuvc=2%7C26
X-Forwarded-Protocol: http
X-Forwarded-User:
X-Forwarded-Host: www.test.auckland.ac.nz
X-Forwarded-Server: www.test.auckland.ac.nz
Via: 1.1 webroute-test1.its.auckland.ac.nz (squid)
Surrogate-Capability: unset-id="Surrogate/1.0 ESI/1.0"
Aha. You have a reverse-proxy. If you set the surrogate_id you can
configure the origin servers to send your Squid targeted Cache-Control
settings in a Surrogate-Control header.
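A minimal sketch of that, assuming the Squid 3.x directive syntax (the identifier is an example; choose your own):

```text
# squid.conf on the reverse proxy
httpd_accel_surrogate_id webroute-test1

# The origin can then send cache rules targeted at this proxy only,
# without affecting downstream browser caches:
HTTP/1.1 200 OK
Surrogate-Control: max-age=300;webroute-test1
Cache-Control: no-cache
```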
X-Forwarded-For: 130.216.168.2, 127.0.0.1
Cache-Control: max-age=259200
Connection: keep-alive
Cheers
Reid
On 27/06/2013, at 2:34 AM, Amos Jeffries <squid3@xxxxxxxxxxxxx> wrote:
On 25/06/2013 5:51 p.m., Reid Watson wrote:
Hi,
We seem to be having a strange issue where Squid fails to re-cache an
object once it has been purged - this issue does not occur with all
objects within the cache.
Quick Overview
- Four Apache Server with Squid Installed
- RedHat 5.7 i386
- squid-3.1.8-1.el5
Issue
- Admin sends a Purge request simultaneously to all four servers
Example
1372113766.459 0 10.10.10.10 TCP_MISS/200 280 PURGE http://www.auckland.ac.nz/uoa/ - NONE/- -
- Squid seems to fail to cache the object
- All requests are sent to the backend servers
Server 1
1372113766.459 0 10.10.10.10 TCP_MISS/200 280 PURGE http://www.auckland.ac.nz/uoa/ - NONE/- -
1372113766.779 23 10.10.10.3 TCP_MISS/504 5063 GET http://www.auckland.ac.nz/uoa/ - NONE/- text/html
What exactly is this 504 error saying?
1372114043.139 106378 127.0.0.1 TCP_MISS/200 9891 GET http://www.auckland.ac.nz/uoa/ - FIRST_UP_PARENT/vip-dr text/html
1372114043.303 10112 127.0.0.1 TCP_MISS/500 24731 GET http://www.auckland.ac.nz/uoa/ - ANY_PARENT/vip-dr text/html
1372114105.918 0 10.10.10.3 TCP_MISS/504 4591 GET http://www.auckland.ac.nz/uoa/ - NONE/- text/html
1372114106.414 57182 127.0.0.1 TCP_MISS/200 69995 GET http://www.auckland.ac.nz/uoa/ - FIRST_UP_PARENT/vip-dr text/html
1372114106.419 55771 127.0.0.1 TCP_MISS/200 45853 GET http://www.auckland.ac.nz/uoa/ - ANY_PARENT/vip-dr text/html
1372114106.436 17387 127.0.0.1 TCP_MISS/200 69967 GET http://www.auckland.ac.nz/uoa/ - FIRST_UP_PARENT/vip-dr text/html
1372114106.490 60516 127.0.0.1 TCP_MISS/200 70115 GET http://www.auckland.ac.nz/uoa/ - FIRST_UP_PARENT/vip-dr text/html
1372114106.707 58628 127.0.0.1 TCP_MISS/200 70115 GET http://www.auckland.ac.nz/uoa/ - ANY_PARENT/vip-dr text/html
And why are the 200 status when they do occur taking so very long to
generate?
- if you send a reload command (/sbin/service squid reload) the
object is stored within the cache instantly
What are the headers sent from the origin server to Squid?
Amos