Hi Robin,
AFAICT from the logs you have privately shared and your squid.conf
that you have posted earlier, your Squid overwrites
stale-if-error=31536000 in the response with "refresh_pattern
max-stale=0" default. That 0 value is wrong. The correct value should be
taken from max_stale directive that defaults to 1 week, not zero:
refresh_pattern
...
max-stale=NN provide a maximum staleness factor. Squid won't
serve objects more stale than this even if it failed to
validate the object. Default: use the max_stale global limit.
This wrong default is a Squid bug AFAICT. I posted an _untested_ fix as
Squid PR 1664: https://github.com/squid-cache/squid/pull/1664
If possible, please test the corresponding patch:
https://github.com/squid-cache/squid/commit/571973589b5a46d458311f8b60dcb83032fd5cec.patch
AFAICT, you can also work around that bug by configuring an explicit
refresh_pattern rule with an explicit max-stale option (see
squid.conf.documented for examples). I have not tested that theory either.
HTH,
Alex.
On 2024-02-07 13:45, Robin Carlisle wrote:
Hi,
I have just started my enhanced logging journey and have a small snippet
below that might illuminate the issue ...
/2024/02/07 17:06:39.212 kid1| 88,3| client_side_reply.cc(507)
handleIMSReply: origin replied with error 502, forwarding to client due
to fail_on_validation_err/
A few lines below in the log it looks like squid sent :-
/2024/02/07 17:06:39.212 kid1| 11,2| Stream.cc(280) sendStartOfMessage:
HTTP Client REPLY:
---------
HTTP/1.1 502 Bad Gateway
Server: squid/5.7
Mime-Version: 1.0
Date: Wed, 07 Feb 2024 17:06:39 GMT
Content-Type: text/html;charset=utf-8
Content-Length: 3853
X-Squid-Error: ERR_READ_ERROR 0
Vary: Accept-Language
Content-Language: en
X-Cache: MISS from labs-maul-st-15
X-Cache-Lookup: HIT from labs-maul-st-15:3129
Via: 1.1 labs-maul-st-15 (squid/5.7)
Connection: close/
The rest of the logs are quite large and contain URLs I cannot put
here. The logs were generated with debug_options to ALL,3.
Any ideas? Or should I generate more detailed logs and send them
privately?
Thanks again,
Robin
On Fri, 2 Feb 2024 at 11:20, Robin Carlisle
<robin.carlisle@xxxxxxxxxxxxxx <mailto:robin.carlisle@xxxxxxxxxxxxxx>>
wrote:
Hi, thanks for your reply.
I have been looking at :
https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Cache-Control <https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Cache-Control>
/The stale-if-error response directive indicates that the cache can
reuse a stale response when an upstream server generates an error,
or when the error is generated locally. Here, an error is considered
any response with a status code of 500, 502, 503, or 504.
Cache-Control: max-age=604800, stale-if-error=86400
In the example above, the response is fresh for 7 days (604800s).
Afterwards, it becomes stale, but can be used for an extra 1 day
(86400s) when an error is encountered.
After the stale-if-error period passes, the client will receive any
error generated/
Given what you have said and what the above docs say - I am still
confused as it looks like (in my test cases) the cached response can
be used for 3600 secs (this works), after which the cached response
can still be used for an additional 31536000 seconds on an error
(this doesnt work).
I am going to dig into the error logging you suggested to see if I
can make sense of that - and will send on if I can't.
Thanks v much for your help again,
Robin
On Thu, 1 Feb 2024 at 18:27, Alex Rousskov
<rousskov@xxxxxxxxxxxxxxxxxxxxxxx
<mailto:rousskov@xxxxxxxxxxxxxxxxxxxxxxx>> wrote:
On 2024-02-01 12:03, Robin Carlisle wrote:
> Hi, I am having trouble with stale-if-error response.
If I am interpreting Squid code correctly, in primary use cases:
* without a Cache-Control:stale-if-error=X in the original
response,
Squid sends a stale object if revalidation results in a 5xx error;
* with a Cache-Control:stale-if-error=X and object age at most
X, Squid
sends a stale object if revalidation results in a 5xx error;
* with a Cache-Control:stale-if-error=X and object age exceeding X,
Squid forwards the 5xx error response if revalidation results in
a 5xx
error;
In other words, stale-if-error=X turns on a "fail on validation
errors"
behavior for stale objects older than X. It has no other effects.
In your test case, the stale objects are much younger than
stale-if-error value (e.g., Age~=3601 vs. stale-if-error=31536000).
Thus, stale-if-error should have no relevant effect.
Something else is probably preventing your Squid from serving
the stale
response when facing a 5xx error. I do not know what that
something is.
I recommend sharing (privately if you need to protect sensitive
info) a
pointer to a compressed ALL,9 cache.log collected while
reproducing the
problem (using two transactions similar to the ones you have shared
below -- a successful stale hit and a problematic one):
https://wiki.squid-cache.org/SquidFaq/BugReporting#debugging-a-single-transaction <https://wiki.squid-cache.org/SquidFaq/BugReporting#debugging-a-single-transaction>
Alternatively, you can try to study cache.log yourself after
setting
debug_options to ALL,3. Searching for "refresh" and
"handleIMSReply" may
yield enough clues.
HTH,
Alex.
> # /etc/squid/squid.conf :
>
> acl to_aws dstdomain .amazonaws.com <http://amazonaws.com>
<http://amazonaws.com <http://amazonaws.com>>
>
> acl from_local src localhost
>
> http_access allow to_aws
>
> http_access allow from_local
>
> cache allow all
>
> cache_dir ufs /var/cache/squid 1024 16 256
>
> http_port 3129 ssl-bump cert=/etc/squid/maul.pem
> generate-host-certificates=on dynamic_cert_mem_cache_size=4MB
>
> sslcrtd_program /usr/lib/squid/security_file_certgen -s
> /var/lib/squid/ssl_db -M 4MB
>
> acl step1 at_step SslBump1
>
> ssl_bump bump step1
>
> ssl_bump bump all
>
> sslproxy_cert_error deny all
>
> cache_store_log stdio:/var/log/squid/store.log
>
> logfile_rotate 0
>
> shutdown_lifetime 3 seconds
>
>
> # /usr/bin/proxy-test :
>
> #!/bin/bash
>
> curl --proxy http://localhost:3129 <http://localhost:3129>
<http://localhost:3129 <http://localhost:3129>> \
>
> --cacert /etc/squid/stuff.pem \
>
> -v "https://stuff.amazonaws.com/api/v1/stuff/stuff.json
<https://stuff.amazonaws.com/api/v1/stuff/stuff.json>
> <https://stuff.amazonaws.com/api/v1/stuff/stuff.json
<https://stuff.amazonaws.com/api/v1/stuff/stuff.json>>" \
>
> -H "Authorization: token MYTOKEN" \
>
> -H "Content-Type: application/json" \
>
> --output "/tmp/stuff.json"
>
>
>
> Tests ..........
>
>
> At this point in time the network cable is unattached. Squid
returns
> the cached object it got when the network was online earlier.
The Age of
> this object is just still under the max_age of 3600.
Previously I
> was using offline_mode but I found that it did not try to
revalidate
> from the origin after the object expired (defined via max-age
response).
> My understanding is that stale-if-error should work under my
> circumstances.
>
>
> # /var/log/squid/access.log
>
> 1706799404.440 6 127.0.0.1 NONE_NONE/200 0 CONNECT
> stuff.amazonaws.com:443 <http://stuff.amazonaws.com:443>
<http://stuff.amazonaws.com:443
<http://stuff.amazonaws.com:443>> - HIER_NONE/- -
>
> 1706799404.440 0 127.0.0.1 TCP_MEM_HIT/200 20726 GET
> https://stuff.amazonaws.com/stuff.json
<https://stuff.amazonaws.com/stuff.json>
> <https://stuff.amazonaws.com/stuff.json
<https://stuff.amazonaws.com/stuff.json>> - HIER_NONE/-
application/json
>
>
> # extract from /usr/bin/proxy-test
>
> < HTTP/1.1 200 OK
>
> < Date: Thu, 01 Feb 2024 13:57:11 GMT
>
> < Content-Type: application/json
>
> < Content-Length: 20134
>
> < x-amzn-RequestId: 3a2d3b26-df73-4b30-88cb-1a9268fa0df2
>
> < Last-Modified: 2024-02-01T13:00:45.000Z
>
> < Access-Control-Allow-Origin: *
>
> < x-amz-apigw-id: SdZwpG7qiYcERUQ=
>
> < Cache-Control: public, max-age=3600, stale-if-error=31536000
>
> < ETag: "cec102b43372840737ab773c2e77858b"
>
> < X-Amzn-Trace-Id: Root=1-65bba337-292be751134161b03555cdd6
>
> < Age: 3573
>
> < X-Cache: HIT from labs-maul-st-31
>
> < X-Cache-Lookup: HIT from labs-maul-st-31:3129
>
> < Via: 1.1 labs-maul-st-31 (squid/5.7)
>
> < Connection: keep-alive
>
>
>
>
> Below .. the curl script executes again. The Age has gone
over the
> max-age so squid attempted to refresh from the origin. The
machine is
> still offline so the refresh failed. I expected that the
> stale-if-error response would instruct squid to return the
cached object
> as a 200.
>
>
> # /var/log/squid/access.log
>
> 1706799434.464 5 127.0.0.1 NONE_NONE/200 0 CONNECT
> stuff.amazonaws.com:443 <http://stuff.amazonaws.com:443>
<http://stuff.amazonaws.com:443
<http://stuff.amazonaws.com:443>> - HIER_NONE/- -
>
> 1706799434.464 0 127.0.0.1 TCP_REFRESH_FAIL_ERR/502 4235
GET
> https://stuff.amazonaws.com/stuff.json
<https://stuff.amazonaws.com/stuff.json>
> <https://stuff.amazonaws.com/stuff.json
<https://stuff.amazonaws.com/stuff.json>> - HIER_NONE/- text/html
>
>
> # extract from /usr/bin/proxy-test
>
> < HTTP/1.1 502 Bad Gateway
>
> < Server: squid/5.7
>
> < Mime-Version: 1.0
>
> < Date: Thu, 01 Feb 2024 14:57:14 GMT
>
> < Content-Type: text/html;charset=utf-8
>
> < Content-Length: 3853
>
> < X-Squid-Error: ERR_READ_ERROR 0
>
> < Vary: Accept-Language
>
> < Content-Language: en
>
> < X-Cache: MISS from labs-maul-st-31
>
> < X-Cache-Lookup: HIT from labs-maul-st-31:3129
>
> < Via: 1.1 labs-maul-st-31 (squid/5.7)
>
> < Connection: close
>
>
>
> Hope someone can help me with this. All the best,
>
>
> Robin Carlisle
>
>
>
> _______________________________________________
> squid-users mailing list
> squid-users@xxxxxxxxxxxxxxxxxxxxx
<mailto:squid-users@xxxxxxxxxxxxxxxxxxxxx>
> https://lists.squid-cache.org/listinfo/squid-users
<https://lists.squid-cache.org/listinfo/squid-users>
_______________________________________________
squid-users mailing list
squid-users@xxxxxxxxxxxxxxxxxxxxx
<mailto:squid-users@xxxxxxxxxxxxxxxxxxxxx>
https://lists.squid-cache.org/listinfo/squid-users
<https://lists.squid-cache.org/listinfo/squid-users>
_______________________________________________
squid-users mailing list
squid-users@xxxxxxxxxxxxxxxxxxxxx
https://lists.squid-cache.org/listinfo/squid-users