Search squid archive

Re: stale-if-error returning a 502

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 2024-02-09 08:53, Robin Carlisle wrote:

I am trying the config workaround approach.

Please keep us posted on your progress.

 Below is the config snippet I have added.    I made the assumption that for the /refresh_pattern, max-stale=NN /config, the NN is in minutes as per the rest of that config directive.

That assumption is natural but incorrect: Unlike the anonymous positional min and max parameters (that use minutes), refresh_pattern max-stale=NN uses seconds. Documentation improvements are welcome.

Said that, the workaround should still prevent the application of the broken default refresh_pattern max-stale=0 rule, so you should still see positive results for the first NN seconds of the response age.

Instead of specifying max-stale=NN, consider adding refresh_pattern rules recommended by squid.conf.documented (and included in squid.cond.default). Those rules do not have max-stale options at all, and, hence, Squid will use (explicit or default) max_stale directive instead.

HTH,

Alex.


I am testing this right now

# this should allow stale objects up to 1 year if allowed by Cache-Control repsonseheaders ...

# ... setting both options just in case

max_stale 525600 minutes

refresh_pattern . 0  20% 4320 max-stale=525600


Thanks again for your help


Robin




On Thu, 8 Feb 2024 at 17:42, Alex Rousskov <rousskov@xxxxxxxxxxxxxxxxxxxxxxx <mailto:rousskov@xxxxxxxxxxxxxxxxxxxxxxx>> wrote:

    Hi Robin,

          AFAICT from the logs you have privately shared and your
    squid.conf
    that you have posted earlier, your Squid overwrites
    stale-if-error=31536000 in the response with "refresh_pattern
    max-stale=0" default. That 0 value is wrong. The correct value
    should be
    taken from max_stale directive that defaults to 1 week, not zero:

          refresh_pattern
          ...
          max-stale=NN provide a maximum staleness factor. Squid won't
          serve objects more stale than this even if it failed to
          validate the object. Default: use the max_stale global limit.

    This wrong default is a Squid bug AFAICT. I posted an _untested_ fix as
    Squid PR 1664: https://github.com/squid-cache/squid/pull/1664
    <https://github.com/squid-cache/squid/pull/1664>

    If possible, please test the corresponding patch:
    https://github.com/squid-cache/squid/commit/571973589b5a46d458311f8b60dcb83032fd5cec.patch <https://github.com/squid-cache/squid/commit/571973589b5a46d458311f8b60dcb83032fd5cec.patch>

    AFAICT, you can also work around that bug by configuring an explicit
    refresh_pattern rule with an explicit max-stale option (see
    squid.conf.documented for examples). I have not tested that theory
    either.


    HTH,

    Alex.


    On 2024-02-07 13:45, Robin Carlisle wrote:
     > Hi,
     >
     > I have just started my enhanced logging journey and have a small
    snippet
     > below that might illuminate the issue ...
     >
     > /2024/02/07 17:06:39.212 kid1| 88,3| client_side_reply.cc(507)
     > handleIMSReply: origin replied with error 502, forwarding to
    client due
     > to fail_on_validation_err/
     >
     > A few lines below in the log it looks like squid sent :-
     >
     > /2024/02/07 17:06:39.212 kid1| 11,2| Stream.cc(280)
    sendStartOfMessage:
     > HTTP Client REPLY:
     > ---------
     > HTTP/1.1 502 Bad Gateway
     > Server: squid/5.7
     > Mime-Version: 1.0
     > Date: Wed, 07 Feb 2024 17:06:39 GMT
     > Content-Type: text/html;charset=utf-8
     > Content-Length: 3853
     > X-Squid-Error: ERR_READ_ERROR 0
     > Vary: Accept-Language
     > Content-Language: en
     > X-Cache: MISS from labs-maul-st-15
     > X-Cache-Lookup: HIT from labs-maul-st-15:3129
     > Via: 1.1 labs-maul-st-15 (squid/5.7)
     > Connection: close/
     >
     >
     > The rest of the logs are quite large and contain URLs I cannot put
     > here.   The logs were generated with debug_options to ALL,3.
     >
     > Any ideas?   Or should I generate more detailed logs and send them
     > privately?
     >
     > Thanks again,
     >
     > Robin
     >
     >
     >
     >
     > On Fri, 2 Feb 2024 at 11:20, Robin Carlisle
     > <robin.carlisle@xxxxxxxxxxxxxx
    <mailto:robin.carlisle@xxxxxxxxxxxxxx>
    <mailto:robin.carlisle@xxxxxxxxxxxxxx
    <mailto:robin.carlisle@xxxxxxxxxxxxxx>>>
     > wrote:
     >
     >     Hi, thanks for your reply.
     >
     >     I have been looking at :
     >
    https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Cache-Control <https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Cache-Control> <https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Cache-Control <https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Cache-Control>>
     >
     >     /The stale-if-error response directive indicates that the
    cache can
     >     reuse a stale response when an upstream server generates an
    error,
     >     or when the error is generated locally. Here, an error is
    considered
     >     any response with a status code of 500, 502, 503, or 504.
     >
     >     Cache-Control: max-age=604800, stale-if-error=86400
     >     In the example above, the response is fresh for 7 days (604800s).
     >     Afterwards, it becomes stale, but can be used for an extra 1 day
     >     (86400s) when an error is encountered.
     >
     >     After the stale-if-error period passes, the client will
    receive any
     >     error generated/
     >
     >     Given what you have said and what the above docs say - I am still
     >     confused as it looks like (in my test cases) the cached
    response can
     >     be used for 3600 secs (this works), after which the cached
    response
     >     can still be used for an additional 31536000 seconds on an error
     >     (this doesnt work).
     >
     >     I am going to dig into the error logging you suggested to see
    if I
     >     can make sense of that - and will send on if I can't.
     >
     >     Thanks v much for your help again,
     >
     >     Robin
     >
     >
     >
     >
     >
     >     On Thu, 1 Feb 2024 at 18:27, Alex Rousskov
     >     <rousskov@xxxxxxxxxxxxxxxxxxxxxxx
    <mailto:rousskov@xxxxxxxxxxxxxxxxxxxxxxx>
     >     <mailto:rousskov@xxxxxxxxxxxxxxxxxxxxxxx
    <mailto:rousskov@xxxxxxxxxxxxxxxxxxxxxxx>>> wrote:
     >
     >         On 2024-02-01 12:03, Robin Carlisle wrote:
     >          > Hi, I am having trouble with stale-if-error response.
     >
     >         If I am interpreting Squid code correctly, in primary use
    cases:
     >
     >         * without a Cache-Control:stale-if-error=X in the original
     >         response,
     >         Squid sends a stale object if revalidation results in a
    5xx error;
     >
     >         * with a Cache-Control:stale-if-error=X and object age at
    most
     >         X, Squid
     >         sends a stale object if revalidation results in a 5xx error;
     >
     >         * with a Cache-Control:stale-if-error=X and object age
    exceeding X,
     >         Squid forwards the 5xx error response if revalidation
    results in
     >         a 5xx
     >         error;
     >
     >         In other words, stale-if-error=X turns on a "fail on
    validation
     >         errors"
     >         behavior for stale objects older than X. It has no other
    effects.
     >
     >         In your test case, the stale objects are much younger than
     >         stale-if-error value (e.g., Age~=3601 vs.
    stale-if-error=31536000).
     >         Thus, stale-if-error should have no relevant effect.
     >
     >         Something else is probably preventing your Squid from serving
     >         the stale
     >         response when facing a 5xx error. I do not know what that
     >         something is.
     >
     >         I recommend sharing (privately if you need to protect
    sensitive
     >         info) a
     >         pointer to a compressed ALL,9 cache.log collected while
     >         reproducing the
     >         problem (using two transactions similar to the ones you
    have shared
     >         below -- a successful stale hit and a problematic one):
     >
    https://wiki.squid-cache.org/SquidFaq/BugReporting#debugging-a-single-transaction <https://wiki.squid-cache.org/SquidFaq/BugReporting#debugging-a-single-transaction> <https://wiki.squid-cache.org/SquidFaq/BugReporting#debugging-a-single-transaction <https://wiki.squid-cache.org/SquidFaq/BugReporting#debugging-a-single-transaction>>
     >
     >         Alternatively, you can try to study cache.log yourself after
     >         setting
     >         debug_options to ALL,3. Searching for "refresh" and
     >         "handleIMSReply" may
     >         yield enough clues.
     >
     >
     >         HTH,
     >
     >         Alex.
     >
     >
     >
     >
     >          > # /etc/squid/squid.conf :
     >          >
     >          > acl to_aws dstdomain .amazonaws.com
    <http://amazonaws.com> <http://amazonaws.com <http://amazonaws.com>>
     >         <http://amazonaws.com <http://amazonaws.com>
    <http://amazonaws.com <http://amazonaws.com>>>
     >          >
     >          > acl from_local src localhost
     >          >
     >          > http_access allow to_aws
     >          >
     >          > http_access allow from_local
     >          >
     >          > cache allow all
     >          >
     >          > cache_dir ufs /var/cache/squid 1024 16 256
     >          >
     >          > http_port 3129 ssl-bump cert=/etc/squid/maul.pem
     >          > generate-host-certificates=on
    dynamic_cert_mem_cache_size=4MB
     >          >
     >          > sslcrtd_program /usr/lib/squid/security_file_certgen -s
     >          > /var/lib/squid/ssl_db -M 4MB
     >          >
     >          > acl step1 at_step SslBump1
     >          >
     >          > ssl_bump bump step1
     >          >
     >          > ssl_bump bump all
     >          >
     >          > sslproxy_cert_error deny all
     >          >
     >          > cache_store_log stdio:/var/log/squid/store.log
     >          >
     >          > logfile_rotate 0
     >          >
     >          > shutdown_lifetime 3 seconds
     >          >
     >          >
     >          > # /usr/bin/proxy-test :
     >          >
     >          > #!/bin/bash
     >          >
     >          > curl --proxy http://localhost:3129
    <http://localhost:3129> <http://localhost:3129 <http://localhost:3129>>
     >         <http://localhost:3129 <http://localhost:3129>
    <http://localhost:3129 <http://localhost:3129>>> \
     >          >
     >          >    --cacert /etc/squid/stuff.pem \
     >          >
     >          >    -v
    "https://stuff.amazonaws.com/api/v1/stuff/stuff.json
    <https://stuff.amazonaws.com/api/v1/stuff/stuff.json>
     >         <https://stuff.amazonaws.com/api/v1/stuff/stuff.json
    <https://stuff.amazonaws.com/api/v1/stuff/stuff.json>>
     >          > <https://stuff.amazonaws.com/api/v1/stuff/stuff.json
    <https://stuff.amazonaws.com/api/v1/stuff/stuff.json>
     >         <https://stuff.amazonaws.com/api/v1/stuff/stuff.json
    <https://stuff.amazonaws.com/api/v1/stuff/stuff.json>>>" \
     >          >
     >          >    -H "Authorization: token MYTOKEN" \
     >          >
     >          >    -H "Content-Type: application/json" \
     >          >
     >          >    --output "/tmp/stuff.json"
     >          >
     >          >
     >          >
     >          > Tests  ..........
     >          >
     >          >
     >          > At this point in time the network cable is
    unattached.  Squid
     >         returns
     >          > the cached object it got when the network was online
    earlier.
     >         The Age of
     >          > this object is just still under the max_age of 3600.
     >         Previously I
     >          > was using offline_mode but I found that it did not try to
     >         revalidate
     >          > from the origin after the object expired (defined via
    max-age
     >         response).
     >          >    My understanding is that stale-if-error should work
    under my
     >          > circumstances.
     >          >
     >          >
     >          > # /var/log/squid/access.log
     >          >
     >          > 1706799404.440      6 127.0.0.1 NONE_NONE/200 0 CONNECT
     >          > stuff.amazonaws.com:443
    <http://stuff.amazonaws.com:443> <http://stuff.amazonaws.com:443
    <http://stuff.amazonaws.com:443>>
     >         <http://stuff.amazonaws.com:443
    <http://stuff.amazonaws.com:443>
     >         <http://stuff.amazonaws.com:443
    <http://stuff.amazonaws.com:443>>> - HIER_NONE/- -
     >          >
     >          > 1706799404.440      0 127.0.0.1 TCP_MEM_HIT/200 20726 GET
     >          > https://stuff.amazonaws.com/stuff.json
    <https://stuff.amazonaws.com/stuff.json>
     >         <https://stuff.amazonaws.com/stuff.json
    <https://stuff.amazonaws.com/stuff.json>>
     >          > <https://stuff.amazonaws.com/stuff.json
    <https://stuff.amazonaws.com/stuff.json>
     >         <https://stuff.amazonaws.com/stuff.json
    <https://stuff.amazonaws.com/stuff.json>>> - HIER_NONE/-
     >         application/json
     >          >
     >          >
     >          > # extract from /usr/bin/proxy-test
     >          >
     >          > < HTTP/1.1 200 OK
     >          >
     >          > < Date: Thu, 01 Feb 2024 13:57:11 GMT
     >          >
     >          > < Content-Type: application/json
     >          >
     >          > < Content-Length: 20134
     >          >
     >          > < x-amzn-RequestId: 3a2d3b26-df73-4b30-88cb-1a9268fa0df2
     >          >
     >          > < Last-Modified: 2024-02-01T13:00:45.000Z
     >          >
     >          > < Access-Control-Allow-Origin: *
     >          >
     >          > < x-amz-apigw-id: SdZwpG7qiYcERUQ=
     >          >
     >          > < Cache-Control: public, max-age=3600,
    stale-if-error=31536000
     >          >
     >          > < ETag: "cec102b43372840737ab773c2e77858b"
     >          >
     >          > < X-Amzn-Trace-Id:
    Root=1-65bba337-292be751134161b03555cdd6
     >          >
     >          > < Age: 3573
     >          >
     >          > < X-Cache: HIT from labs-maul-st-31
     >          >
     >          > < X-Cache-Lookup: HIT from labs-maul-st-31:3129
     >          >
     >          > < Via: 1.1 labs-maul-st-31 (squid/5.7)
     >          >
     >          > < Connection: keep-alive
     >          >
     >          >
     >          >
     >          >
     >          > Below .. the curl script executes again.  The Age has gone
     >         over the
     >          > max-age so squid attempted to refresh from the
    origin.  The
     >         machine is
     >          > still offline so the refresh failed.   I expected that the
     >          > stale-if-error response would instruct squid to return the
     >         cached object
     >          > as a 200.
     >          >
     >          >
     >          > # /var/log/squid/access.log
     >          >
     >          > 1706799434.464      5 127.0.0.1 NONE_NONE/200 0 CONNECT
     >          > stuff.amazonaws.com:443
    <http://stuff.amazonaws.com:443> <http://stuff.amazonaws.com:443
    <http://stuff.amazonaws.com:443>>
     >         <http://stuff.amazonaws.com:443
    <http://stuff.amazonaws.com:443>
     >         <http://stuff.amazonaws.com:443
    <http://stuff.amazonaws.com:443>>> - HIER_NONE/- -
     >          >
     >          > 1706799434.464      0 127.0.0.1
    TCP_REFRESH_FAIL_ERR/502 4235
     >         GET
     >          > https://stuff.amazonaws.com/stuff.json
    <https://stuff.amazonaws.com/stuff.json>
     >         <https://stuff.amazonaws.com/stuff.json
    <https://stuff.amazonaws.com/stuff.json>>
     >          > <https://stuff.amazonaws.com/stuff.json
    <https://stuff.amazonaws.com/stuff.json>
     >         <https://stuff.amazonaws.com/stuff.json
    <https://stuff.amazonaws.com/stuff.json>>> - HIER_NONE/- text/html
     >          >
     >          >
     >          > # extract from /usr/bin/proxy-test
     >          >
     >          > < HTTP/1.1 502 Bad Gateway
     >          >
     >          > < Server: squid/5.7
     >          >
     >          > < Mime-Version: 1.0
     >          >
     >          > < Date: Thu, 01 Feb 2024 14:57:14 GMT
     >          >
     >          > < Content-Type: text/html;charset=utf-8
     >          >
     >          > < Content-Length: 3853
     >          >
     >          > < X-Squid-Error: ERR_READ_ERROR 0
     >          >
     >          > < Vary: Accept-Language
     >          >
     >          > < Content-Language: en
     >          >
     >          > < X-Cache: MISS from labs-maul-st-31
     >          >
     >          > < X-Cache-Lookup: HIT from labs-maul-st-31:3129
     >          >
     >          > < Via: 1.1 labs-maul-st-31 (squid/5.7)
     >          >
     >          > < Connection: close
     >          >
     >          >
     >          >
     >          > Hope someone can help me with this.  All the best,
     >          >
     >          >
     >          > Robin Carlisle
_______________________________________________
squid-users mailing list
squid-users@xxxxxxxxxxxxxxxxxxxxx
https://lists.squid-cache.org/listinfo/squid-users




[Index of Archives]     [Linux Audio Users]     [Samba]     [Big List of Linux Books]     [Linux USB]     [Yosemite News]

  Powered by Linux