Do any of the squid experts have any answers for this?

- Dave

On Thu, Sep 25, 2008 at 02:04:09PM -0500, Dave Dykstra wrote:
> I am running squid on over a thousand computers that are filtering data
> coming out of one of the particle collision detectors on the Large
> Hadron Collider.  There are two origin servers, and the application
> layer is designed to try the second server if the local squid returns a
> 5xx HTTP code (server error).  I just recently found that before squid
> 2.7 this could never happen because squid would just return stale data
> if the origin server was down (more precisely, I've been testing with
> the server up but the listener process down so it gets 'connection
> refused').  In squid 2.7STABLE4, if squid.conf has 'max_stale 0' or if
> the origin server sends 'Cache-Control: must-revalidate' then squid will
> send a 504 Gateway Timeout error.  Unfortunately, this timeout error
> does not get cached, and it gets sent upstream every time no matter what
> negative_ttl is set to.  These squids are configured in a hierarchy
> where each feeds 4 others so loading gets spread out, but the fact that
> the error is not cached at all means that if the primary origin server
> is down, the squids near the top of the hierarchy will get hammered with
> hundreds of requests for the server that's down before every request
> that succeeds from the second server.
>
> Any suggestions?  Is the fact that negative_ttl doesn't work with
> max_stale a bug, a missing feature, or an unfortunate interpretation of
> the HTTP 1.1 spec?
>
> By the way, I had hoped that 'Cache-Control: max-stale=0' would work the
> same as squid.conf's 'max_stale 0' but I never see an error come back
> when the origin server is down; it returns stale data instead.  I wonder
> if that's intentional, a bug, or a missing feature.  I also note that
> the HTTP 1.1 spec says that there MUST be a Warning 110 (Response is
> stale) header attached if stale data is returned and I'm not seeing
> those.
>
> - Dave
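
For anyone trying to picture the application-layer failover described in the quoted message, here is a rough sketch in Python. It is only an illustration under assumptions, not the actual client code: the function names, the example hostnames, and the proxy address http://localhost:3128 (squid's default http_port) are all hypothetical. It tries each origin in order through the local squid and moves on to the next one only when squid answers with a 5xx status.

```python
import urllib.error
import urllib.request


def fetch_via_proxy(url, proxy="http://localhost:3128", timeout=10):
    """Fetch url through the local squid and return (status, body).

    The proxy address is an assumption (squid's default port).
    """
    opener = urllib.request.build_opener(
        urllib.request.ProxyHandler({"http": proxy})
    )
    try:
        with opener.open(url, timeout=timeout) as resp:
            return resp.status, resp.read()
    except urllib.error.HTTPError as err:
        # urllib raises HTTPError for 4xx/5xx; err.code is the status.
        return err.code, err.read()


def fetch_with_failover(origin_urls, fetch=fetch_via_proxy):
    """Try each origin in order; fall through only on a 5xx answer."""
    last_status = None
    for url in origin_urls:
        status, body = fetch(url)
        if not 500 <= status <= 599:
            return url, status, body
        last_status = status
    raise RuntimeError(f"all origins failed, last status {last_status}")
```

With a client like this, every request pays for one failed attempt against the primary before reaching the second server, which is why an uncached 504 multiplies load toward the top of the squid hierarchy.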