Hello,

This Request For Comments proposes to remove a subtle Squid (mis)feature. If you happen to use the feature detailed below or know somebody who does, please speak up to protect it! If nobody defends this feature, we may remove it (to get rid of its bad side effects).

If you use cache_peers, you know that when a peer cannot be reached, Squid tries a few times (see cache_peer connect-fail-limit; default is 10 times) and then declares the peer "dead". For example:

> 2017/03/21 10:11:46.380| TCP connection to 127.0.0.4/80 failed
> 2017/03/21 10:11:46.380| TCP connection to 127.0.0.4/80 failed
...
> 2017/03/21 10:11:46.394| TCP connection to 127.0.0.4/80 failed
> 2017/03/21 10:11:46.394| Detected DEAD Parent: peer4

Normally, Squid does not forward HTTP transactions to dead peers because doing so is likely to cause timeouts and other problems. Squid has mechanisms that detect revived (i.e., no longer dead) peers without sending regular HTTP requests to peers considered dead. One such mechanism is TCP probes that check whether opening a TCP connection to the dead peer started to work.

There are several problems with dead peer handling, and we are working on fixing some of them, but this RFC focuses on one specific feature:

* Squid may forward an HTTP request to an otherwise eligible but dead peer that was idle[1] for some time[2].

This "use idle dead peer" feature was introduced as a small part of a much bigger bug #14 fix. AFAICT, the stated goal of the feature was speeding up failure recovery:

> revno: 6631
> timestamp: Sat 2004-04-03 21:07:38 +0000
> message:
> Bug #14: connection setup may look like syn flood attack if server is
> refusing connection
>
> If the contacted server refuses connection then the repeated attempts to
> connect to the server may look like a syn flood attack. This patch makes
> Squid behave a little friendler in such case and:
> ...
> * Cleanup of peer TCP probing to correct timeout management etc and to
> more promptly recover after a failure.

The "more promptly recover after a failure" phrase probably refers to the elimination of a single TCP connect(2) delay in peer usage or, to be more precise, the delay between the following two events:

* Start: An HTTP transaction initiates a background TCP connect probe (but is not sent to the dead idle peer).

* Finish: A successful result of the TCP probe initiated above (allowing future transactions to use the revived peer).

AFAICT, the feature justification/logic goes something like this: If there were no failures for a while then perhaps the peer is not dead anymore. Let's try using it for the current HTTP transaction and see what happens. If we are lucky, we will start using the peer sooner!

Since the lack of failures does not imply success, the feature may lead to regular HTTP client transactions being sent to a truly dead peer. Such transactions may experience delays (at best) or client disconnects/errors (at worst), depending on Squid and client configurations/state.

IMO, Squid should not risk regular HTTP transactions this way, and the actual benefits of such risks are slim in most environments. Thus, we should remove this feature and simply let existing TCP probes revive dead peers.

This feature removal does not increase the number of TCP probes.

This feature removal does not delay HTTP transactions as such (it only delays the time when Squid can resume peer usage).

Does anybody need this "use idle dead peers" feature?


[1] Here, "idle" essentially means a peer that Squid did not probe or otherwise contact for a while[2]. Peers become idle if they are not selected by peering algorithms as potential forwarding destinations (e.g., a dead round-robin parent with very low weight is likely to become idle even if its "heavy" cousins remain very busy).

[2] The inactivity time associated with becoming idle is calculated as ten times the peer_connect_timeout (or ten times cache_peer connect-timeout when set). It defaults to 10*30 seconds or 5 minutes.


Thank you,

Alex.

P.S. Please resist the temptation to discuss other peering problems on this thread, including other problems associated with detection and revival of dead peers. Let's focus on this specific feature proposed for removal.
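
For readers who prefer code to prose, here is a rough Python sketch of the idle threshold from [2] and of the forwarding decision this RFC proposes to remove. It is not Squid code: the Peer class, its fields, the function names, the use_idle_dead_peers switch, and the strict ">" comparison are all assumptions made for illustration. Only the "ten times the connect timeout" rule and the 30-second default come from the text above.

```python
from dataclasses import dataclass
from typing import Optional

DEFAULT_PEER_CONNECT_TIMEOUT = 30  # seconds; default implied by footnote [2]
IDLE_MULTIPLIER = 10               # "ten times" from footnote [2]


@dataclass
class Peer:
    """Hypothetical stand-in for a cache_peer; not a Squid data structure."""
    name: str
    dead: bool
    last_contact: float                    # when the peer was last probed/contacted
    connect_timeout: Optional[int] = None  # per-peer connect-timeout, if set


def idle_threshold(peer: Peer,
                   peer_connect_timeout: int = DEFAULT_PEER_CONNECT_TIMEOUT) -> int:
    """Seconds of inactivity after which a peer counts as idle."""
    timeout = (peer.connect_timeout
               if peer.connect_timeout is not None
               else peer_connect_timeout)
    return IDLE_MULTIPLIER * timeout


def may_forward(peer: Peer, now: float, use_idle_dead_peers: bool = True) -> bool:
    """Decide whether a regular HTTP request may be sent to this peer."""
    if not peer.dead:
        return True
    if not use_idle_dead_peers:
        # Proposed behavior: a dead peer is used again only after a
        # successful TCP probe marks it alive.
        return False
    # Current behavior (assumed comparison): a dead peer that has been
    # idle for long enough is tried with a regular HTTP request.
    return now - peer.last_contact > idle_threshold(peer)
```

With the defaults, idle_threshold() gives 300 seconds, so a dead peer last contacted at time 0 is skipped at time 100 but tried at time 301; with use_idle_dead_peers=False it is skipped in both cases and waits for a TCP probe to revive it.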