> On 30 Aug 2019, at 16.40, Alex Rousskov <rousskov@xxxxxxxxxxxxxxxxxxxxxxx> wrote:
>
> On 8/30/19 8:16 AM, Ilari Laitinen wrote:
>
>> I suspect Squid might be waiting for local TCP ports from the kernel
>> (or something related).
>
> IIRC, ephemeral source port allocator is instantaneous -- Squid either
> gets a port or a port allocation error, without waiting. When we
> overload the server with high-performance tests (without an explicit
> port manager), we see port allocation errors rather than stalled tests.
> However, perhaps that is not true in your OS/environment.

Ah yes, of course. I actually saw this error first-hand when setting up
the test environment.

>> Right now, there are four different IP addresses returned for the
>> target cloud service. For practical purposes, they are returned in a
>> random order. The traffic would ideally be spread over all of them.
>> Unfortunately it is evident both from the debug log and from the TCP
>> dump that Squid is using only one of the addresses at a time. The
>> amount of connections in the TIME_WAIT state for that single IP
>> address gets very close to the maximum defined by the
>> net.ipv4.ip_local_port_range sysctl. After a while (a minute or so in
>> the recording) this address changes presumably in response to a new
>> DNS query result.
>
> In theory, Squid should round-robin across all destination IP addresses
> for a single host name. If your Squid v3 does not, it is probably a
> Squid bug that can be fixed [by upgrading].
>
> Said that, IIRC, the notion of "round robin" is rather vague in Squid
> because there are several places where an IP may be requested for the
> same host name inside the same transaction. I would not be surprised if
> that low-level round-robin behavior results in the same IP being used
> for most transactions in some environments (until an error or a new DNS
> query reshuffles the IPs). Debugging logs may expose this problem.

I looked into this further.
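
For anyone who wants to check the per-address TIME_WAIT count discussed above without a full packet capture, here is a minimal Python sketch. It assumes Linux, IPv4, a little-endian host, and the /proc/net/tcp text layout. The function names are made up for illustration, and this is not how the observations in this thread were collected.

```python
import socket
import struct
from collections import Counter

TIME_WAIT = "06"  # TCP state code as printed in /proc/net/tcp


def decode_ipv4(hex_addr):
    """Decode an address such as '0100007F' (assumes a little-endian host)."""
    return socket.inet_ntoa(struct.pack("<I", int(hex_addr, 16)))


def time_wait_by_remote(lines):
    """Count TIME_WAIT sockets per remote IPv4 address.

    `lines` is any iterable of text lines in /proc/net/tcp format.
    """
    counts = Counter()
    for line in lines:
        fields = line.split()
        # Skip the header row and anything too short to hold a state field.
        if len(fields) < 4 or fields[0] == "sl":
            continue
        # fields[2] is the remote endpoint as HEXIP:HEXPORT, fields[3] the state.
        remote_ip_hex = fields[2].partition(":")[0]
        if fields[3] == TIME_WAIT:
            counts[decode_ipv4(remote_ip_hex)] += 1
    return counts
```

Passing an open handle on /proc/net/tcp to time_wait_by_remote() and comparing the largest count with the width of net.ipv4.ip_local_port_range should show whether a single remote address is eating most of the ephemeral ports.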
All our tcp dumps so far show the same: Squid uses (almost always)
exactly one remote address at a time. The increasing delays start right
after Squid has switched to a new remote IP and last precisely until
another switch happens (typically the next one). The problem does not
occur every time and is not limited to a single target IP.

Now that I know that this is not expected and is possibly related to a
bug, I’ll look into upgrading Squid from the platform default.

>> One possible workaround that I can think of is setting a short
>> positive_dns_ttl, but this doesn’t fully guarantee an even
>> distribution, now does it?
>
> No, it does not. Moreover, Squid v3 had some TTL handling bugs that were
> fixed (in v4 and later code) by the Happy Eyeballs project. Taking all
> the known problems into the account, it is difficult for me to predict
> the effect of changing TTLs. Said that, it does not hurt to try! Maybe
> you will be lucky, and a simple configuration change will remove the
> cause of increasing transaction delays.

Thank you very much for your informed and timely replies! I’ll report
our results here.

Best,
-- 
Ilari Laitinen
_______________________________________________
squid-users mailing list
squid-users@xxxxxxxxxxxxxxxxxxxxx
http://lists.squid-cache.org/listinfo/squid-users