On 28/07/2014 10:35 a.m., Jason Haar wrote:
> Hi there
>
> I'm seeing a reliability issue with squid-3.1.10 through 3.4.6 accessing
> ipv6 sites.
>
> The root cause is that the ipv6 "Internet" is still a lot less reliable
> than the ipv4 "Internet". Lots of sites seem to have a "flappy"
> relationship with ipv6 which is not reflected in their ipv4 realm. This
> of course has nothing to do with squid directly - but impacts it
>
> So the issue I'm seeing is going to some websites that have both ipv6
> and ipv4 addresses, ipv6 "working" (ie no immediate "no route" type
> errors), but when squid tries to connect to the ipv6 address first, it
> hangs so long on "down" sites that it times out and never gets around to
> trying the working ipv4 address. It also doesn't appear to remember the
> issue, so that it continues to be down (ie the ipv6 address that is down
> for a website isn't cached to stop squid going there again [for a
> timeframe])
>
> Shouldn't squid just treat all ipv6 and ipv4 addresses assigned to a DNS
> name in a "round robin" fashion, keeping track of which ones are
> working? (I think it already does that with ipv4, I guess it isn't with
> ipv6?). As per Subject line, I suspect squid needs a ipv6 timeout that
> is shorter than the overall timeout, so that it will fallback on ipv4?

No. Round-robin IP connections from a proxy cause more problems than
they solve. HTTP multiplexing / persistent connections, DNS behaviours,
and the browser "happy eyeballs" algorithm are all involved in or
affected by the IP selection. A lot of applications use stateful
sessions on the assumption that a browser, once it has found an IP,
will stick with it, so the best thing for Squid to do is the same.

An IP is just an IP, regardless of version. Connectivity issues happen
just as often in IPv4 as in IPv6 (more so when "carrier grade" NAT gets
involved). The only special treatment IPv6 gets is being sorted first
by default ("dns_v4_first on" can change that), since 79% of networks
today apparently have IPv6 connectivity operating at least 1ms faster
than IPv4. It also avoids a bunch of potential issues with NAT and
other IPv4-only middleware.

Squid already does cache IP connectivity results. The problems are,
firstly, that whenever DNS supplies new or updated IP information the
connect tests have to be retried; connection issues are quite common
even in IPv4 and usually temporary. Secondly, the Squid timeouts
(below) are not set by default to values that make the sites you
noticed work very well.

There are several limits which you can set in Squid to speed up or slow
down the whole process (a combined sketch follows the list):

dns_timeout - how long Squid will wait for DNS results. The default
here is 30 seconds. If your DNS servers are highly reliable you can set
that lower.
 ** If the problem sites are taking a long time to respond to AAAA
queries this will greatly affect the connection time. Setting this down
closer to 10 sec can help for specific sites with fully broken DNS
servers, but harms others which merely have slow DNS servers. YMMV, but
I recommend checking the AAAA lookup speed for your specific problem
sites before changing this.

connect_timeout - how long Squid waits for the TCP SYN/SYN-ACK
handshake to occur. The default here is a full minute. What you set
this to depends on the Squid series:

 * In 3.1 and older this covered the DNS lookup plus a TCP handshake
for each IP address found by DNS. In these versions you *increase* the
timeout to get better IPv6 failover behaviour.

 * In 3.2 and later this covers only one TCP handshake. In these
versions you *decrease* it to improve performance. You can safely set
it to a few seconds, but be aware of your Squid machine's networking
stack behaviour regarding TCP protocol retries and timeouts to
determine which values will help or hurt [1].

forward_max_retries - how many times Squid will attempt a full connect
cycle (one connect_timeout each). The default in stable releases is 10;
the squid-3.5 release is bumping this up to 25. What you set this to
depends on the Squid series again, but only as a side effect of the
connect_timeout changes. In all versions you can get better
connectivity by increasing the value. For several of the top-ten
websites 25 is practically required just to get past the many IPv6
addresses they advertise and attempt any IPv4.

forward_timeout - how long in total Squid will attempt to connect to
the servers (via all methods). The default here is 4 minutes. You can
set it longer to give automated systems a better chance of connecting,
but most people do not have that type of patience, so 4 minutes before
getting the "cannot connect" error page is probably a bit long already.
You should not have to change this.
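Putting those together, here is a rough squid.conf sketch for a 3.2 or
later installation. The values other than the two recommended further
below are only illustrative assumptions; tune them to your own network:

  # DNS lookups: default is 30 seconds. Lower only if your resolvers
  # are reliable; sites with slow (but working) AAAA responses suffer.
  dns_timeout 15 seconds

  # 3.2+: time allowed for ONE TCP handshake (default 1 minute).
  # A few seconds lets Squid move off a dead IPv6 address to IPv4
  # quickly.
  connect_timeout 5 seconds

  # Full connect cycles to attempt (default 10; 25 in squid-3.5).
  # Needs to be large enough to get past long AAAA lists.
  forward_max_retries 25

  # Total connect budget across all attempts (default 4 minutes).
  # Best left at the default.
  #forward_timeout 4 minutes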
> i.e. right now I can't get to http://cs.co/ as their ipv6 address is
> down, but their ipv4 address is up and working - but squid won't try it
> because it hangs so long trying the ipv6 address (and on the flip-side,
> www.google.com is working fine over ipv6). To put it another way,
> squid-3.1.10 and newer work fine if the ipv6 address allocated to a site
> is up and responding, but cause issues if it is not

cs.co seems to have fast DNS.

In general, on current (Squid-3.2 or later) releases, I recommend:

  connect_timeout 5 seconds
  forward_max_retries 25

[1] Geoff Huston has a useful column on how TCP retries affect "happy
eyeballs" software and IPv6 failover at
<http://www.potaroo.net/ispcol/2012-05/notquite.html>

Amos