> On 2014-02-21 06:10, Simon Beale wrote:
>> I've got a problem at the moment with our general squid proxies where
>> occasionally requests take a long time that shouldn't do. (i.e. 5+ seconds
>> or timeout, instead of milliseconds).
>>
>> This is most common on our proxies doing 100 reqs/sec, but happens
>> overnight too when they're running at 10 reqs/sec. I've got this happening
>> with both v3.4.2 and also with a box I've downgraded back to v3.1.10. For
>> v3.4.2, it's happening in both multiple worker and single worker modes.

As a follow up, we've narrowed this down to the internal DNS resolver. When
I deploy a 3.4.2 (which is what we're running elsewhere) that's been
recompiled with "--disable-internal-dns", the problem completely goes away.

> What sort of CPU loading do you have at ~100req/sec?
> is that at or near your local installations req/sec capacity?

For the box running with a single worker, it consumes 50% of one core at
100 req/sec. For the boxes running with 9 workers, each worker consumes 5%
of a core at the same rate.

>> The test is not reproducible, sadly, but I've got a cronjob running on
>> localhost on these boxes testing access times to various URLs covering:
>> HTTPS, non-HTTPS static content, using IP not hostname over both HTTP and
>> HTTPS, and a URL on the same vlan as the proxies. All of these test cases
>> have it happen occasionally, but not repeatedly/reliably.
>
> Some ideas:
> * DNS lookup delays ?

Yeah, when I enabled the dns resolution time logging in squid, that became
apparent. Quite why the internal dns resolver shows this, but the external
one doesn't, I don't know.

The DNS server query logs show both DNS servers in /etc/resolv.conf getting
the request in turn and answering it (though 5 seconds apart).

It's happening for us in multiple datacentres, so is unlikely to be port
errors or internal packet loss. It's only(/mostly?) apparent on our squid
servers that do desktop proxying, so do lots of DNS requests to everywhere;
the squid servers that handle just our datacentre servers don't show this
problem, but only really go to about 40 hosts in total.

Thanks
Simon