Sorry to top-post. Any more ideas with this?

- Grant

On Fri, Jul 5, 2013 at 12:33 AM, Grant <emailgrant@xxxxxxxxx> wrote:
>>> I updated to 3.3.6 and the system doesn't become totally unresponsive
>>> any more, although SSH latency is still pretty high when the client is
>>> trying to load a page. The browser still hangs after loading a few
>>> page elements on some websites (www.google.com/nexus/) but now if I
>>> let the page load for long enough, it does eventually load, but it can
>>> take 5 minutes or longer. Restarting squid sometimes makes it load a
>>> lot faster. It's possible that it hangs more often when loading
>>> elements from a different domain or subdomain (services.google.com,
>>> doubleclick.net) but that could be a coincidence. The client's and
>>> server's internet connections are strong.
>>
>> If that were related it might be DNS or TCP congestion (ECT, Window Scaling,
>> MTU) issues.
>
> I set the following on the squid server and client with no noticeable change:
>
> echo 0 > /proc/sys/net/ipv4/tcp_ecn
> echo 1 > /proc/sys/net/ipv4/ip_no_pmtu_disc
> echo 0 > /proc/sys/net/ipv4/tcp_window_scaling
>
>>>> The usual cause of this type of issue is forwarding loops, although
>>>> your low socket usage indicates that is probably not the problem.
>>>
>>> Yes, I'm the only user.
>>
>> It might be related to the 10ms select-loop delays in Squid. If you load the
>> proxy with a bunch more requests (say 20 in parallel constantly) does it
>> still happen?
>
> I opened 20 tabs in Firefox and the 3 tabs which started loading first
> loaded slightly more content than usual.
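
The parallel-load test suggested above could also be scripted instead of using browser tabs, which makes it easier to repeat. A minimal sketch, assuming curl is installed. The proxy address below is a placeholder (127.0.0.1 and squid's default port 3128, not taken from this thread), and the function names are made up for illustration:

```shell
#!/bin/sh
# Placeholders: adjust to the real proxy address and to a page that hangs.
PROXY="${PROXY:-http://127.0.0.1:3128}"
URL="${URL:-http://www.google.com/nexus/}"

# Fetch the URL once through the proxy and print the total time in seconds.
# -m 600 gives up after 10 minutes so a hung request cannot block forever.
fetch_one() {
    curl -s -o /dev/null -m 600 -x "$PROXY" -w '%{time_total}\n' "$URL"
}

# Start N fetches at the same time and wait for all of them to finish.
run_parallel() {
    n=$1
    i=0
    while [ "$i" -lt "$n" ]; do
        fetch_one &
        i=$((i + 1))
    done
    wait
}
```

Calling `run_parallel 20` repeatedly in a loop would approximate the "20 in parallel constantly" case, and the printed times show whether individual requests get faster under load.
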
>
>>> I have this on the squid system while the browser seems to hang so I
>>> think there is plenty of available physical RAM:
>>>
>>> # free
>>>              total       used       free     shared    buffers     cached
>>> Mem:       1985944    1638368     347576          0     838340     219332
>>> -/+ buffers/cache:     580696    1405248
>>> Swap:      1048572          0    1048572
>>>
>>> 2013/07/04 08:51:04.143 kid1| event.cc(250) checkEvents: checkEvents
>>> 2013/07/04 08:51:04.143 kid1| AsyncCall.cc(18) AsyncCall: The
>>> AsyncCall MaintainSwapSpace constructed, this=0xbb3310 [call546]
>>> 2013/07/04 08:51:04.143 kid1| AsyncCall.cc(85) ScheduleCall:
>>> event.cc(259) will call MaintainSwapSpace() [call546]
>>
>> Are these happening at regular but widely separated intervals, or lots
>> across the slowdown period?
>
> They happen about once per second during the slowdown.
>
>> This is the main cache garbage collection operation, so it should check
>> and purge some things every so often. If it happens unusually frequently
>> during the slow-down period it means the cache is overflowing and CPU is
>> busy purging contents until enough space is available for the new traffic.
>
> squid CPU usage is very low during the slowdown at .5% - 2.5% with
> most of the CPU idle. I get 18M cache size every time I check:
>
> # du -sh /var/cache/squid
> 18M     /var/cache/squid
>
> I've tried each of these with no noticeable change:
>
> cache_dir ufs /var/cache/squid 100 16 256
> cache_dir aufs /var/cache/squid 100 16 256
> cache_dir diskd /var/cache/squid 100 16 256
>
> - Grant
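
As a cross-check on the free output quoted above: the "-/+ buffers/cache" row is the "Mem:" row with buffers and cached moved from the used column to the free column. A quick sketch of that arithmetic with the numbers from that output (values in KiB as free reports them by default; the variable names are only for illustration):

```shell
#!/bin/sh
# Values copied from the "Mem:" row of the quoted free output (KiB).
used=1638368
free=347576
buffers=838340
cached=219332

# Memory held by applications, excluding buffers and page cache.
app_used=$((used - buffers - cached))
# Memory available once buffers and page cache are counted as reclaimable.
app_free=$((free + buffers + cached))

echo "used by applications: $app_used"
echo "effectively free:     $app_free"
```

The results, 580696 and 1405248, match the second row of the output, so roughly 1.3 GiB of about 1.9 GiB is reclaimable, which is consistent with RAM not being the bottleneck.
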