What does the following give:

uname -a

While it's being slow, run the following to get some stats:

vmstat 1 11 ;# Will run for 11 seconds
iostat -dx 1 11 ;# Will run for 11 seconds, install sysstat if not found

My first guess is memory swapping, but it could be I/O. The above should
help narrow it down.

> -----Original Message-----
> From: Felipe W Damasio [mailto:felipewd@xxxxxxxxx]
> Sent: Monday, January 25, 2010 9:37 PM
> To: squid-users@xxxxxxxxxxxxxxx
> Subject: Squid performance issues
>
> Hi all,
>
> Sorry for the long email.
>
> I'm using squid on a 300Mbps ISP with about 10,000 users.
>
> I have an 8-core I7 Intel processor-machine, with 8GB of RAM and 500
> of HD for the cache. (exclusive Sata HD with xfs). Using aufs as
> storeio.
>
> I'm caching mostly multimedia files (youtube and such).
>
> Squid usually eats around 50-70% of one core.
>
> But always around midnight (when a lot of users browse the internet),
> my squid becomes very slow....I mean, a page that usually takes 0.04s
> to load takes 23 seconds to load.
>
> My best guess is that the volume of traffic is making squid slow.
>
> I'm using a 2.6.29.6 vanilla kernel with tproxy enabled for squid.
> And I'm using these /proc configurations:
>
> echo 0 > /proc/sys/net/ipv4/tcp_ecn
> echo 1 > /proc/sys/net/ipv4/tcp_low_latency
> echo 100000 > /proc/sys/net/core/netdev_max_backlog
> echo 409600 > /proc/sys/net/ipv4/tcp_max_syn_backlog
> echo 7 > /proc/sys/net/ipv4/tcp_fin_timeout
> echo 15 > /proc/sys/net/ipv4/tcp_keepalive_intvl
> echo 3 > /proc/sys/net/ipv4/tcp_keepalive_probes
> echo 65536 > /proc/sys/vm/min_free_kbytes
> echo "262144 1024000 4194304" > /proc/sys/net/ipv4/tcp_rmem
> echo "262144 1024000 4194304" > /proc/sys/net/ipv4/tcp_wmem
> echo "1024000" > /proc/sys/net/core/rmem_max
> echo "1024000" > /proc/sys/net/core/wmem_max
> echo "512000" > /proc/sys/net/core/rmem_default
> echo "512000" > /proc/sys/net/core/wmem_default
> echo "524288" > /proc/sys/net/ipv4/netfilter/ip_conntrack_max
> echo "3" > /proc/sys/net/ipv4/tcp_synack_retries
>
> The machine is in bridge-mode.
>
> I wrote a little script that prints:
>
> - The date;
> - The "/usr/bin/time squidclient http://www.amazon.com";
> - The number of ESTABLISHED connections (through netstat -an);
> - The number of TIME_WAIT connections;
> - The total number of netstat connections;
> - The route cache (ip route list cache);
> - The number of clients currently connected in squid (through
>   mgr:info);
> - The number of free memory in MB (free -m);
> - The % used of the squid-running core;
> - The average number of time to respond a request / sec (mgr:info
>   also) - 5 minutes avg;
> - The average number of http requests / sec (5 minutes avg) - mgr:info
>   as well.
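
The connection-count items in the list above could be gathered roughly as follows. This is only a sketch, not the poster's actual script: the function name count_states is made up for illustration, it assumes the Linux net-tools layout where the state is the last field of each tcp line, and its "total" counts TCP lines only, which may differ from however the original script computed its total.

```shell
#!/bin/sh
# Hypothetical helper (not from the original script): reads
# `netstat -an`-style text on stdin and prints
#   ESTABLISHED ; TIME_WAIT ; total TCP lines
# Assumption: the connection state is the last field of each tcp line.
count_states() {
    awk '
        $1 ~ /^tcp/ {
            total++
            if ($NF == "ESTABLISHED") est++
            if ($NF == "TIME_WAIT")   tw++
        }
        # %d prints 0 for counters that were never incremented
        END { printf "%d ; %d ; %d\n", est, tw, total }
    '
}

# On a live system this would be fed from netstat, for example:
#   netstat -an | count_states
```

Doing the counting in one awk pass avoids running netstat several times per sample, which matters a little when the table holds tens of thousands of entries.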
>
> On any other hour, I have something like:
>
> 2010-01-25 18:48:19 ; 0.04 ; 19383 ; 9902 ; 29865 ; 96972 ; 4677 ; 131 ; 59 ; 0.24524 ; 476.871718
> 2010-01-25 18:53:29 ; 0.04 ; 18865 ; 8593 ; 30123 ; 179570 ; 4679 ; 148 ; 62 ; 0.22004 ; 504.424207
> 2010-01-25 18:58:38 ; 0.04 ; 18377 ; 9056 ; 29283 ; 99038 ; 4680 ; 174 ; 61 ; 0.22004 ; 466.659336
> 2010-01-25 19:03:49 ; 0.04 ; 18877 ; 9133 ; 28327 ; 181196 ; 4673 ; 171 ; 57 ; 0.24524 ; 483.558436
>
> So, it takes around 0.04s to get http://www.amazon.com.
>
> 2010-01-24 23:46:50 ; 2.53 ; 22723 ; 9861 ; 35012 ; 64752 ; 4306 ; 166 ; 70 ; 0.22004 ; 566.364274
> 2010-01-24 23:52:04 ; 3.74 ; 21173 ; 10256 ; 33242 ; 167594 ; 4309 ; 169 ; 68 ; 0.20843 ; 537.758601
> 2010-01-24 23:57:20 ; 0.08 ; 18691 ; 9050 ; 29590 ; 65496 ; 4312 ; 138 ; 71 ; 0.20843 ; 525.119006
> 2010-01-25 00:02:29 ; 15.54 ; 18016 ; 8209 ; 29035 ; 149248 ; 4318 ; 160 ; 82 ; 0.25890 ; 491.615241
>
> As I said, it goes from 0.04 to 15.54s(!) to get a single html file.
> Horrible. After 12:30, everything goes back to normal.
>
> From those variables, I can't seem to find any indication of what can
> be causing this appalling slowdown. The number of squid users doesn't
> go up that much, I just see that the avg time squid reports to
> answering a request goes from 0.20s to 0.25, and the number of http
> requests/sec actually goes down from 566 to 491...which is kind of odd
> to me. And the number of users using squid stays at around 4300.
>
> I talked to Mr. Dave Dykstra, and he thought it could be I/O delay
> issues.
> So I tried:
>
> cache_dir null /tmp
> cache_access_log none
> cache_store_log none
>
> But no luck, at midnight tonight again things went wild:
>
> 2010-01-25 23:57:03 ; 0.04 ; 24112 ; 11330 ; 37240 ; 74456 ; 3516 ; 160 ; 58 ; 0.25890 ; 581.047037
> 2010-01-26 00:02:15 ; 10.82 ; 25638 ; 11695 ; 38537 ; 177198 ; 3533 ; 149 ; 78 ; 0.27332 ; 570.312936
> 2010-01-26 00:07:38 ; 42.64 ; 23818 ; 11563 ; 38097 ; 88902 ; 3556 ; 171 ; 70 ; 0.30459 ; 585.880418
>
> From 0.04 to 42 seconds to load the main html page of amazon.com. (!)
>
> Do you have any idea or any other data I can collect to try and
> track down this?
>
> I'm using squid-2.7.stable7, but I'm willing to try squid-3.0 or
> squid-3.1 if you guys think it could help.
>
> I'm using 2 gigabit Marvell Ethernet boards with sky2 driver. Don't
> know if it's relevant, though.
>
> If you guys need any more info to try and help me figure this out,
> please ask.
>
> I'm willing to test, code or do pretty much anything to make squid
> perform better on my environment. Please let me know how can I help you
> help me. :-)
>
> Thanks!
>
> Felipe Damasio
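
To line the slow periods up against vmstat and iostat output, it may help to pull out only the samples where the squidclient fetch time (second field of the semicolon-separated lines quoted above) exceeds some threshold. A minimal sketch, assuming that field layout; the function name slow_samples and the default threshold of 1 second are illustrative choices, not anything from the original script:

```shell
#!/bin/sh
# Hypothetical helper: reads the semicolon-separated samples on stdin and
# prints "timestamp fetch_time" for every sample slower than the threshold.
# Assumption: field 1 is the timestamp, field 2 the fetch time in seconds.
slow_samples() {
    threshold="${1:-1}"   # seconds; defaults to 1 if no argument is given
    awk -F' *; *' -v limit="$threshold" \
        '$2 + 0 > limit + 0 { print $1, $2 }'
}

# Example with a hypothetical log file name:
#   slow_samples 1 < squid-stats.log
```

Fed the three samples from the second night, a threshold of 1 would keep the 00:02:15 and 00:07:38 entries and drop the 23:57:03 one, which gives the exact window in which to look at the swap (si/so) and I/O wait columns.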