RE: Squid performance issues

"John Lauro" <john.lauro@xxxxxxxxxxxxxxxx> · Mon, 25 Jan 2010 22:00:25 -0500

What does the following give:
uname -a

While it's being slow, run the following to get some stats:

vmstat 1 11     ;# 11 samples, one per second
iostat -dx 11   ;# Reports every 11 seconds until Ctrl-C; install sysstat if not found

My first guess is memory swapping, but it could also be I/O.  The output
above should help narrow it down.
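
To make the swapping check concrete, here is a minimal sketch that reads
vmstat output and counts the samples with swap-in (si) or swap-out (so)
activity. It assumes the usual procps layout, where the second header line
names the columns (r, b, ..., si, so, ...). The function name check_swap is
made up for illustration.

```shell
# Read `vmstat 1 11` output on stdin and report how many samples show
# swap-in (si) or swap-out (so) activity. Columns are located by header
# name rather than by fixed position.
check_swap() {
    awk '
        # The second header line names the columns; remember where si/so are.
        $1 == "r" && $2 == "b" {
            for (i = 1; i <= NF; i++) {
                if ($i == "si") si = i
                if ($i == "so") so = i
            }
            next
        }
        # Data rows start with a number. The first row is an average since
        # boot, so it is counted but not inspected.
        si && so && $1 ~ /^[0-9]+$/ {
            rows++
            if (rows == 1) next
            if ($si > 0 || $so > 0) hits++
        }
        END {
            if (!si || !so) { print "si/so columns not found"; exit 1 }
            n = (rows > 1) ? rows - 1 : 0
            printf "%d of %d samples show swap activity\n", hits, n
        }
    '
}
```

Usage would be something like `vmstat 1 11 | check_swap` while the slowdown
is happening. Nonzero counts point toward swapping; all zeros make I/O or
something else the more likely suspect.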

> -----Original Message-----
> From: Felipe W Damasio [mailto:felipewd@xxxxxxxxx]
> Sent: Monday, January 25, 2010 9:37 PM
> To: squid-users@xxxxxxxxxxxxxxx
> Subject:  Squid performance issues
> 
>  Hi all,
> 
>  Sorry for the long email.
> 
>  I'm using squid on a 300Mbps ISP with about 10,000 users.
> 
>  I have an 8-core Intel i7 machine with 8GB of RAM and 500
> of HD for the cache (a dedicated SATA HD with xfs), using aufs as
> storeio.
> 
>  I'm caching mostly multimedia files (youtube and such).
> 
>  Squid usually eats around 50-70% of one core.
> 
>  But always around midnight (when a lot of users browse the internet),
> squid becomes very slow: a page that usually takes 0.04s to load
> takes 23 seconds.
> 
>  My best guess is that the volume of traffic is making squid slow.
> 
>  I'm using a 2.6.29.6 vanilla kernel with tproxy enabled for squid.
> And I'm using these /proc configurations:
> 
> echo 0 > /proc/sys/net/ipv4/tcp_ecn
> echo 1 > /proc/sys/net/ipv4/tcp_low_latency
> echo 100000 > /proc/sys/net/core/netdev_max_backlog
> echo 409600  > /proc/sys/net/ipv4/tcp_max_syn_backlog
> echo 7 > /proc/sys/net/ipv4/tcp_fin_timeout
> echo 15 > /proc/sys/net/ipv4/tcp_keepalive_intvl
> echo 3 > /proc/sys/net/ipv4/tcp_keepalive_probes
> echo 65536 > /proc/sys/vm/min_free_kbytes
> echo "262144 1024000 4194304" > /proc/sys/net/ipv4/tcp_rmem
> echo "262144 1024000 4194304" > /proc/sys/net/ipv4/tcp_wmem
> echo "1024000" > /proc/sys/net/core/rmem_max
> echo "1024000" > /proc/sys/net/core/wmem_max
> echo "512000" > /proc/sys/net/core/rmem_default
> echo "512000" > /proc/sys/net/core/wmem_default
> echo "524288" > /proc/sys/net/ipv4/netfilter/ip_conntrack_max
> echo "3" > /proc/sys/net/ipv4/tcp_synack_retries
> 
>  The machine is in bridge-mode.
> 
>  I wrote a little script that prints:
> 
>  - The date;
>  - The "/usr/bin/time squidclient http://www.amazon.com";
>  - The number of ESTABLISHED connections (through netstat -an);
>  - The number of TIME_WAIT connections;
>  - The total number of netstat connections;
>  - The route cache (ip route list cache);
>  - The number of clients currently connected in squid (through
> mgr:info);
>  - The amount of free memory in MB (free -m);
>  - The % used of the squid-running core;
>  - The average time to respond to a request, in seconds (also from
> mgr:info), 5-minute avg;
>  - The average number of http requests / sec (5-minute avg), also from
> mgr:info.
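
For the connection counts in that list, a minimal sketch of the netstat
part could look like the following. It assumes Linux `netstat -an` output,
where the state is the sixth column of each tcp line, and it assumes that
"total" means all tcp lines, which the original message does not spell out.
The function name count_conn_states is hypothetical.

```shell
# Read `netstat -an` output on stdin and print three numbers separated by
# " ; ": ESTABLISHED count, TIME_WAIT count, total tcp lines (assumption).
count_conn_states() {
    awk '
        $1 ~ /^tcp/ {
            total++
            if ($6 == "ESTABLISHED") est++
            else if ($6 == "TIME_WAIT") tw++
        }
        END { printf "%d ; %d ; %d\n", est, tw, total }
    '
}
```

It would be called as `netstat -an | count_conn_states`, and the output
could be appended to the rest of a log record.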
> 
>  On any other hour, I have something like:
> 
> 2010-01-25 18:48:19 ; 0.04 ; 19383 ; 9902 ; 29865 ; 96972 ; 4677 ; 131 ; 59 ; 0.24524 ; 476.871718
> 2010-01-25 18:53:29 ; 0.04 ; 18865 ; 8593 ; 30123 ; 179570 ; 4679 ; 148 ; 62 ; 0.22004 ; 504.424207
> 2010-01-25 18:58:38 ; 0.04 ; 18377 ; 9056 ; 29283 ; 99038 ; 4680 ; 174 ; 61 ; 0.22004 ; 466.659336
> 2010-01-25 19:03:49 ; 0.04 ; 18877 ; 9133 ; 28327 ; 181196 ; 4673 ; 171 ; 57 ; 0.24524 ; 483.558436
> 
>  So, it takes around 0.04s to get http://www.amazon.com.
> 
> 2010-01-24 23:46:50 ; 2.53 ; 22723 ; 9861 ; 35012 ; 64752 ; 4306 ; 166 ; 70 ; 0.22004 ; 566.364274
> 2010-01-24 23:52:04 ; 3.74 ; 21173 ; 10256 ; 33242 ; 167594 ; 4309 ; 169 ; 68 ; 0.20843 ; 537.758601
> 2010-01-24 23:57:20 ; 0.08 ; 18691 ; 9050 ; 29590 ; 65496 ; 4312 ; 138 ; 71 ; 0.20843 ; 525.119006
> 2010-01-25 00:02:29 ; 15.54 ; 18016 ; 8209 ; 29035 ; 149248 ; 4318 ; 160 ; 82 ; 0.25890 ; 491.615241
> 
>  As I said, it goes from 0.04 to 15.54s(!) to get a single html file.
> Horrible. After 12:30, everything goes back to normal.
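
To pick the slow samples out of a longer log, a small filter along these
lines might help. It is only a sketch: it assumes each record sits on a
single line (rejoined if the mail client wrapped it, and without the "> "
quote prefix), with " ; " between fields and the fetch time in the second
field. The function name slow_samples and the default threshold of 1
second are made up for illustration.

```shell
# Read monitoring records on stdin and print the timestamp and fetch time
# of every record whose fetch time (field 2) exceeds the given threshold
# in seconds. The threshold defaults to 1 if no argument is given.
slow_samples() {
    awk -F ' *; *' -v limit="${1:-1}" '
        NF >= 2 && $2 + 0 > limit + 0 { printf "%s  %ss\n", $1, $2 }
    '
}
```

For example, `slow_samples 1 < monitor.log` (monitor.log being a
placeholder file name) would list only the records where the fetch took
longer than one second.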
> 
>  From those variables, I can't seem to find any indication of what can
> be causing this appalling slowdown. The number of squid users doesn't
> go up that much; I just see that the avg time squid reports for
> answering a request goes from 0.20s to 0.25s, and the number of http
> requests/sec actually goes down from 566 to 491, which is kind of odd
> to me. And the number of users using squid stays at around 4300.
> 
>  I talked to Mr. Dave Dykstra, and he thought it could be I/O delay
> issues. So I tried:
> 
> cache_dir null /tmp
> cache_access_log none
> cache_store_log none
> 
>   But no luck: at midnight tonight things went wild again:
> 
> 2010-01-25 23:57:03 ; 0.04 ; 24112 ; 11330 ; 37240 ; 74456 ; 3516 ; 160 ; 58 ; 0.25890 ; 581.047037
> 2010-01-26 00:02:15 ; 10.82 ; 25638 ; 11695 ; 38537 ; 177198 ; 3533 ; 149 ; 78 ; 0.27332 ; 570.312936
> 2010-01-26 00:07:38 ; 42.64 ; 23818 ; 11563 ; 38097 ; 88902 ; 3556 ; 171 ; 70 ; 0.30459 ; 585.880418
> 
>   From 0.04 to 42 seconds to load the main html page of amazon.com. (!)
> 
>   Do you have any idea what is going on, or any other data I can
> collect to try and track this down?
> 
>   I'm using squid-2.7.stable7, but I'm willing to try squid-3.0 or
> squid-3.1 if you guys think it could help.
> 
>   I'm using 2 gigabit Marvell Ethernet boards with sky2 driver. Don't
> know if it's relevant, though.
> 
>   If you guys need any more info to try and help me figure this out,
> please ask.
> 
>   I'm willing to test, code or do pretty much anything to make squid
> perform better in my environment. Please let me know how I can help you
> help me. :-)
> 
>   Thanks!
> 
> Felipe Damasio