Re: Squid 2.6-STABLE6 100% cpu load, lots of iowait and coredumps

Chris Robertson <crobertson@xxxxxxx> · Fri, 12 Jan 2007 14:45:13 -0900

Jason Taylor wrote:
We recently upgraded our extremely old squid servers (P3-1GHz, 512MB 
RAM) to modern hardware (3GHz P4, 2GB RAM) running Red Hat Linux.  The 
problem I am experiencing is that CPU load will jump to 100% and stay 
there.  The system shows about 50% iowait cpu usage.  We have about 
1500 employees using two such machines.  Also, squid seems to dump 
core and restart several times during all this.  My coredump_dir 
directory has many core files in it, about 10 or 11 per day when the 
load-spike happens.  Some days, things are just fine.

I haven't had much time to validate all of this as I am just back from 
my holiday and this is what I see on the systems for now.  At the 
moment, the machines are playing nice, likely because the bulk of 
user-load has been switched back to the old proxies for the time being.

Disk-wise, the servers each have two 146GB SCSI disks in a hardware 
mirror with all space (except /boot and swap) allocated to "/".  Not 
ideal, but I am not the one responsible for the hardware config.  The 
filesystem is ext3.

Here is the output of df -k:
Filesystem           1K-blocks      Used Available Use% Mounted on
/dev/sda3            139707024  16431300 116179012  13% /
/dev/sda1               256666     16053    227361   7% /boot
none                   1037388         0   1037388   0% /dev/shm

Here is the output of uname -a:
Linux spco1pxya 2.6.9-34.ELsmp #1 SMP Fri Feb 24 16:54:53 EST 2006 i686 i686 i386 GNU/Linux

Here is how squid-2.6-STABLE6 was built:
./configure                                     \
  --enable-delay-pools                          \
  --enable-useragent-log                        \
  --enable-referer-log                          \
  --disable-wccp                                \
  --enable-err-languages="English French"       \
  --enable-linux-netfilter                      \
  --disable-ident-lookups                       \
  --enable-auth="basic digest ntlm"             \
  --enable-basic-auth-helpers="getpwnam LDAP MSNT multi-domain-NTLM NCSA SASL SMB" \
  --enable-ntlm-auth-helpers="fakeauth no_check SMB" \
  --enable-digest-auth-helpers="ldap password" \
  --enable-external-acl-helpers="ip_user ldap_group session unix_group wbinfo_group" \
  --with-large-files

Here is the excerpt from squid.conf pertaining to cache size:
cache_dir    ufs /var/squid/cache 2048 16 256

I'd advise using aufs instead of ufs.  It hands disk I/O to helper 
threads, so the main Squid process doesn't block on the disk.  This 
will require a recompile with either --enable-async-io or 
--enable-storeio="ufs,aufs".  Regex ACLs are another great CPU hog.  
Avoid them whenever possible.
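
A rough sketch of that change, assuming the rest of the build stays as 
posted.  Only two of the original configure flags are repeated here to 
keep it short; the cache path and sizes are the ones from the cache_dir 
line quoted above.  As far as I know, ufs and aufs share the same 
on-disk layout, so the existing cache directory should be reusable, 
but treat that as an assumption and test on one box first.

```shell
# Sketch only: add the aufs store module to the existing configure flags.
# The other flags from the original build are omitted here for brevity.
./configure \
  --enable-delay-pools \
  --with-large-files \
  --enable-storeio="ufs,aufs"

# Then, in squid.conf, change only the store type on the cache_dir line:
#   cache_dir    aufs /var/squid/cache 2048 16 256
```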

Even so, Squid shouldn't be crashing.
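
To find out why it's crashing, a backtrace from one of those core files 
would help.  A minimal sketch, assuming gdb is installed; both paths 
below are placeholders, not taken from your setup, so substitute the 
real binary location and a core file from your coredump_dir.

```shell
# Assumed paths: adjust to the actual install prefix and coredump_dir.
SQUID_BIN=/usr/local/squid/sbin/squid     # hypothetical binary location
CORE_FILE=/path/to/coredump_dir/core      # hypothetical core file name

# Open the core file against the binary that produced it,
# then type "bt" at the (gdb) prompt to print the backtrace.
gdb "$SQUID_BIN" "$CORE_FILE"
```

The backtrace is most useful if the binary still has its debug symbols 
(i.e. it was not stripped).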

What configurations can I do to my OS and squid in order to get rid of 
this bottleneck?

Cheers,

/Jason

Chris