Jason Taylor wrote:
We recently upgraded our extremely old Squid servers (P3-1GHz, 512MB
RAM) to modern hardware (3GHz P4, 2GB RAM) running Red Hat Linux. The
problem I am experiencing is that CPU load will jump to 100% and stay
there. The system shows about 50% iowait cpu usage. We have about
1500 employees using two such machines. Also, squid seems to dump
core and restart several times during all this. My coredump_dir
directory has many core files in it, about 10 or 11 per day when the
load-spike happens. Some days, things are just fine.
I haven't had much time to validate all of this as I am just back from
my holiday and this is what I see on the systems for now. At the
moment, the machines are playing nice, likely because the bulk of
user-load has been switched back to the old proxies for the time being.
Disk-wise, the servers each have two 146GB SCSI disks in a hardware
mirror with all space (except /boot and swap) allocated to "/". Not
ideal, but I am not the one responsible for the hardware config. The
filesystem is ext3.
Here is the output of df -k:
Filesystem     1K-blocks      Used  Available  Use%  Mounted on
/dev/sda3      139707024  16431300  116179012   13%  /
/dev/sda1         256666     16053     227361    7%  /boot
none             1037388         0    1037388    0%  /dev/shm
Here is the output of uname -a:
Linux spco1pxya 2.6.9-34.ELsmp #1 SMP Fri Feb 24 16:54:53 EST 2006 i686 i686 i386 GNU/Linux
Here is how squid-2.6-STABLE6 was built:
./configure \
--enable-delay-pools \
--enable-useragent-log \
--enable-referer-log \
--disable-wccp \
--enable-err-languages="English French" \
--enable-linux-netfilter \
--disable-ident-lookups \
--enable-auth="basic digest ntlm" \
--enable-basic-auth-helpers="getpwnam LDAP MSNT multi-domain-NTLM NCSA SASL SMB" \
--enable-ntlm-auth-helpers="fakeauth no_check SMB" \
--enable-digest-auth-helpers="ldap password" \
--enable-external-acl-helpers="ip_user ldap_group session unix_group wbinfo_group" \
--with-large-files
Here is the excerpt from squid.conf pertaining to cache size:
cache_dir ufs /var/squid/cache 2048 16 256
I'd advise using aufs, as it's non-blocking. This will require a
recompile, with either --enable-async-io or
--enable-storeio="ufs,aufs". Another great CPU hog is regex ACLs.
Avoid them whenever possible.
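As a rough sketch of both suggestions (not a tested configuration): once
Squid has been rebuilt with aufs support, the cache_dir line quoted above
would keep the same path and sizes and only change the store type. As far
as I know, ufs and aufs share the same on-disk layout, so the existing
cache should not need to be rebuilt. The ACL name and domain below are
placeholders made up for illustration.

```
# squid.conf fragment; assumes Squid was rebuilt with aufs support
# Same path and sizes as the existing line, only the store type changes
cache_dir aufs /var/squid/cache 2048 16 256

# Hypothetical ACL (name and domain are placeholders). The commented-out
# line is the regex form; the dstdomain form matches on the destination
# domain without running a regex against every request URL.
# acl example_sites url_regex -i example\.com
acl example_sites dstdomain .example.com
```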
Even so, Squid shouldn't be crashing.
What configurations can I do to my OS and squid in order to get rid of
this bottleneck?
Cheers,
/Jason
Chris