We recently upgraded our extremely old Squid servers (P3 1GHz, 512MB RAM) to modern hardware (3GHz P4, 2GB RAM) running Red Hat
Linux. The problem I am experiencing is that CPU load jumps to 100% and stays there, with about 50% of the CPU time showing as
iowait. We have about 1500 employees using two such machines. Squid also dumps core and restarts several times while this is
going on: my coredump_dir has many core files in it, about 10 or 11 for each day on which the load spike happens. Some days,
things are just fine.
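For anyone who wants to check the iowait figure the same way: it can be derived from two samples of the aggregate "cpu" line in /proc/stat. Below is a minimal Python sketch, assuming the 2.6-era field order (user, nice, system, idle, iowait, irq, softirq). The function name and the sample counter values are made up for illustration and are not measurements from these servers.

```python
def iowait_percent(before: str, after: str) -> float:
    """Percentage of CPU time spent in iowait between two 'cpu' lines of /proc/stat."""
    def counters(line: str) -> list:
        parts = line.split()
        if not parts or parts[0] != "cpu":
            raise ValueError("expected the aggregate 'cpu' line from /proc/stat")
        # Assumption: only the first seven counters are used
        # (user nice system idle iowait irq softirq), as on 2.6 kernels.
        return [int(value) for value in parts[1:8]]

    deltas = [b - a for a, b in zip(counters(before), counters(after))]
    total = sum(deltas)
    if total == 0:
        return 0.0
    return 100.0 * deltas[4] / total  # index 4 is the iowait counter


# Hypothetical samples taken a few seconds apart (illustrative values only).
sample_before = "cpu  1000 0 500 3000 2000 0 0"
sample_after = "cpu  1100 0 600 3300 2500 0 0"
print(f"{iowait_percent(sample_before, sample_after):.1f}%")  # prints 50.0%
```

On a live system the two strings would be the first line of /proc/stat, read a few seconds apart.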
I haven't had much time to validate all of this, as I am just back from my holiday and this is what I see on the systems for now.
At the moment, the machines are playing nice, likely because the bulk of the user load has been switched back to the old proxies
for the time being.
Disk-wise, the servers each have two 146GB SCSI disks in a hardware mirror, with all space (except /boot and swap) allocated to "/".
Not ideal, but I am not the one responsible for the hardware config. The filesystem is ext3.
Here is the output of df -k:
Filesystem           1K-blocks      Used Available Use% Mounted on
/dev/sda3            139707024  16431300 116179012  13% /
/dev/sda1               256666     16053    227361   7% /boot
none                   1037388         0   1037388   0% /dev/shm
Here is the output of uname -a:
Linux spco1pxya 2.6.9-34.ELsmp #1 SMP Fri Feb 24 16:54:53 EST 2006 i686 i686 i386 GNU/Linux
Here is how squid-2.6-STABLE6 was built:
./configure \
    --enable-delay-pools \
    --enable-useragent-log \
    --enable-referer-log \
    --disable-wccp \
    --enable-err-languages="English French" \
    --enable-linux-netfilter \
    --disable-ident-lookups \
    --enable-auth="basic digest ntlm" \
    --enable-basic-auth-helpers="getpwnam LDAP MSNT multi-domain-NTLM NCSA SASL SMB" \
    --enable-ntlm-auth-helpers="fakeauth no_check SMB" \
    --enable-digest-auth-helpers="ldap password" \
    --enable-external-acl-helpers="ip_user ldap_group session unix_group wbinfo_group" \
    --with-large-files
Here is the excerpt from squid.conf pertaining to cache size:
cache_dir ufs /var/squid/cache 2048 16 256
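To spell out what that line asks for: the ufs form of cache_dir takes the store type, the directory, the size in megabytes, and the number of first-level and second-level subdirectories. Here is a small Python sketch that decodes the line and compares the cache size with the space available on "/" in the df -k output above. The CacheDir type and parse_cache_dir helper are hypothetical names for illustration, not part of Squid.

```python
from typing import NamedTuple


class CacheDir(NamedTuple):
    store_type: str
    path: str
    size_mb: int
    l1_dirs: int
    l2_dirs: int


def parse_cache_dir(line: str) -> CacheDir:
    """Parse a 'cache_dir <type> <path> <Mbytes> <L1> <L2>' line (ufs-style)."""
    parts = line.split()
    if len(parts) < 6 or parts[0] != "cache_dir":
        raise ValueError("not a ufs-style cache_dir line")
    return CacheDir(parts[1], parts[2], int(parts[3]), int(parts[4]), int(parts[5]))


cfg = parse_cache_dir("cache_dir ufs /var/squid/cache 2048 16 256")
total_dirs = cfg.l1_dirs * cfg.l2_dirs   # 16 first-level x 256 second-level
cache_kb = cfg.size_mb * 1024            # cache size in 1K blocks
avail_kb = 116179012                     # "Available" for / from df -k above
share = 100.0 * cache_kb / avail_kb

print(f"{cfg.size_mb} MB cache in {total_dirs} subdirectories, "
      f"{share:.1f}% of the space available on /")
```

So the cache is 2GB spread over 4096 subdirectories, which is a small fraction of the disk.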
What configuration changes can I make to my OS and Squid to get rid of this bottleneck?
Cheers,
/Jason