Re: High CPU usage and degraded service time after 2 weeks of activity

leongmzlist <leongmzlist@xxxxxxxxx> · Tue, 26 Feb 2008 15:22:28 -0800

What is the swap usage?  I once had the same problem w/ squid 
degrading over time.  I had to reduce the cache_mem from 2GB to 
512MB, and reduce the amount of objects in the cache since the index 
was growing too big.

mike

At 11:35 AM 2/26/2008, Guillaume Smet wrote:
Hi squid-users,

We recently experienced a problem on our new Squid setup (2 Squid
servers configured as reverse proxy - mostly the same configuration as
before except we allocated more memory and disk on the new servers -
the old boxes didn't have this problem). After 2 weeks of very good
performances, both Squid instances have begun to use a lot of CPU
resources (between 75 and 100% of one core instead of between 0 and
10%): performances started to be really bad especially during peak
hours.
Same number of queries/s, same hit ratio but service time was really
degraded. We let both Squid running like that for a week and the
situation didn't improve.

We restarted both Squid servers today and it fixed the problem for
now: service time is back to normal.

We found nothing in the cache.log. We decided to run one of the
servers with full debug for a couple of minutes to see if we could
find useful information. During these two minutes, we have a lot of
clientReadRequest: FD XXX: no data to process ((11) Resource
temporarily unavailable) in the logs (20k during these 2 minutes) but
we don't know if it can be related. Here is a bit of context around a
Resource temporarily unavailable line.
2008/02/26 17:25:55| destroying entry 0x3b0e2500: 'Connection: keep-alive'
2008/02/26 17:25:55| cbdataFree: 0x2aab3c2ce2a8
2008/02/26 17:25:55| cbdataFree: 0x2aab3c2ce2a8 has 1 locks, not freeing
2008/02/26 17:25:55| clientKeepaliveNextRequest: FD 659 reading next req
2008/02/26 17:25:55| commSetTimeout: FD 659 timeout 120
2008/02/26 17:25:55| clientReadRequest: FD 659: reading request...
2008/02/26 17:25:55| clientReadRequest: FD 659: no data to process
((11) Resource temporarily unavailable)
2008/02/26 17:25:55| cbdataLock: 0x2aab0bdf7418
2008/02/26 17:25:55| cbdataValid: 0x2aab0bdf7418
2008/02/26 17:25:55| cbdataUnlock: 0x2aab0bdf7418
2008/02/26 17:25:55| commSetSelect: FD 659 type 1
2008/02/26 17:25:55| commSetEvents(fd=659)
2008/02/26 17:25:55| cbdataUnlock: 0x2aab3c2ce2a8

We also noticed that we have negative numbers in the memory
information of the cachemgr but we don't know if it's relevant:
Memory usage for squid via mallinfo():
        Total space in arena:  -1419876 KB
        Ordinary blocks:       -1420149 KB    579 blks
        Small blocks:               0 KB      0 blks
        Holding blocks:          7564 KB      8 blks
        Free Small blocks:          0 KB
        Free Ordinary blocks:     272 KB
        Total in use:          -1412585 KB 100%
        Total free:               272 KB 0%
        Total size:            -1412312 KB

Background information:
CentOS 5 x86_64
Squid 2.6STABLE18
8GB of memory
one Xeon E5345 @ 2.33GHz per box
~ 15 Mb/s per box during peak hours
~ 200 requests/s

Cache configuration:
cache_mem 2000 MB
cache_dir aufs /data/services/squid/cache 8000 16 256

cache_swap_low 90
cache_swap_high 95
cache_replacement_policy lru
memory_replacement_policy heap GDSF
maximum_object_size_in_memory 150 KB

The setup is a reverse proxy setup with several ACLs, 2 active ports,
2 delay pools, ICP between both servers but nothing really fancy. I
can provide the full squid.conf if needed.

The Squid process was using approximately 3.2 GB of memory on each box.

Does anybody have any idea on how we can fix this problem or how we
can diagnose what happens?

Thanks in advance.

--
Guillaume