On Wed, 09 Jun 2010 22:29:50 +0300, Panagiotis Christias
<p.christias@xxxxxxxxxxx> wrote:
> Hello list (sorry for the long message),
>
> we are using eight servers (running FreeBSD 7.3 amd64 with 2 GB of
> RAM each) as transparent proxy servers (Squid 2.7.9 along with
> squidGuard 1.4) for the content filtering service of the Greek School
> Network.
>
> Our current squid configuration includes:
>   http_port 80 transparent
>   http_port 8080
>   cache_mem 192 MB
>   cache_swap_low 80
>   cache_swap_high 85
>   maximum_object_size 4096 KB
>   ipcache_size 94208
>   ipcache_low 90
>   ipcache_high 95
>   cache_store_log none
>   cache_dir null /tmp
>   redirect_program /usr/local/bin/squidGuard
>   redirect_children 48
>   redirector_bypass on
>   uri_whitespace encode
>   request_header_max_size 25 KB
>
> Although it is not peak season (schools have exams and are about to
> close for summer), we are experiencing high memory usage in some
> cases (see cache01 at http://noc.ntua.gr/~christia/squid-vsize.png).
> All boxes are identical. The memory usage seems to be related to the
> type of the requests, not their volume
> (http://noc.ntua.gr/~christia/squid-reqs.png). To rule out any
> server-specific problem we moved the client requests from one cache
> box to another, and the memory allocation problem moved along with
> them (at first it was cache04, now it is cache01).
>
> We compared the cache.log files and found no special messages on
> cache01 or on any other server that showed increased memory usage.
> After looking into the access.log we tend to believe that the memory
> spikes occur during periods when squid receives many concurrent
> requests for sites that respond very slowly or not at all
> (firewalled, not sending any packets back). In these cases the
> request rate varies from 40 to 700 reqs/sec, the average service
> time varies from 570 msec to 178502 msec, and the HTTP return codes
> are usually TCP_MISS/417, TCP_MISS/504 and TCP_MISS/000.
>
> For cache01, the cache manager reports 692882 allocated cbdata
> clientHttpRequest objects occupying 742381 KB, with 100% of them in
> use (ten times more than on the other cache boxes).
>
> The full report is available here:
> http://noc.ntua.gr/~christia/cache01.mem.20100609-21:35.txt
> http://noc.ntua.gr/~christia/cache01.info.20100609-21:35.txt
>
> At the same time, 'top' on cache01 reports for the squid process:
>   1575 MB virtual size
>   303 MB resident size
>
> Finally, a pmap (FreeBSD sysutils/pmap port) report is available
> here, in case it is useful:
> http://noc.ntua.gr/~christia/cache01.pmap.20100609-21:35.txt
>
> Anyway, is this behaviour normal, or is it a bug or memory leak? I
> am willing to try any suggestion, provide any other information and
> help with debugging.
>
> For the moment I am manually excluding from the transparent caching
> scheme the sites that seem to cause problems. I am also considering
> adding a maxconn acl and lowering some connection timeouts. Waiting
> 180000 msec for a firewalled connection is *too* much.
>
> Regards,
> Panagiotis

Looks like broken almost-HTTP/1.1 clients to me. This is the behaviour
you get when a client sends "Expect: 100-continue" without properly
supporting the feature itself. Squid does not fully support that part
of HTTP/1.1, just enough to send back the right 417 error and
negotiate its non-existence with the client. What I think is happening
is that many requests come in with the HTTP/1.1 Expect: header (which
Squid does not yet support); the clients then wait for the 100 status
code to come back.
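To illustrate, roughly (a hand-written sketch of the exchange, not a
capture from these servers; the URL and sizes are invented):

  POST /upload HTTP/1.1
  Host: www.example.com
  Content-Length: 1048576
  Expect: 100-continue

  [client now pauses, waiting for "HTTP/1.1 100 Continue"
   before it transmits the 1 MB body]

  HTTP/1.1 417 Expectation Failed

That last reply is what shows up as the TCP_MISS/417 entries you are
seeing in access.log.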
That 100 will never come. A standards-compliant Squid replies with a
417 error and hopes the client retries without the Expect: header.

Check that your cache01 is configured with "ignore_expect_100 off".
That makes Squid send the 417 error back immediately, so any clients
which actually implement the negotiation get a faster response. Broken
clients will die immediately with the error instead of dying only
after holding your resources for that whole timeout.

Amos
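P.S. A minimal squid.conf sketch of the above, together with the
maxconn acl and shorter timeouts Panagiotis mentioned (the acl name
and all the numbers here are placeholders to tune for your own setup):

  # reject Expect: 100-continue with an immediate 417
  # (off is the default in 2.7; make sure nothing overrides it)
  ignore_expect_100 off

  # cap the number of concurrent connections per client IP
  acl manyconn maxconn 50
  http_access deny manyconn

  # give up on firewalled/black-holed sites well before 3 minutes
  connect_timeout 30 seconds
  read_timeout 2 minutes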