On Thu, Jun 10, 2010 at 12:33:18AM +0000, Amos Jeffries wrote: > On Wed, 09 Jun 2010 22:29:50 +0300, Panagiotis Christias > <p.christias@xxxxxxxxxxx> wrote: > > Hello list (sorry for the long message), > > > > we are using eight servers (running FreeBSD 7.3 amd64 with 2G ram each) > > as transparent proxy servers (Squid 2.7.9 along with squidGuard 1.4) for > > > the content filtering service of the Greek School Network. > > > > Our current squid configuration includes: > > http_port 80 transparent > > http_port 8080 > > cache_mem 192 MB > > cache_swap_low 80 > > cache_swap_high 85 > > maximum_object_size 4096 KB > > ipcache_size 94208 > > ipcache_low 90 > > ipcache_high 95 > > cache_store_log none > > cache_dir null /tmp > > redirect_program /usr/local/bin/squidGuard > > redirect_children 48 > > redirector_bypass on > > uri_whitespace encode > > request_header_max_size 25 KB > > > > Although it is not peak season (schools have exams and are about to > > close for summer), we are experiencing high memory usage in some cases > > (see cache01 at http://noc.ntua.gr/~christia/squid-vsize.png). All boxes > > > are identical. The memory usage seems to be related to the type of the > > requests (not the volume, http://noc.ntua.gr/~christia/squid-reqs.png). > > In order to rule out any possibility of server specific problems we > > moved the client requests from one cache box to another and the memory > > allocation problem moved along (at first it was cache04 and now it is > > cache01). > > > > We compared the cache.log files and found no special messages in cache01 > > > or in any other server that showed increased memory usage. After looking > > > into the access.log we tend to believe that the memory leaps occur > > during periods that squid receives too many concurrent requests for > > sites that are responding too slowly or not responding at all > > (firewalled, not sending any packets back). In these cases, concurrent > > requests vary from 40 reqs/sec to 700 reqs/sec, the average service > > varies from 570 msec to 178502 msec and the HTTP return codes are > > usually TCP_MISS/417, TCP_MISS/504 and TCP_MISS/000. > > > > Cache manager reports for cache01 692882 allocated cbdata > > clientHttpRequest objects occupying 742381 KB and 100% of them in use > > (ten times more than the other cache boxes). > > > > Full report is available here: > > http://noc.ntua.gr/~christia/cache01.mem.20100609-21:35.txt > > http://noc.ntua.gr/~christia/cache01.info.20100609-21:35.txt > > > > At the same 'top' in cache01 reports about the squid process: > > 1575 MB virtual size > > 303 MB resident size > > > > Finally, pmap (FreeBSD sysutils/pmap port) report is available here (in > > case it is useful): > > http://noc.ntua.gr/~christia/cache01.pmap.20100609-21:35.txt > > > > Anyway, is this behaviour normal or is it a bug or memory leak? I am > > willing to try any suggestion, provide any other information and help > > debugging. > > > > For the moment, I am manually excluding from the transparent cache > > schema the sites that seem to cause problems. I am also considering > > adding a maxconn acl and lowering some connection timeouts. Waiting for > > 180000 msec for a firewalled connection is *too* much. > > > > Regards, > > Panagiotis > > Looks like broken almost-HTTP/1.1 clients to me. The behaviour when a > client sends "Expect: 100-continue" without properly supporting it > themselves. Squid does not fully support that feature of HTTP/1.1, just > enough to send back the right 417 error and negotiate its non-existence > with the client. > > What I think is happening is that many requests come in using the Expect: > header for HTTP/1.1 (which Squid does not yet support) the clients then > wait for the 100 status code to return. Which will never happen. > Standards-compliant Squid will reply with 417 errors and hope the client > retries without Expect:. > > Check that your cache01 is configured with "ignore_expect_100 off". This > will send back the 417 error immediately for any clients which are actually > working to negotiate a faster response back. Broken clients will die > immediately with the error instead of dying after holding your resources > for that timeout. ignore_expect_100 parameter is not set in any of our cache boxes's squid.conf. I assume that this means that the default value ("off") is used. Panagiotis -- Panagiotis J. Christias Network Management Center P.Christias@xxxxxxxxxxx National Technical Univ. of Athens, GREECE