Re: Increased memory usage (memory leak?)

Panagiotis Christias <p.christias@xxxxxxxxxxx> · Thu, 10 Jun 2010 13:52:11 +0300

On Thu, Jun 10, 2010 at 12:33:18AM +0000, Amos Jeffries wrote:
> On Wed, 09 Jun 2010 22:29:50 +0300, Panagiotis Christias
> <p.christias@xxxxxxxxxxx> wrote:
> > Hello list (sorry for the long message),
> > 
> > we are using eight servers (running FreeBSD 7.3 amd64 with 2G ram each) 
> > as transparent proxy servers (Squid 2.7.9 along with squidGuard 1.4) for
> 
> > the content filtering service of the Greek School Network.
> > 
> > Our current squid configuration includes:
> >    http_port           80 transparent
> >    http_port           8080
> >    cache_mem           192 MB
> >    cache_swap_low      80
> >    cache_swap_high     85
> >    maximum_object_size 4096 KB
> >    ipcache_size        94208
> >    ipcache_low         90
> >    ipcache_high        95
> >    cache_store_log     none
> >    cache_dir           null /tmp
> >    redirect_program    /usr/local/bin/squidGuard
> >    redirect_children   48
> >    redirector_bypass   on
> >    uri_whitespace      encode
> >    request_header_max_size 25 KB
> > 
> > Although it is not peak season (schools have exams and are about to 
> > close for summer), we are experiencing high memory usage in some cases 
> > (see cache01 at http://noc.ntua.gr/~christia/squid-vsize.png). All boxes
> 
> > are identical. The memory usage seems to be related to the type of the 
> > requests (not the volume, http://noc.ntua.gr/~christia/squid-reqs.png). 
> > In order to rule out any possibility of server specific problems we 
> > moved the client requests from one cache box to another and the memory 
> > allocation problem moved along (at first it was cache04 and now it is 
> > cache01).
> > 
> > We compared the cache.log files and found no special messages in cache01
> 
> > or in any other server that showed increased memory usage. After looking
> 
> > into the access.log we tend to believe that the memory leaps occur 
> > during periods that squid receives too many concurrent requests for 
> > sites that are responding too slowly or not responding at all 
> > (firewalled, not sending any packets back). In these cases, concurrent 
> > requests vary from 40 reqs/sec to 700 reqs/sec, the average service 
> > varies from 570 msec to 178502 msec and the HTTP return codes are 
> > usually TCP_MISS/417, TCP_MISS/504 and TCP_MISS/000.
> > 
> > Cache manager reports for cache01 692882 allocated cbdata 
> > clientHttpRequest objects occupying 742381 KB and 100% of them in use 
> > (ten times more than the other cache boxes).
> > 
> > Full report is available here:
> >    http://noc.ntua.gr/~christia/cache01.mem.20100609-21:35.txt
> >    http://noc.ntua.gr/~christia/cache01.info.20100609-21:35.txt
> > 
> > At the same 'top' in cache01 reports about the squid process:
> >    1575 MB virtual size
> >     303 MB resident size
> > 
> > Finally, pmap (FreeBSD sysutils/pmap port) report is available here (in 
> > case it is useful):
> >    http://noc.ntua.gr/~christia/cache01.pmap.20100609-21:35.txt
> > 
> > Anyway, is this behaviour normal or is it a bug or memory leak? I am 
> > willing to try any suggestion, provide any other information and help 
> > debugging.
> > 
> > For the moment, I am manually excluding from the transparent cache 
> > schema the sites that seem to cause problems. I am also considering 
> > adding a maxconn acl and lowering some connection timeouts. Waiting for 
> > 180000 msec for a firewalled connection is *too* much.
> > 
> > Regards,
> > Panagiotis
> 
> Looks like broken almost-HTTP/1.1 clients to me. The behaviour when a
> client sends "Expect: 100-continue" without properly supporting it
> themselves. Squid does not fully support that feature of HTTP/1.1, just
> enough to send back the right 417 error and negotiate its non-existence
> with the client.
> 
> What I think is happening is that many requests come in using the Expect:
> header for HTTP/1.1 (which Squid does not yet support) the clients then
> wait for the 100 status code to return. Which will never happen.
> Standards-compliant Squid will reply with 417 errors and hope the client
> retries without Expect:.
> 
> Check that your cache01 is configured with "ignore_expect_100 off". This
> will send back the 417 error immediately for any clients which are actually
> working to negotiate a faster response back. Broken clients will die
> immediately with the error instead of dying after holding your resources
> for that timeout.

ignore_expect_100 parameter is not set in any of our cache boxes's
squid.conf. I assume that this means that the default value ("off")
is used.

Panagiotis

-- 
Panagiotis J. Christias    Network Management Center
P.Christias@xxxxxxxxxxx    National Technical Univ. of Athens, GREECE