RE: squid: Memory utilization higher than expected since moving from 3.3 to 3.4 and Vary: working

The problem is that it is a "slow" leak - it takes some time (months) to find it...
It also only happens on real live traffic with high volume plus heavy use of "Vary:".
Moving our prod environment to HEAD would be quite a political issue inside our organization.
Arguing for a move to the latest stable version 3.4.6 would be possible, but I doubt it would change anything.

In the meantime we have not restarted the squids yet, so we still have a bit of information available if needed.
But we cannot keep them running in this state much longer.

I created a core-dump, but analyzing that is hard.

Here are the top strings from that 10GB core file, taken via: strings corefile | sort | uniq -c | sort -rn | head -20
This may give you some idea:
2071897 =0.7
1353960 Keep-Alive
1343528 image/gif
 877129 HTTP/1.1 200 OK
 855949  GMT
 852122 Content-Type
 851706 HTTP/
 851371 Date
 850485 Server
 848027 IEND
 821956 Content-Length
 776359 Content-Type: image/gif
 768935 Cache-Control
 760741 ETag
 743341 live
 720255 Connection
 677920 Connection: Keep-Alive
 676108 Last-Modified
 662765 Expires
 585139 X-Powered-By: Servlet/2.4 JSP/2.0

Another thing I thought we could do is:
* restart the squids
* run mgr:mem every day and compare the daily changes for all the values (and maybe other reports as well?) - see the sketch below
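Something along these lines could do the daily comparison (just a rough sketch - it assumes each day's mgr:mem report is saved to its own file, that the pool name is the first tab-separated column, and that you pass the index of whichever column holds the value you want to track, e.g. allocated KB; the exact column layout may differ between versions):

#include <cstdlib>
#include <fstream>
#include <iostream>
#include <map>
#include <sstream>
#include <string>
#include <vector>

// Split one mgr:mem report line on tabs.
static std::vector<std::string> fields(const std::string &line)
{
    std::vector<std::string> out;
    std::stringstream ss(line);
    std::string f;
    while (std::getline(ss, f, '\t'))
        out.push_back(f);
    return out;
}

// Load "pool name -> value of column col" from one saved snapshot.
static std::map<std::string, double> load(const char *path, size_t col)
{
    std::map<std::string, double> out;
    std::ifstream in(path);
    std::string line;
    while (std::getline(in, line)) {
        std::vector<std::string> f = fields(line);
        if (f.size() > col)
            out[f[0]] = std::atof(f[col].c_str());
    }
    return out;
}

int main(int argc, char **argv)
{
    if (argc < 4) {
        std::cerr << "usage: memdiff yesterday.tsv today.tsv column\n";
        return 1;
    }
    const size_t col = std::atoi(argv[3]);   // column to track, e.g. allocated KB
    std::map<std::string, double> before = load(argv[1], col);
    std::map<std::string, double> after  = load(argv[2], col);

    // Print the per-pool change between the two snapshots.
    for (std::map<std::string, double>::const_iterator i = after.begin();
         i != after.end(); ++i) {
        const double delta = i->second - before[i->first];
        if (delta != 0)
            std::cout << delta << '\t' << i->first << '\n';
    }
    return 0;
}

Piping the output through "sort -rn" would then show which pools grew the most from one day to the next.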

Any other ideas on how to "find" the issue?

Martin




-----Original Message-----
From: Amos Jeffries [mailto:squid3@xxxxxxxxxxxxx] 
Sent: Tuesday, 08 July 2014 11:40
To: squid-users@xxxxxxxxxxxxxxx
Subject: RE:  squid: Memory utilization higher than expected since moving from 3.3 to 3.4 and Vary: working

On 2014-07-08 19:38, Martin Sperl wrote:
> Well - there are 4 points here:
> a) this is gradual behavior that takes months to accumulate
> b) the extra memory used is much bigger than the cache (approx. 1.5x),
> which looks wrong to me, as the other parameters you have given
> (disk cache, ...) are not
> c) we have not seen this behavior with 3.3; we switched to 3.4 two
> months ago, and it has been showing this increase ever since
> d) if you look at the memory report (mgr:mem) which I have shared, you
> find 900MB for 542k HttpHeaderEntries and 2GB for 791k Short_Strings
> - and that is almost 3 of the 4.2GB in this report alone.
> 
> So I was guessing there was some sort of memory allocation bug -
> maybe something to do with the now-working handling of "Vary"
> headers - that triggers this memory leak.
> To me it looks as if those "Vary" nodes are not cleaned up properly,
> and that is why we are slowly increasing memory use...
> 

Hmm. Yes that does sound like what we call a pseudo-leak (same effects 
as a memory leak, but not an actual leak since the objects are still 
being reference-counted by something). The headers and HttpHeaderEntries 
are stored as collections of String objects. The HttpHeaderEntries are 
ref-counted objects used by the HttpRequest and HttpReply transaction 
objects, which are themselves ref-counted. If there are a lot of those 
being held onto for no reason, this needs to be tracked down.
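As a rough illustration of what a pseudo-leak looks like (this is not Squid's actual code - Squid has its own RefCount machinery, and the HeaderEntry/variants names here are made up - but the effect is the same):

#include <iostream>
#include <map>
#include <memory>
#include <string>

// Stand-in for a ref-counted header entry.
struct HeaderEntry {
    std::string name;
    std::string value;
};

int main()
{
    // A long-lived index, e.g. something keyed by Vary header values.
    std::map<std::string, std::shared_ptr<HeaderEntry> > variants;

    for (int i = 0; i < 100000; ++i) {
        std::shared_ptr<HeaderEntry> e(new HeaderEntry());
        e->name = "Accept-Encoding";
        e->value = "variant-" + std::to_string(i);

        // The transaction that created the entry ends here, but the index
        // keeps a reference, so the ref-count never drops to zero.
        variants[e->value] = e;
    }

    // Not a real leak - every object is still reachable and correctly
    // ref-counted - yet memory keeps growing for as long as entries are
    // added faster than they are ever erased.
    std::cout << variants.size() << " entries still held\n";
    return 0;
}

If something similar is happening with the Vary handling, the pool counters for those object types will keep climbing even while the number of active transactions stays flat.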

Are you in a position to test the 3.HEAD code, to see whether it is 
something we have already fixed there or a new issue to debug from scratch?


> There are specially a few things in the msg:info block that made me 
> suspicious:
> Memory usage for squid via mallinfo():
>         Total space in arena:  -2064964 KB
>         Ordinary blocks:       2047465 KB 161900 blks
>         Small blocks:               0 KB      0 blks
>         Holding blocks:         38432 KB      9 blks
>         Free Small blocks:          0 KB
>         Free Ordinary blocks:   81875 KB
>         Total in use:           81875 KB -4%
>         Total free:             81875 KB -4%
>         Total size:            -2026532 KB
> Memory accounted for:
>         Total accounted:       1269447 KB -63%
>         memPool accounted:     9658055 KB -477%
>         memPool unaccounted:   -11684587 KB  -0%
>         memPoolAlloc calls: 507567155336
>         memPoolFree calls:  508713005568
> Especially those negative numbers and percentages - but maybe that is
> just a reporting bug which, if so, is hindering debugging of the real
> thing.

64-bit mallinfo() bugs produce corrupt output. The only uncorrupted 
values above are the memPool accounted KB size and probably the "calls" 
counts.
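The reason (at least with glibc) is that struct mallinfo stores its byte 
counts in plain int fields, so once the arena grows past 2GB the values 
wrap into negative numbers - which is exactly what the report above shows. 
A minimal check (glibc only):

#include <malloc.h>
#include <iostream>

int main()
{
    // struct mallinfo uses int fields; in a 64-bit process holding more
    // than 2GB these counters overflow and turn negative.
    struct mallinfo mi = mallinfo();
    std::cout << "arena bytes  (may be wrapped): " << mi.arena    << "\n"
              << "in-use bytes (may be wrapped): " << mi.uordblks << "\n"
              << "free bytes   (may be wrapped): " << mi.fordblks << "\n";
    return 0;
}

That is why only the pool-based figures are worth reading once the process 
gets this large.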


> 
> Thanks,
> 	Martin
> 
> P.S.: it would be helpful if the output of mgr:mem were formatted
> in a way that allows it to be easily pasted into Excel or
> similar - no spaces in pool names would be sufficient...
> 

The report is a TSV spreadsheet. If you save it directly to a file and 
then import it into Excel, selecting Tab as the separator instead of the 
default comma (CSV), it loads as a nice spreadsheet.

Amos

