On 8/07/2014 10:20 p.m., Martin Sperl wrote:
> The problem is that it is a "slow" leak - it takes some time (month) to find it...
> Also it only happens on real live traffic with high volume plus high utilization of "Vary:"
> Moving our prod environment to head would be quite a political issue inside our organization.
> Arguing to go to the latest stable version 3.4.6 would be possible, but I doubt it would change a thing
>
> In the meantime we have not restarted the squids yet, so we still got a bit of information available if needed.
> But we cannot keep it up in this state much longer.
>
> I created a core-dump, but analyzing that is hard.
>
> Here the top strings from that 10GB core-file (taken via: strings corefile | sort | uniq -c | sort -rn | head -20).
> This may give you some idea:
> 2071897 =0.7
> 1353960 Keep-Alive
> 1343528 image/gif
> 877129 HTTP/1.1 200 OK
> 855949 GMT
> 852122 Content-Type
> 851706 HTTP/
> 851371 Date
> 850485 Server
> 848027 IEND
> 821956 Content-Length
> 776359 Content-Type: image/gif
> 768935 Cache-Control
> 760741 ETag
> 743341 live
> 720255 Connection
> 677920 Connection: Keep-Alive
> 676108 Last-Modified
> 662765 Expires
> 585139 X-Powered-By: Servlet/2.4 JSP/2.0
>
> Another thing I thought we could do is:
> * restart squids
> * run mgr:mem every day and compare the daily changes for all the values (maybe others?)
>
> Any other ideas how to "find" the issue?
>

Possibly the list of open descriptors from mgr:filedescriptors will show whether there are any hung connections/transactions, or long-polling connections holding onto state.

Do you have the mgr:mem reports from the last few days? I can start analysing them to see if anything else pops out at me.

Amos
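
The daily mgr:mem comparison proposed in the quoted message could be automated roughly as below. This is only a minimal sketch under stated assumptions: each day's report has been saved as text (for example from `squidclient mgr:mem`), and it has tab-separated rows with the pool name in the first column and a numeric value in the column selected by `value_col`. The function names, the column layout, and the sample pool names are hypothetical, so the column index would need adjusting to the real report.

```python
def parse_snapshot(text, value_col=1):
    """Map pool name -> numeric value from tab-separated report text.

    Assumed layout (hypothetical): name in column 0, a number in value_col.
    Rows that do not fit, such as headers, are skipped.
    """
    pools = {}
    for line in text.splitlines():
        fields = [field.strip() for field in line.split("\t")]
        if len(fields) <= value_col or not fields[0]:
            continue
        try:
            pools[fields[0]] = float(fields[value_col])
        except ValueError:
            continue  # header or other non-numeric row
    return pools


def top_growth(old_text, new_text, value_col=1, limit=10):
    """Return the pools that grew most between two snapshots, largest first."""
    old = parse_snapshot(old_text, value_col)
    new = parse_snapshot(new_text, value_col)
    # A pool missing from the older snapshot is treated as starting at zero.
    deltas = [(name, new[name] - old.get(name, 0.0)) for name in new]
    deltas.sort(key=lambda item: item[1], reverse=True)
    return deltas[:limit]


if __name__ == "__main__":
    # Made-up sample data, not real Squid output.
    day1 = "Pool\tKB\npool_a\t100\npool_b\t50\n"
    day2 = "Pool\tKB\npool_a\t400\npool_b\t55\npool_c\t7\n"
    for name, delta in top_growth(day1, day2):
        print(f"{name}\t{delta:+.1f}")
```

Running it on consecutive daily snapshots and watching which pools stay at the top of the list may point at the allocation that keeps growing.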