Hello list,

I just wanted to post the results with valgrind. Unfortunately the
memcheck tool needs so much CPU that I could not put a high load on the
squid; I reached at most about 5-10 req/s.

# ./squid -v
Squid Cache: Version 3.1.3
configure options: '--prefix=/appl' '--localstate=/var'
'--with-filedescriptors=16384' '--enable-storeio=aufs'
'--enable-auth=ntlm,basic' '--enable-external-acl-helpers=wbinfo_group'
'--enable-icap-client' --enable-ltdl-convenience

I also recompiled and tried with:

# squid -v
Squid Cache: Version 3.1.3
configure options: '--prefix=/appl' '--localstate=/var'
'--with-filedescriptors=16384' '--enable-storeio=aufs'
'--enable-auth=ntlm,basic' '--enable-external-acl-helpers=wbinfo_group'
'--enable-icap-client' '--with-valgrind-debug' 'CFLAGS=-g -O2'
--enable-ltdl-convenience

I ran valgrind repeatedly with:

  valgrind --leak-check=yes -v squid -N &

and found:

==24141== 3,311,957 bytes in 3,784 blocks are definitely lost in loss record 26 of 27
==24141==    at 0x4A05809: malloc (vg_replace_malloc.c:149)
==24141==    by 0x5ABAA7: xmalloc (util.c:508)
==24141==    by 0x5AA35A: rfc1035MessageUnpack (rfc1035.c:433)
==24141==    by 0x4B15A7: idnsGrokReply(char const*, unsigned long) (dns_internal.cc:939)
==24141==    by 0x4B22F0: idnsRead(int, void*) (dns_internal.cc:1178)
==24141==    by 0x4AC154: comm_select (comm_epoll.cc:308)
==24141==    by 0x5455AC: CommSelectEngine::checkEvents(int) (comm.cc:2682)
==24141==    by 0x4B712D: EventLoop::checkEngine(AsyncEngine*, bool) (EventLoop.cc:51)
==24141==    by 0x4B7282: EventLoop::runOnce() (EventLoop.cc:125)
==24141==    by 0x4B7377: EventLoop::run() (EventLoop.cc:95)
==24141==    by 0x4FB36C: SquidMain(int, char**) (main.cc:1379)
==24141==    by 0x4FB975: main (main.cc:1141)

I looked a bit in the source code but did not really find what could
cause this.
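In case it is useful for the discussion, here is a minimal stand-alone
sketch of the kind of pattern such a trace would fit: an unpacked DNS
reply that is freed on the normal path but not on some early-return
path. The types and function names below are my own simplified
stand-ins, not the real lib/rfc1035.c or dns_internal.cc code:

  // Hypothetical stand-ins only, NOT squid source code.
  #include <cstdlib>
  #include <cstring>

  struct DnsMessage {              // stands in for rfc1035_message
      char *rr;                    // parsed resource records (simplified)
  };

  // models rfc1035MessageUnpack(): heap-allocates the parsed reply
  static DnsMessage *unpackReply(const char *buf, size_t sz) {
      DnsMessage *msg = static_cast<DnsMessage *>(malloc(sizeof(DnsMessage)));
      msg->rr = static_cast<char *>(malloc(sz));
      memcpy(msg->rr, buf, sz);
      return msg;
  }

  // models rfc1035MessageDestroy(): must run on *every* exit path
  // of the caller, not just the happy one
  static void destroyReply(DnsMessage *msg) {
      free(msg->rr);
      free(msg);
  }

  // models the shape of idnsGrokReply(): an early return that skips
  // the destroy call leaks one unpacked message per affected reply
  static void grokReply(const char *buf, size_t sz, bool oddReply) {
      DnsMessage *msg = unpackReply(buf, sz);
      if (oddReply)
          return;                  // LEAK: msg never destroyed here
      // ... match reply to query, cache the answers ...
      destroyReply(msg);           // only the normal path frees it
  }

  int main() {
      const char raw[] = "fake DNS reply bytes";
      grokReply(raw, sizeof(raw), true);   // valgrind: definitely lost
      grokReply(raw, sizeof(raw), false);  // clean
  }

Compiled with g++ and run under valgrind, this gives a "definitely
lost" record of the same shape as the one above, so maybe one of the
error or mismatch paths in idnsGrokReply() skips the destroy.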
Sometimes DNS did not seem to lose memory, but then I found this
instead:

==29780== 987,870 (987,046 direct, 824 indirect) bytes in 1,321 blocks are definitely lost in loss record 27 of 28
==29780==    at 0x4A05809: malloc (vg_replace_malloc.c:149)
==29780==    by 0x5ABAA7: xmalloc (util.c:508)
==29780==    by 0x5ABBAB: xstrdup (util.c:756)
==29780==    by 0x4B3E15: errorTryLoadText(char const*, char const*, bool) (errorpage.cc:313)
==29780==    by 0x4B494F: ErrorState::BuildContent() (errorpage.cc:1007)
==29780==    by 0x4B551D: ErrorState::BuildHttpReply() (errorpage.cc:881)
==29780==    by 0x4B58E5: errorAppendEntry (errorpage.cc:432)
==29780==    by 0x51D656: store_client::callback(long, bool) (store_client.cc:164)
==29780==    by 0x51DA2F: store_client::scheduleMemRead() (store_client.cc:448)
==29780==    by 0x51E567: storeClientCopy2(StoreEntry*, store_client*) (store_client.cc:331)
==29780==    by 0x51E8D3: store_client::copy(StoreEntry*, StoreIOBuffer, void (*)(void*, StoreIOBuffer), void*) (store_client.cc:264)
==29780==    by 0x4A0D0E: clientReplyContext::doGetMoreData() (client_side_reply.cc:1675)
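The xstrdup() at the top of that trace suggests a similar ownership
problem: the error page template text is duplicated on the heap and the
caller has to free it on every path. Again a simplified stand-alone
sketch with my own stand-in names, not squid's actual errorpage.cc
code:

  // Hypothetical stand-ins only, NOT squid source code.
  #include <cstdlib>
  #include <cstring>
  #include <memory>

  // models errorTryLoadText(): returns a heap copy (like xstrdup)
  // that the caller now owns
  static char *tryLoadText(const char *page) {
      char *copy = static_cast<char *>(malloc(strlen(page) + 1));
      strcpy(copy, page);
      return copy;
  }

  // a caller shaped like the leak: the copy is used but never freed
  static void buildContentLeaky(const char *page) {
      char *text = tryLoadText(page);
      // ... expand %-codes into the reply body ...
      (void)text;                  // LEAK: missing free(text)
  }

  // the same caller with ownership made explicit: free() runs on
  // every exit path when the unique_ptr goes out of scope
  static void buildContentFixed(const char *page) {
      std::unique_ptr<char, void (*)(void *)> text(tryLoadText(page),
                                                   std::free);
      // ... expand %-codes into the reply body ...
  }

  int main() {
      buildContentLeaky("ERR_ACCESS_DENIED template");  // flagged
      buildContentFixed("ERR_ACCESS_DENIED template");  // clean
  }

If something like the leaky variant happens in BuildContent(), every
generated error page would leak one copy of the template text, which
would fit the roughly 750 bytes per block in the record above
(987,046 / 1,321 blocks).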
When running valgrind with 3.0.STABLE23 I did not find similar lost
blocks, only a few KB lost at initialization, but 3.1 also loses a few
KB at that point.

I monitored a squid 3.0.STABLE25 and a squid 3.1.3/3.1.4 over a longer
period and found that both need more memory over time, but 3.0
eventually stops growing. 3.1 continues to grow until the CPU usage
rises to nearly 100%; then the memory consumption seems to stop.

Does anyone have an idea where the problem could be?

Martin


Marcus Kool <marcus.kool@xxxxxxxxxxxxxxx> wrote on 17.06.2010 16:15:09:

> Martin,
>
> Valgrind is a memory leak detection tool.
> You need some developer skills to run it.
>
> If you have a test environment with low load you may want
> to give it a try:
> - download the squid sources
> - run configure with CFLAGS="-g -O2"
> - run squid with valgrind
> - wait
> - kill squid with a TERM signal and look at the valgrind log file
>
> Valgrind uses a lot of memory for its own administration and has
> a lot of CPU overhead, so reduce cache_mem to a small value like 32MB.
>
> Most likely you will see many memory leaks, because
> Squid does not free everything when it exits. This is normal.
> You need to look at the repeated memory leaks (the leaks that
> occur often) and file a bug report.
>
> Please do not post the whole valgrind output to this list.
>
> Marcus
>
>
> Martin.Pichlmaier@xxxxxxxxxxxxxxxxxxxxxxxxxxx wrote:
> > Hello,
> >
> > I just wanted to report back the last tests:
> >
> > After the memory cache is filled to 100%, the squid (3.1.4 or 3.1.3)
> > still needs more memory over time when under load, about 1-2 GB a day.
> > memory_pools off did not change anything; the process size still rises.
> > The high CPU usage seems to start when the process grows past a
> > certain size, but I am not sure about that.
> > Example memory consumption of squid-3.1.4:
> > from 8.0 GB (4pm) to 8.4 GB (7pm) to 8.5 GB (4am next day) to 9.4 GB
> > (2pm).
> > At night there is low load on the squid, maybe 20-50 req/s.
> > 3.1.3 behaves the same, so it does not seem to be related to the
> > "ctx enter/exit level" topic discussed in the last mails.
> >
> > I am now reverting the proxies back to 3.0.STABLE25 but will keep one
> > proxy on 3.1.4 for testing.
> > Probably something in my setup causes squid to consume too much memory.
> >
> > Amos, do you have another idea what might cause this and where to
> > look, for example at which debug level? I can do some tests, and I
> > have the possibility to slowly put this proxy under load and take it
> > out of the production environment afterwards...
> >
> > Regards,
> > Martin
> >
> >
> >
> > Amos Jeffries <squid3@xxxxxxxxxxxxx> wrote on 15.06.2010 13:31:40:
> >
> > <snip>
> >>> I am now checking with mem pools off on one of the proxies and will
> >>> report later whether it changes anything.
> > <snip>
> >>> 2010/06/15 11:37:06| ctx: enter level 2059: '<lost>'
> >>> 2010/06/15 11:37:06| ctx: enter level 2060: '<lost>'
> >>> 2010/06/15 11:37:06| WARNING: suspicious CR characters in HTTP header {X:}
> >>> 2010/06/15 11:37:06| ctx: exit level 2060
> >>> 2010/06/15 11:37:06| ctx: enter level 2060: '<lost>'
> >>> 2010/06/15 11:37:06| WARNING: suspicious CR characters in HTTP header {X:}
> >>> 2010/06/15 11:37:06| ctx: exit level 2060
> >>> 2010/06/15 11:37:06| ctx: enter level 2060: '<lost>'
> >>> 2010/06/15 11:37:06| WARNING: suspicious CR characters in HTTP header {X:}
> >>> 2010/06/15 11:37:06| ctx: exit level 2060
> >>> 2010/06/15 11:37:06| ctx: enter level 2060: '<lost>'
> >>> 2010/06/15 11:37:06| WARNING: suspicious CR characters in HTTP header {X:}
> >>> 2010/06/15 11:37:06| ctx: exit level 2060
> >>> 2010/06/15 11:37:06| ctx: enter level 2060: '<lost>'
> >>> 2010/06/15 11:37:06| WARNING: suspicious CR characters in HTTP header {X:}
> >>> 2010/06/15 11:40:56| ctx: exit level 2060
> >>> 2010/06/15 11:40:56| ctx: enter level 2060: '<lost>'
> >>> 2010/06/15 11:40:56| ctx: enter level 2061: '<lost>'
> >>
> >> Ouch. We've been wondering about these ctx loops. It is not something
> >> to be terribly worried about, but it can cause some "weird stuff"
> >> (yes, that is probably the best explanation).
> >>
> >> Thanks to your reminder, I've just had another look and found one
> >> more in header processing. Hopefully that was it.
> >>
> >> Amos
> >> --
> >> Please be using
> >>   Current Stable Squid 2.7.STABLE9 or 3.1.4