Hi Roman,

yeah, from what I could see docker/kube don't support cgroups v2 yet, so it would be great if you could help with a patch for debugging.

I ran stress in parallel with the workload. drop_caches also cleared some memory, but a lot was still left leaked even after that.

My plan is to let it run over the weekend as it is, so I can do a direct comparison with the run without the patches. On Monday I'll drop_caches, then create some more regular ambient memory pressure and see what happens.

thanks,

Dan McGinnes

IBM Cloud - Containers performance

Int Tel: 247359 Ext Tel: 01962 817359
Notes: Daniel McGinnes/UK/IBM
Email: MCGINNES@xxxxxxxxxx

IBM (UK) Ltd, Hursley Park, Winchester, Hampshire, SO21 2JN

From: Roman Gushchin <guro@xxxxxx>
To: Daniel McGinnes <MCGINNES@xxxxxxxxxx>
Cc: "cgroups@xxxxxxxxxxxxxxx" <cgroups@xxxxxxxxxxxxxxx>, Nathaniel Rockwell <nrockwell@xxxxxxxxxx>
Date: 20/09/2018 17:38
Subject: Re: PROBLEM: Memory leaking when running kubernetes cronjobs

On Thu, Sep 20, 2018 at 08:23:06AM +0000, Daniel McGinnes wrote:
> Hi Roman,
>
> unfortunately Kubernetes seems to be using version 1 cgroups, so I can't
> see that stat - I'll investigate if there's a way to get Kube to use V2 so
> we can check this..

Hi Daniel!

Yeah, it might be not so easy, AFAIK.
Alternatively, you can expose this cgroup v2 data in v1 interface using an off-stream patch, just for debugging. Should be pretty straightforward; I can help with it, if necessary.

>
> There wasn't memory pressure, I just run it in a pretty controlled way
> when running the test - so initially it sounds like what I saw was
> expected. I then ran stress --vm 16 --vm-bytes 2147483648 which did create
> some memory pressure and I saw oom killer getting invoked - it seemed
> pretty similar behaviour to before where only a small amount of the "lost"
> memory was reclaimed... Maybe I was being too severe with stress and the
> memory would be reclaimed at a slower rate under more reasonable memory
> pressure?

So, did you run stress --vm after the main workload or in parallel?
Can you please try to create some ambient memory pressure?

Does echo 3 > /proc/sys/vm/drop_caches help to reclaim the memory?

Thanks!

Roman
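
To put a number on whether drop_caches helps, one option is to compare /proc/meminfo before and after the write. The following is only a minimal sketch of that comparison, not something used in this thread: the helper names (parse_meminfo, reclaimed_kb, snapshot) and the choice of fields are assumptions made for illustration.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field_name: integer value}.

    Most fields are reported in kB; a few (e.g. HugePages_*) are plain counts.
    """
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # skip anything that is not "Name: value"
        parts = rest.split()
        if parts and parts[0].isdigit():
            values[name.strip()] = int(parts[0])
    return values


def reclaimed_kb(before, after, fields=("MemFree", "Slab", "SUnreclaim")):
    """Return (after - before) for each chosen field present in both snapshots.

    The default field list is an illustrative assumption, not a recommendation
    from the thread.
    """
    return {f: after[f] - before[f] for f in fields if f in before and f in after}


def snapshot(path="/proc/meminfo"):
    """Read and parse the current meminfo contents (Linux only)."""
    with open(path) as fh:
        return parse_meminfo(fh.read())
```

A possible use: take before = snapshot(), run echo 3 > /proc/sys/vm/drop_caches as root, take after = snapshot(), and look at reclaimed_kb(before, after). A MemFree delta that stays small relative to the "lost" memory would match the observation that a lot remained leaked.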