On Wed, Sep 26, 2018 at 09:00:20AM +0000, Daniel McGinnes wrote:
> Hi Roman,
>
> I have attached a chart from my latest run, which collected
> nr_dying_descendants stats and had memory pressure being created along
> with the test.
>
> I let the test run for 1 hour and then started memory pressure as before
> (stress --vm 16 --vm-bytes 1772864000 -t 300 for 5 minutes, then sleep
> for 5 minutes, in a continuous loop).
>
> Even with the memory pressure, MemAvailable continued to drop, and
> overall nr_dying_descendants continued to climb (although I believe some
> were being reclaimed, as the number did not climb linearly, and from
> looking at the stat manually I did see the number drop sometimes). 6
> hours in, there was a big drop in nr_dying_descendants (4000 -> 400),
> and it seems to have stabilised around that value since then. Even after
> that, MemAvailable continued to decrease at a similar rate to the run
> where I didn't apply any memory pressure, although for the last 2 hours
> or so it does seem to have stabilised.
>
> Towards the end of the data in the chart I did
> echo 3 > /proc/sys/vm/drop_caches and saw nr_dying_descendants drop to
> 40, but saw no increase in MemAvailable, so this seems to confirm your
> theory about fragmentation in the per-cpu memory allocator.
>
> I'm a bit surprised that after we see nr_dying_descendants stabilise we
> still see MemAvailable decreasing. Any theories on this?

Hard to say; it might be some sort of unreclaimable slabs or per-cpu
memory as well. Both numbers are reflected in /proc, so it's possible
to figure it out.

Thanks!
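
P.S. A minimal (untested) sketch of one way to watch both numbers over
time alongside nr_dying_descendants; it assumes cgroup2 is mounted at
/sys/fs/cgroup and a kernel recent enough to have the Percpu field in
/proc/meminfo:

    # Sample MemAvailable, unreclaimable slab, per-cpu allocator usage
    # and the number of dying cgroups once a minute.
    while true; do
        date
        grep -E '^(MemAvailable|SUnreclaim|Percpu):' /proc/meminfo
        grep nr_dying_descendants /sys/fs/cgroup/cgroup.stat
        sleep 60
    done

SUnreclaim covers slab memory that can't be reclaimed under pressure and
Percpu covers the per-cpu allocator, so if either keeps growing while
nr_dying_descendants stays flat, that should point at the culprit.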