On Mon, Oct 08, 2018 at 10:05:28AM +0300, Mike Rapoport wrote:
> On Sat, Oct 06, 2018 at 12:42:37AM +0000, Roman Gushchin wrote:
> > Hi Daniel!
> >
> > On Fri, Oct 05, 2018 at 10:16:25AM +0000, Daniel McGinnes wrote:
> > > Hi Roman,
> > >
> > > memory pressure was started after 1 hour (Ran stress --vm 16 --vm-bytes
> > > 1772864000 -t 300 for 5 minutes, then sleep for 5 mins in a continuous
> > > loop).
> > >
> > > Machine has 16 cores & 32 GB RAM.
> > >
> > > I think the issue I still have is that even though the per-cpu is able to
> > > be reused for other per-cpu allocations, my understanding is that it will
> > > not be available for general use by applications - so if percpu memory
> > > usage is growing continuously (which we still see happening pretty slowly
> > > - but over months it would be fairly significant) it means there will be
> > > less memory available for applications to use. Please let me know if I've
> > > misunderstood something here.
> >
> > Well, yeah, not looking good.
> >
> > >
> > > After seeing several stacks in IPv6 in the memory leak output I ran a test
> > > with IPv6 disabled on the host. Interestingly after 24 hours the Percpu
> > > memory reported in meminfo seems to have flattened out, whereas with IPv6
> > > enabled it was still growing. MemAvailable is decreasing so slowly that I
> > > need to leave it longer to draw any conclusions from that.
> >
> > Looks like there is an independent per-cpu memory leak somewhere in the ipv6
> > stack. Not sure, of course, but if the number of dying cgroups is not growing...
>
> There is a leak in the percpu allocator itself, it never frees some of its
> metadata. I've sent the fix yesterday [1], I believe it will be merged in
> 4.19.

Perfect catch!

> Also, there was a recent fix for a leak in ipv6 [2].
>
> I'm now trying to see the dynamics of the percpu allocations, so I've
> added yet another debugfs interface for percpu (below) similar to
> /proc/vmallocinfo. I hope that by the end of the day I'll be able to see
> what is causing the increase in percpu memory.

Really looking forward to Daniel's test results: hopefully the leak will be
gone at this point.

Thanks!
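
For anyone repeating the long-running comparison discussed above (Percpu and MemAvailable in meminfo, with and without IPv6), here is a minimal sketch of a sampler. It is not the script used in this thread. The function names, the 300-second interval and the output format are hypothetical, and it assumes the kernel reports a Percpu line in /proc/meminfo.

```python
#!/usr/bin/env python3
"""Log selected /proc/meminfo fields over time (illustrative sketch only)."""
import time


def parse_meminfo(text):
    """Return {field: number} for lines shaped like 'Name:   123 kB'.

    Most fields are in kB; a few (e.g. HugePages_Total) are plain counts.
    """
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        # Skip anything that is not 'Name: <integer> [unit]'.
        if sep and parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def sample(path="/proc/meminfo", keys=("Percpu", "MemAvailable")):
    """Read the file once and return only the requested fields."""
    with open(path) as f:
        info = parse_meminfo(f.read())
    # A field the running kernel does not report maps to None.
    return {k: info.get(k) for k in keys}


def monitor(interval_s=300, count=None):
    """Print a timestamped sample every interval_s seconds.

    interval_s is an arbitrary choice; count=None means run until interrupted.
    """
    n = 0
    while count is None or n < count:
        print(time.strftime("%Y-%m-%d %H:%M:%S"), sample(), flush=True)
        n += 1
        if count is None or n < count:
            time.sleep(interval_s)
```

Calling monitor() and redirecting the output to a file gives a simple time series to compare between the two configurations.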