Re: CGroup unused allocated slab objects will not get released

Hi Saeed!

On Wed, Sep 18, 2019 at 11:48:19PM +0000, Saeed Karimabadi (skarimab) wrote:
> Hi Roman,
> 
> Thanks for your prompt reply and also sharing your patch. 
> I did build kernel 5.3.0 with your patch and I can confirm your patch fixes the problem I was describing. 
> I used Qemu for this test and the script ran 1000 tasks concurrently in 100 different cgroups.
> I'm wondering if your code has gone through any long-term regression testing?

Thank you for testing it!
We've tested it on several Facebook production workloads, and it did well:
there were significant memory savings and no noticeable CPU regression in
any of the tested environments.
If you have any tests you can run, I'd appreciate it if you shared the results.
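
For anyone collecting comparable before/after numbers, here is a minimal
Python sketch that extracts the slab lines of /proc/meminfo and one cache's
counters from /proc/slabinfo, then computes the difference. The function
names are hypothetical, and it assumes the slabinfo column layout quoted
further down in this thread.

```python
"""Summarise slab usage from /proc/meminfo and /proc/slabinfo text."""

# Slab-related keys reported in /proc/meminfo (values are in kB).
MEMINFO_KEYS = ("Slab", "SReclaimable", "SUnreclaim")


def parse_meminfo(text):
    """Return {key: kB} for the slab-related lines of /proc/meminfo."""
    result = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if sep and name.strip() in MEMINFO_KEYS:
            result[name.strip()] = int(rest.split()[0])
    return result


def parse_slabinfo(text, cache="task_struct"):
    """Return the counters of one cache from /proc/slabinfo text, or None."""
    for line in text.splitlines():
        fields = line.split()
        if fields and fields[0] == cache:
            return {
                "active_objs": int(fields[1]),
                "num_objs": int(fields[2]),
                "objsize": int(fields[3]),
                "objperslab": int(fields[4]),
                "pagesperslab": int(fields[5]),
            }
    return None


def delta(before, after):
    """Per-key difference (after - before) for keys present in both."""
    return {k: after[k] - before[k] for k in before if k in after}
```

A typical use would be to read both files before and after the stress
script and pass the parsed dictionaries to delta(). Note that /proc/slabinfo
is usually readable only by root.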

> Do you see any possible simple patch that can fix this excessive memory usage in older kernel code like 4.x versions?

This patchset is definitely too heavy to backport to 4.x. As a workaround
you can disable the kernel memory accounting using a boot option, if it's
acceptable.
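
The option is not named above. Assuming it is `cgroup.memory=nokmem` (listed
in the kernel's kernel-parameters.txt as disabling kernel memory accounting),
a minimal Python sketch for checking whether a system was booted with it
could look like this. The function name is hypothetical.

```python
def kmem_accounting_disabled(cmdline):
    """Return True if the kernel command line contains cgroup.memory=nokmem.

    Assumption: the workaround meant in this thread is the
    cgroup.memory=nokmem boot option. The option takes a comma-separated
    list, so "cgroup.memory=nosocket,nokmem" also counts.
    """
    for arg in cmdline.split():
        key, sep, value = arg.partition("=")
        if key == "cgroup.memory" and "nokmem" in value.split(","):
            return True
    return False
```

On a running system the input would be the contents of /proc/cmdline, for
example kmem_accounting_disabled(open("/proc/cmdline").read()).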

Thanks!

> 
> Here is more detailed information about the test results:
> 
> ******************************************************************************
> Your proposed patch, backported to kernel 5.3.0:
>   https://github.com/rgushchin/linux/tree/new_slab.rfc.v5.3
> ------------- Before Running the script  -------------
> Slab:            42756 kB
> SReclaimable:    25408 kB
> SUnreclaim:      17348 kB
> # name            <active_objs> <num_objs> <objsize> <objperslab> <pagesperslab> : tunables <limit> <batchcount> <sharedfactor> : slabdata <active_slabs> <num_slabs> <sharedavail>
> task_struct          102    200   3200   10    8 : tunables    0    0    0 : slabdata     20     20      0
> ------------- After running the script -------------
> Slab:            43736 kB
> SReclaimable:    25484 kB
> SUnreclaim:      18252 kB
> task_struct          149    220   3200   10    8 : tunables    0    0    0 : slabdata     22     22      0
> 
> ******************************************************************************
> Vanilla kernel 5.3.0:
> ------------- Before Running the script  -------------
> Slab:            34704 kB
> SReclaimable:    19956 kB
> SUnreclaim:      14748 kB
> # name            <active_objs> <num_objs> <objsize> <objperslab> <pagesperslab> : tunables <limit> <batchcount> <sharedfactor> : slabdata <active_slabs> <num_slabs> <sharedavail>
> task_struct           99    130   3200   10    8 : tunables    0    0    0 : slabdata     13     13      0
> ------------- After running the script -------------
> Slab:            59388 kB
> SReclaimable:    23580 kB
> SUnreclaim:      35808 kB
> task_struct         1174   1230   3200   10    8 : tunables    0    0    0 : slabdata    123    123      0
> 
> Regards,
> Saeed
> 
> 
> -----Original Message-----
> From: Roman Gushchin <guro@xxxxxx> 
> Sent: Wednesday, September 18, 2019 3:23 PM
> To: Saeed Karimabadi (skarimab) <skarimab@xxxxxxxxx>
> Cc: Christoph Lameter <cl@xxxxxxxxx>; Pekka Enberg <penberg@xxxxxxxxxx>; David Rientjes <rientjes@xxxxxxxxxx>; Joonsoo Kim <iamjoonsoo.kim@xxxxxxx>; Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>; linux-mm@xxxxxxxxx; Tejun Heo <tj@xxxxxxxxxx>; Li Zefan <lizefan@xxxxxxxxxx>; Johannes Weiner <hannes@xxxxxxxxxxx>; cgroups@xxxxxxxxxxxxxxx; Michal Hocko <mhocko@xxxxxxxxxx>; Vladimir Davydov <vdavydov.dev@xxxxxxxxx>; xe-linux-external(mailer list) <xe-linux-external@xxxxxxxxx>
> Subject: Re: CGroup unused allocated slab objects will not get released
> 
> On Wed, Sep 18, 2019 at 08:31:18PM +0000, Saeed Karimabadi (skarimab) wrote:
> > Hi Kernel Maintainers,
> > 
> > We are chasing an issue where the slab allocator does not release task_struct slab objects allocated by cgroups,
> > and we are wondering whether this is a known issue or expected behavior.
> > If we stress test the system and spawn multiple tasks in different cgroups, the number of active allocated
> > task_struct objects increases, but the kernel never releases that memory later on, even after the system
> > goes idle with a lower number of running processes.
> 
> Hi Saeed!
> 
> I've recently proposed a new slab memory cgroup controller, which aims to solve
> the problem you're describing: https://lwn.net/Articles/798605/ . It also generally
> reduces the amount of memory used by slabs.
> 
> I've been told that not all e-mails in the patchset reached lkml,
> so, please, find the original patchset here:
>   https://github.com/rgushchin/linux/tree/new_slab.rfc
> and its backport to the 5.3 release here:
>   https://github.com/rgushchin/linux/tree/new_slab.rfc.v5.3
> 
> If you can try it on your setup, I'd appreciate it a lot, and it also can
> help with merging it upstream soon.
> 
> Thank you!
> 
> Roman



