On 10/14/24 11:20, Muchun Song wrote:
>
>
>> On Oct 14, 2024, at 17:04, chenridong <chenridong@xxxxxxxxxx> wrote:
>>
>>
>>
>> On 2024/10/14 16:43, Muchun Song wrote:
>>>> On Oct 14, 2024, at 16:13, Anshuman Khandual <anshuman.khandual@xxxxxxx> wrote:
>>>>
>>>>
>>>>
>>>> On 10/14/24 08:53, Chen Ridong wrote:
>>>>> From: Chen Ridong <chenridong@xxxxxxxxxx>
>>>>>
>>>>> A memleak was found as bellow:
>>>>>
>>>>> unreferenced object 0xffff8881010d2a80 (size 32):
>>>>>   comm "mkdir", pid 1559, jiffies 4294932666
>>>>>   hex dump (first 32 bytes):
>>>>>     00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
>>>>>     40 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  @...............
>>>>>   backtrace (crc 2e7ef6fa):
>>>>>     [<ffffffff81372754>] __kmalloc_node_noprof+0x394/0x470
>>>>>     [<ffffffff813024ab>] alloc_shrinker_info+0x7b/0x1a0
>>>>>     [<ffffffff813b526a>] mem_cgroup_css_online+0x11a/0x3b0
>>>>>     [<ffffffff81198dd9>] online_css+0x29/0xa0
>>>>>     [<ffffffff811a243d>] cgroup_apply_control_enable+0x20d/0x360
>>>>>     [<ffffffff811a5728>] cgroup_mkdir+0x168/0x5f0
>>>>>     [<ffffffff8148543e>] kernfs_iop_mkdir+0x5e/0x90
>>>>>     [<ffffffff813dbb24>] vfs_mkdir+0x144/0x220
>>>>>     [<ffffffff813e1c97>] do_mkdirat+0x87/0x130
>>>>>     [<ffffffff813e1de9>] __x64_sys_mkdir+0x49/0x70
>>>>>     [<ffffffff81f8c928>] do_syscall_64+0x68/0x140
>>>>>     [<ffffffff8200012f>] entry_SYSCALL_64_after_hwframe+0x76/0x7e
>>>>>
>>>>> In the alloc_shrinker_info function, when shrinker_unit_alloc return
>>>>> err, the info won't be freed. Just fix it.
>>>>>
>>>>> Fixes: 307bececcd12 ("mm: shrinker: add a secondary array for shrinker_info::{map, nr_deferred}")
>>>>> Signed-off-by: Chen Ridong <chenridong@xxxxxxxxxx>
>>>>> ---
>>>>>  mm/shrinker.c | 1 +
>>>>>  1 file changed, 1 insertion(+)
>>>>>
>>>>> diff --git a/mm/shrinker.c b/mm/shrinker.c
>>>>> index dc5d2a6fcfc4..92270413190d 100644
>>>>> --- a/mm/shrinker.c
>>>>> +++ b/mm/shrinker.c
>>>>> @@ -97,6 +97,7 @@ int alloc_shrinker_info(struct mem_cgroup *memcg)
>>>>>
>>>>>  err:
>>>>>         mutex_unlock(&shrinker_mutex);
>>>>> +       kvfree(info);
>>>>>         free_shrinker_info(memcg);
>>>>>         return -ENOMEM;
>>>>>  }
>>>>
>>>> There are two scenarios when "goto err:" gets called
>>>>
>>>> - When shrinker_info allocations fails, no kvfree() is required
>>>>   - but after this change kvfree() would be called even
>>>>     when the allocation had failed originally, which does
>>>>     not sound right
>>> Yes. In this case, @info is NULL and kvfree could handle NULL.
>>> It seems strange but the final behaviour correct.
>>>>
>>>> - shrinker_unit_alloc() fails, kvfree() is actually required
>>>>
>>>> I guess kvfree() should be called just after shrinker_unit_alloc()
>>>> fails but before calling into "goto err".
>>> We could do it like this, which avoids ambiguity (if someone ignores
>>> that kvfree could handle NULL). Something like:
>>> --- a/mm/shrinker.c
>>> +++ b/mm/shrinker.c
>>> @@ -88,13 +88,14 @@ int alloc_shrinker_info(struct mem_cgroup *memcg)
>>>                         goto err;
>>>                 info->map_nr_max = shrinker_nr_max;
>>>                 if (shrinker_unit_alloc(info, NULL, nid))
>>> -                       goto err;
>>> +                       goto free;
>>>                 rcu_assign_pointer(memcg->nodeinfo[nid]->shrinker_info, info);
>>>         }
>>>         mutex_unlock(&shrinker_mutex);
>>>         return ret;
>>> -
>>> +free:
>>> +       kvfree(info);
>>>  err:
>>>         mutex_unlock(&shrinker_mutex);
>>>         free_shrinker_info(memcg);
>>> Thanks.
>>>>
>>>> But curious, should not both kvzalloc_node()/kvfree() be avoided
>>>> while inside mutex lock to avoid possible lockdep issues ?
>> How about:
>>
>> diff --git a/mm/shrinker.c b/mm/shrinker.c
>> index dc5d2a6fcfc4..7baee7f00497 100644
>> --- a/mm/shrinker.c
>> +++ b/mm/shrinker.c
>> @@ -87,9 +87,9 @@ int alloc_shrinker_info(struct mem_cgroup *memcg)
>>                 if (!info)
>>                         goto err;
>>                 info->map_nr_max = shrinker_nr_max;
>> +               rcu_assign_pointer(memcg->nodeinfo[nid]->shrinker_info, info);
>>                 if (shrinker_unit_alloc(info, NULL, nid))
>>                         goto err;
>> -               rcu_assign_pointer(memcg->nodeinfo[nid]->shrinker_info, info);
>>         }
>>         mutex_unlock(&shrinker_mutex);
>
> No. We should make sure the @info is fully initialized before others
> could see it. That's why rcu_assign_pointer is used here.

If the info is immediately visible, is the failure cleanup
free_shrinker_info() safe? It uses kvfree(info) and not kvfree_rcu(), and
shrinker_unit_free() is also doing kfree().

>>
>> I think this is concise.
>>
>> Best regards,
>> Ridong
>
>
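
For reference, the two-label cleanup discussed above (free the
not-yet-published object on its own label, then fall through to the common
error path) can be sketched in plain userspace C. This is only an
illustration under assumptions: struct info, track_alloc(), track_free(),
alloc_info() and release_info() are hypothetical stand-ins, not the
mm/shrinker.c code, and the sketch does not model RCU publication or
locking at all. It only shows that neither failure path leaks.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical userspace stand-in; not the kernel's shrinker_info. */
struct info {
	void *unit;	/* secondary allocation, like the per-info units */
};

static int live_allocs;	/* number of allocations not yet freed */

/* Allocate zeroed memory, or simulate an allocation failure. */
static void *track_alloc(size_t size, int fail)
{
	void *p = fail ? NULL : calloc(1, size);

	if (p)
		live_allocs++;
	return p;
}

/* Free memory; a NULL pointer is accepted and ignored. */
static void track_free(void *p)
{
	if (!p)
		return;
	live_allocs--;
	free(p);
}

/*
 * fail_info: simulate failure of the first allocation.
 * fail_unit: simulate failure of the secondary allocation.
 * Returns 0 with *out set on success, -1 on failure with nothing leaked.
 */
static int alloc_info(struct info **out, int fail_info, int fail_unit)
{
	struct info *info;

	info = track_alloc(sizeof(*info), fail_info);
	if (!info)
		goto err;		/* nothing to free yet */
	info->unit = track_alloc(16, fail_unit);
	if (!info->unit)
		goto free_info;		/* info exists but was never published */
	*out = info;			/* "publish" only once fully set up */
	return 0;

free_info:
	track_free(info);
err:
	return -1;
}

/* Tear down a successfully allocated object. */
static void release_info(struct info *info)
{
	track_free(info->unit);
	track_free(info);
}
```

With a single label that always calls track_free(info), the first failure
case would pass NULL, which is harmless here just as it is for kvfree();
the separate label only makes the intent explicit.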