On 6/17/21 8:32 AM, Faiyaz Mohammed wrote: > > > On 6/16/2021 9:47 PM, Vlastimil Babka wrote: >> On 6/16/21 5:50 PM, Faiyaz Mohammed wrote: >>> >>> >>> On 6/16/2021 4:35 PM, Vlastimil Babka wrote: >>>> On 6/15/21 5:58 PM, Qian Cai wrote: >>>>> >>>>> >>>>> On 6/11/2021 3:03 PM, Faiyaz Mohammed wrote: >>>>>> alloc_calls and free_calls implementation in sysfs have two issues, >>>>>> one is PAGE_SIZE limitation of sysfs and other is it does not adhere >>>>>> to "one value per file" rule. >>>>>> >>>>>> To overcome this issues, move the alloc_calls and free_calls >>>>>> implementation to debugfs. >>>>>> >>>>>> Debugfs cache will be created if SLAB_STORE_USER flag is set. >>>>>> >>>>>> Rename the alloc_calls/free_calls to alloc_traces/free_traces, >>>>>> to be inline with what it does. >>>>>> >>>>>> Reviewed-by: Vlastimil Babka <vbabka@xxxxxxx> >>>>>> Reviewed-by: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx> >>>>>> Signed-off-by: Faiyaz Mohammed <faiyazm@xxxxxxxxxxxxxx> >>>>> >>>>> Reverting this commit on today's linux-next fixed all leaks (hundreds) reported by kmemleak like below, >>>>> >>>>> unreferenced object 0xffff00091ae1b540 (size 64): >>>>> comm "lsbug", pid 1607, jiffies 4294958291 (age 1476.340s) >>>>> hex dump (first 32 bytes): >>>>> 02 00 00 00 00 00 00 00 6b 6b 6b 6b 6b 6b 6b 6b ........kkkkkkkk >>>>> 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk >>>>> backtrace: >>>>> [<ffff8000106b06b8>] slab_post_alloc_hook+0xa0/0x418 >>>>> [<ffff8000106b5c7c>] kmem_cache_alloc_trace+0x1e4/0x378 >>>>> [<ffff8000106b5e40>] slab_debugfs_start+0x30/0x50 >>>>> slab_debugfs_start at /usr/src/linux-next/mm/slub.c:5831 >>>>> [<ffff8000107b3dbc>] seq_read_iter+0x214/0xd50 >>>>> [<ffff8000107b4b84>] seq_read+0x28c/0x418 >>>>> [<ffff8000109560b4>] full_proxy_read+0xdc/0x148 >>>>> [<ffff800010738f24>] vfs_read+0x104/0x340 >>>>> [<ffff800010739ee0>] ksys_read+0xf8/0x1e0 >>>>> [<ffff80001073a03c>] __arm64_sys_read+0x74/0xa8 >>>>> [<ffff8000100358d4>] invoke_syscall.constprop.0+0xdc/0x1d8 >>>>> [<ffff800010035ab4>] do_el0_svc+0xe4/0x298 >>>>> [<ffff800011138528>] el0_svc+0x20/0x30 >>>>> [<ffff800011138b08>] el0t_64_sync_handler+0xb0/0xb8 >>>>> [<ffff80001001259c>] el0t_64_sync+0x178/0x17c >>>>> >>>> >>>> I think the problem is here: >>>> >>>>>> +static void slab_debugfs_stop(struct seq_file *seq, void *v) >>>>>> +{ >>>>>> + kfree(v); >>>>>> +} >>>>>> + >>>>>> +static void *slab_debugfs_next(struct seq_file *seq, void *v, loff_t *ppos) >>>>>> +{ >>>>>> + loff_t *spos = v; >>>>>> + struct loc_track *t = seq->private; >>>>>> + >>>>>> + if (*ppos < t->count) { >>>>>> + *ppos = ++*spos; >>>>>> + return spos; >>>>>> + } >>>>>> + *ppos = ++*spos; >>>>>> + return NULL; >>>>>> +} >>>> >>>> If we return NULL, then NULL is passed to slab_debugfs_stop and thus we don't >>>> kfree ppos. kfree(NULL) is silently ignored. >>>> >>> I think yes, if NULL passed to kfree, it simply do return. >>>> I think as we have private struct loc_track, we can add a pos field there and >>>> avoid the kmaloc/kfree altogether. >>>> >>> Hmm, yes we can add pos field "or" we can use argument "v" mean we can >>> update v with pos in ->next() and use in ->show() to avoid the leak >>> (kmalloc/kfree). >> >> Can you explain the "or" part more. It's exactly what we already do, no?I am thinking if we simplly do ppos return from slab_debugfs_start() and > in slab_debugfs_next() assign ppos to "v", update it and return if > records are there. something like below (approach 1): > ... > static void *slab_debugfs_next(struct seq_file *seq, void *v, loff_t *ppos) > { > ... > v = ppos; > if (*ppos < t->count) { > ++*ppos; > return v; > } > > ++*ppos; > return NULL; > } > ... > static void *slab_debugfs_start(struct seq_file *seq, loff_t *ppos) > { > return ppos; > } OK maybe that works too. Bonus points if some other code does that. And then it might be another reason to update the Documentation file. > ... > >> "v" as you said. The problem is, if next(); returns NULL, then stop() gets the >> NULL as "v". It's just what I see in the code of seq_read_iter() and traverse() >> in fs/seq_file.c. I don't see another way to say there are no more records to >> print - only to return NULL in next(). >> Ah, ok so we could maybe do the kfree() in next() then before returning NULL, >> which is the last moment we have the pointer. But really, if we already have a >> loc_track in private, why kmalloc an additional loff_t. >> > Yes, we can do kfree() before returning NULL, but better to add ppos in > lock_track. (approach 2) > >> Anyway it seems to me also that >> Documentation/filesystems/seq_file.rst should be updated, as the kfree() in >> stop() is exactly what it suggests, and it doesn't show how next() indicates >> that there are no more records by returning NULL, and what to do about kfree() then. > > Can you please suggest me which approach would be good to avoid the > leak?. I will update in next patch version. I guess your approach is simpler as we don't track two pos variables.