On Tue, Oct 1, 2024 at 9:57 PM Nilay Shroff <nilay@xxxxxxxxxxxxx> wrote:
>
> On 9/29/24 20:10, Yi Zhang wrote:
> > Hello
> >
> > The kmemleak issue was easily triggered during blktests block/001 on
> > the latest linux-block/for-next,
> > please help check it and let me know if you need any info/test for it, thanks.
> >
> > $ cat /sys/kernel/debug/kmemleak
> > unreferenced object 0xffff888cc28666c0 (size 32):
> >   comm "modprobe", pid 11054, jiffies 4305180646
> >   hex dump (first 32 bytes):
> >     73 64 65 62 75 67 5f 71 75 65 75 65 64 5f 63 6d  sdebug_queued_cm
> >     64 00 a6 02 7d c0 f5 04 00 00 00 00 00 00 00 00  d...}...........
> >   backtrace (crc 6250ed84):
> >     [<ffffffffb378fa9b>] __kmalloc_node_track_caller_noprof+0x36b/0x440
> >     [<ffffffffb36513a6>] kstrdup+0x36/0x60
> >     [<ffffffffb5555d13>] kobject_set_name_vargs+0x43/0x120
> >     [<ffffffffb555639b>] kobject_init_and_add+0xdb/0x160
> >     [<ffffffffb378f6c4>] sysfs_slab_add+0x194/0x1f0
> >     [<ffffffffb3792286>] do_kmem_cache_create+0x256/0x2c0
> >     [<ffffffffb367436f>] __kmem_cache_create_args+0x20f/0x310
> >     [<ffffffffc36f45b8>] null_init+0x5a8/0xff0 [null_blk]
> >     [<ffffffffb2c03cec>] do_one_initcall+0x11c/0x5c0
> >     [<ffffffffb30da9e8>] do_init_module+0x238/0x790
> >     [<ffffffffb30de801>] init_module_from_file+0xd1/0x130
> >     [<ffffffffb30deaa0>] idempotent_init_module+0x230/0x770
> >     [<ffffffffb30df25e>] __x64_sys_finit_module+0xbe/0x130
> >     [<ffffffffb56bba12>] do_syscall_64+0x92/0x180
> >     [<ffffffffb580012f>] entry_SYSCALL_64_after_hwframe+0x76/0x7e
> > unreferenced object 0xffff888c82cf69c0 (size 32):
> >   comm "modprobe", pid 11104, jiffies 4305186132
> >   hex dump (first 32 bytes):
> >     73 64 65 62 75 67 5f 71 75 65 75 65 64 5f 63 6d  sdebug_queued_cm
> >     64 00 ef 42 3d 89 fa 04 40 68 1d 36 00 ea ff ff  d..B=...@h.6....
> >   backtrace (crc 46e1640c):
> >     [<ffffffffb378fa9b>] __kmalloc_node_track_caller_noprof+0x36b/0x440
> >     [<ffffffffb36513a6>] kstrdup+0x36/0x60
> >     [<ffffffffb5555d13>] kobject_set_name_vargs+0x43/0x120
> >     [<ffffffffb555639b>] kobject_init_and_add+0xdb/0x160
> >     [<ffffffffb378f6c4>] sysfs_slab_add+0x194/0x1f0
> >     [<ffffffffb3792286>] do_kmem_cache_create+0x256/0x2c0
> >     [<ffffffffb367436f>] __kmem_cache_create_args+0x20f/0x310
> >     [<ffffffffc36f65b8>] 0xffffffffc36f65b8
> >     [<ffffffffb2c03cec>] do_one_initcall+0x11c/0x5c0
> >     [<ffffffffb30da9e8>] do_init_module+0x238/0x790
> >     [<ffffffffb30de801>] init_module_from_file+0xd1/0x130
> >     [<ffffffffb30deaa0>] idempotent_init_module+0x230/0x770
> >     [<ffffffffb30df25e>] __x64_sys_finit_module+0xbe/0x130
> >     [<ffffffffb56bba12>] do_syscall_64+0x92/0x180
> >     [<ffffffffb580012f>] entry_SYSCALL_64_after_hwframe+0x76/0x7e
> > unreferenced object 0xffff888c49ee9700 (size 32):
> >   comm "modprobe", pid 12268, jiffies 4305219508
> >   hex dump (first 32 bytes):
> >     73 64 65 62 75 67 5f 71 75 65 75 65 64 5f 63 6d  sdebug_queued_cm
> >     64 00 ce 89 f6 a8 04 c4 00 00 00 00 00 00 00 00  d...............
> >   backtrace (crc 267cbe53):
> >     [<ffffffffb378fa9b>] __kmalloc_node_track_caller_noprof+0x36b/0x440
> >     [<ffffffffb36513a6>] kstrdup+0x36/0x60
> >     [<ffffffffb5555d13>] kobject_set_name_vargs+0x43/0x120
> >     [<ffffffffb555639b>] kobject_init_and_add+0xdb/0x160
> >     [<ffffffffb378f6c4>] sysfs_slab_add+0x194/0x1f0
> >     [<ffffffffb3792286>] do_kmem_cache_create+0x256/0x2c0
> >     [<ffffffffb367436f>] __kmem_cache_create_args+0x20f/0x310
> >     [<ffffffffc36f65b8>] 0xffffffffc36f65b8
> >     [<ffffffffb2c03cec>] do_one_initcall+0x11c/0x5c0
> >     [<ffffffffb30da9e8>] do_init_module+0x238/0x790
> >     [<ffffffffb30de801>] init_module_from_file+0xd1/0x130
> >     [<ffffffffb30deaa0>] idempotent_init_module+0x230/0x770
> >     [<ffffffffb30df25e>] __x64_sys_finit_module+0xbe/0x130
> >     [<ffffffffb56bba12>] do_syscall_64+0x92/0x180
> >     [<ffffffffb580012f>] entry_SYSCALL_64_after_hwframe+0x76/0x7e
> >
> > --
> > Best Regards,
> > Yi Zhang
> >
> Apparently, the memory leak is detected in mm/slab code. I believe there's no issue in the
> block layer code. After further debugging I found that commit 4ec10268ed98 ("mm, slab:
> unlink slabinfo, sysfs and debugfs immediately") caused the observed symptom. That fix has
> a subtle side effect: while destroying the kmem cache, the code path never reaches
> sysfs_slab_release(), even though SLAB_SUPPORTS_SYSFS is defined and the slab state is FULL.
> As a result, the kobject defined for the kmem cache is never released and the associated
> memory is leaked.
>
> The issue is with the use of the __is_defined() macro in kmem_cache_release(). The
> __is_defined() macro expands to __take_second_arg(arg1_or_junk 1, 0). If the macro being
> tested is defined to 1, "arg1_or_junk" becomes "0," so the call expands to
> __take_second_arg(0, 1, 0) and returns 1. If the macro being tested is NOT defined to 1
> (for example, defined without a value or not defined at all), "arg1_or_junk" is junk, so
> the call expands to __take_second_arg(... 1, 0) and returns 0.
>
> In this particular issue, SLAB_SUPPORTS_SYSFS is defined without any associated value, which
> causes __is_defined(SLAB_SUPPORTS_SYSFS) to always evaluate to 0, so sysfs_slab_release() is
> never invoked.
>
> The following patch should fix this:

Thanks for the debugging and the fix. I confirmed that the issue is fixed with this change.

> diff --git a/mm/slab.h b/mm/slab.h
> index f22fb760b286..3e0a08ea4c42 100644
> --- a/mm/slab.h
> +++ b/mm/slab.h
> @@ -310,7 +310,7 @@ struct kmem_cache {
>  };
>
>  #if defined(CONFIG_SYSFS) && !defined(CONFIG_SLUB_TINY)
> -#define SLAB_SUPPORTS_SYSFS
> +#define SLAB_SUPPORTS_SYSFS 1
>  void sysfs_slab_unlink(struct kmem_cache *s);
>  void sysfs_slab_release(struct kmem_cache *s);
>  #else
>
> I will post the above patch in mm/slab mailing list.
>
> Thanks,
> --Nilay
>

--
Best Regards,
Yi Zhang