Re: [bug report] kmemleak in rdma_core observed during blktests nvme/rdma use siw

Hi,

On 4/8/24 14:03, Yi Zhang wrote:
Hi
I found the below kmemleak issue during blktests nvme/rdma on the
latest linux-rdma/for-next, please help check it and let me know if
you need any info/testing for it, thanks.

Could you share which test case caused the issue? I can't reproduce
it on a 6.9-rc3+ kernel (commit 586b5dfb51b) with the command below.

use_siw=1 nvme_trtype=rdma ./check nvme/

# dmesg | grep kmemleak
[   67.130652] kmemleak: Kernel memory leak detector initialized (mem pool available: 36041)
[   67.130728] kmemleak: Automatic memory scanning thread started
[ 1051.771867] kmemleak: 2 new suspected memory leaks (see /sys/kernel/debug/kmemleak)
[ 1832.796189] kmemleak: 8 new suspected memory leaks (see /sys/kernel/debug/kmemleak)
[ 2578.189075] kmemleak: 17 new suspected memory leaks (see /sys/kernel/debug/kmemleak)
[ 3330.710984] kmemleak: 4 new suspected memory leaks (see /sys/kernel/debug/kmemleak)

unreferenced object 0xffff88855da53400 (size 192):
   comm "rdma", pid 10630, jiffies 4296575922
   hex dump (first 32 bytes):
     37 00 00 00 00 00 00 00 c0 ff ff ff 1f 00 00 00  7...............
     10 34 a5 5d 85 88 ff ff 10 34 a5 5d 85 88 ff ff  .4.].....4.]....
   backtrace (crc 47f66721):
     [<ffffffff911251bd>] kmalloc_trace+0x30d/0x3b0
     [<ffffffffc2640ff7>] alloc_gid_entry+0x47/0x380 [ib_core]
     [<ffffffffc2642206>] add_modify_gid+0x166/0x930 [ib_core]

I guess add_modify_gid is called from config_non_roce_gid_cache; I'm not
sure why its return value isn't checked there.

It looks like put_gid_entry is called when add_modify_gid fails. That
triggers schedule_free_gid -> queue_work(ib_wq, &entry->del_work), and then
free_gid_work -> free_gid_entry_locked frees the storage asynchronously,
releasing the ndev reference via put_gid_ndev and then the entry itself.
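
To illustrate the asynchronous free described above, here is a minimal
user-space sketch in C. It is not the ib_core code: every *_model name, the
toy work queue and the freed counter are made up for illustration. It only
shows the general shape, where dropping the last reference queues a work item
and the storage is released later, once the queue is run.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy user-space model. All names are hypothetical, not ib_core's. */
struct work {
	void (*fn)(struct work *w);
	struct work *next;
};

struct workqueue {
	struct work *head;	/* pending work items */
	int pending;		/* number of items not yet run */
};

struct gid_entry_model {
	int refcount;
	struct work del_work;	/* deferred-free work item */
	struct workqueue *wq;
	int *freed_counter;	/* bumped when storage is really released */
};

#define container_of_model(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

static void queue_work_model(struct workqueue *wq, struct work *w)
{
	w->next = wq->head;
	wq->head = w;
	wq->pending++;
}

/* Runs every pending item; stands in for flushing the workqueue. */
static void flush_workqueue_model(struct workqueue *wq)
{
	while (wq->head) {
		struct work *w = wq->head;

		/* Unlink first: fn() frees the object that embeds w. */
		wq->head = w->next;
		wq->pending--;
		w->fn(w);
	}
}

static void free_entry_work(struct work *w)
{
	struct gid_entry_model *e =
		container_of_model(w, struct gid_entry_model, del_work);

	(*e->freed_counter)++;
	free(e);
}

static struct gid_entry_model *alloc_entry_model(struct workqueue *wq,
						 int *freed_counter)
{
	struct gid_entry_model *e = calloc(1, sizeof(*e));

	if (!e)
		return NULL;
	e->refcount = 1;
	e->wq = wq;
	e->freed_counter = freed_counter;
	e->del_work.fn = free_entry_work;
	return e;
}

/* Dropping the last reference only queues the free; it does not run it. */
static void put_entry_model(struct gid_entry_model *e)
{
	assert(e->refcount > 0);
	if (--e->refcount == 0)
		queue_work_model(e->wq, &e->del_work);
}
```

In this model, after put_entry_model() on the last reference the entry is
still allocated and one item is pending; the storage only goes away after
flush_workqueue_model() runs.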

     [<ffffffffc2643468>] ib_cache_update.part.0+0x6d8/0x910 [ib_core]
     [<ffffffffc2644e1a>] ib_cache_setup_one+0x24a/0x350 [ib_core]
     [<ffffffffc263949e>] ib_register_device+0x9e/0x3a0 [ib_core]
     [<ffffffffc2a3d389>] 0xffffffffc2a3d389
     [<ffffffffc2688cd8>] nldev_newlink+0x2b8/0x520 [ib_core]
     [<ffffffffc2645fe3>] rdma_nl_rcv_msg+0x2c3/0x520 [ib_core]
     [<ffffffffc264648c>] rdma_nl_rcv_skb.constprop.0.isra.0+0x23c/0x3a0 [ib_core]
     [<ffffffff9270e7b5>] netlink_unicast+0x445/0x710
     [<ffffffff9270f1f1>] netlink_sendmsg+0x761/0xc40
     [<ffffffff9249db29>] __sys_sendto+0x3a9/0x420
     [<ffffffff9249dc8c>] __x64_sys_sendto+0xdc/0x1b0
     [<ffffffff92db0ad3>] do_syscall_64+0x93/0x180
     [<ffffffff92e00126>] entry_SYSCALL_64_after_hwframe+0x71/0x79

If ib_cache_setup_one fails, maybe ib_cache_cleanup_one is needed, since it
flushes ib_wq to ensure the storage is freed. Could you try the change below?

--- a/drivers/infiniband/core/device.c
+++ b/drivers/infiniband/core/device.c
@@ -1388,7 +1388,7 @@ int ib_register_device(struct ib_device *device, const char *name,
        if (ret) {
                dev_warn(&device->dev,
                         "Couldn't set up InfiniBand P_Key/GID cache\n");
-               return ret;
+               goto cache_cleanup;
        }
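
For reference, the idea behind the change can be sketched in plain
user-space C. This is only a rough model under my own assumptions: the
*_model functions and the pending_frees counter are hypothetical stand-ins,
and only the cache_cleanup label name is taken from the diff above. The point
is that the failure path jumps to a cleanup step instead of returning
directly, so deferred frees are drained before the error is reported.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for the device state; not struct ib_device. */
struct device_model {
	int pending_frees;	/* deferred frees queued but not yet run */
	int cache_ready;
};

static int cache_setup_model(struct device_model *dev, int fail)
{
	if (fail) {
		/* The failed entry was put; its free is only queued. */
		dev->pending_frees++;
		return -ENOMEM;
	}
	dev->cache_ready = 1;
	return 0;
}

static void cache_cleanup_model(struct device_model *dev)
{
	/* Stands in for flushing the workqueue so queued frees run. */
	dev->pending_frees = 0;
	dev->cache_ready = 0;
}

static int register_device_model(struct device_model *dev, int fail_cache)
{
	int ret;

	ret = cache_setup_model(dev, fail_cache);
	if (ret)
		goto cache_cleanup;	/* was: return ret; */

	return 0;

cache_cleanup:
	cache_cleanup_model(dev);
	return ret;
}
```

With a plain "return ret" in the failure branch, pending_frees would stay at
1 in this model after a failed registration; with the goto it is back to 0.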

Thanks,
Guoqing



