On 3/9/2022 12:59 AM, Yi Zhang wrote:
On Tue, Mar 8, 2022 at 11:51 PM Max Gurtovoy <mgurtovoy@xxxxxxxxxx> wrote:
Hi Yi Zhang,
Please send the commands to repro.
I ran the following but could not reproduce it:
for i in `seq 100`; do echo $i && cat /sys/kernel/debug/kmemleak &&
echo clear > /sys/kernel/debug/kmemleak && nvme reset /dev/nvme2 &&
sleep 5 && echo scan > /sys/kernel/debug/kmemleak ; done
Hi Max
Sorry, I should have added more details when I reported it.
The kmemleak was observed while I was reproducing the "nvme reset" timeout
issue we discussed before [1]; the commands I used are in [2].
[1]
https://lore.kernel.org/linux-nvme/CAHj4cs_ir917u7Up5PBfwWpZtnVLey69pXXNjFNAjbqQ5vwU0w@mail.gmail.com/T/#m5e6dcc434fc1925b18047c348226cfbc48ffbd14
[2]
# nvme connect to target
# nvme reset /dev/nvme0
# nvme disconnect-all
# sleep 10
# echo scan > /sys/kernel/debug/kmemleak
# sleep 60
# cat /sys/kernel/debug/kmemleak
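For anyone who wants to rerun the sequence in [2], it can be wrapped in a small shell function. This is only a sketch under assumptions: the mail does not give the arguments passed to "nvme connect", so they must be supplied through CONNECT_ARGS. The names repro_once, RUN, DEV, KMEMLEAK and CONNECT_ARGS are made up for this example. It also assumes a kernel built with kmemleak support, debugfs mounted, and root privileges. By default it only prints the commands; run it with RUN set to the empty string to execute them.

```shell
#!/bin/sh
# Sketch of the reproduction steps quoted in [2]; not taken from the thread.
KMEMLEAK=${KMEMLEAK:-/sys/kernel/debug/kmemleak}
DEV=${DEV:-/dev/nvme0}
RUN=${RUN-echo}   # default: print commands only; export RUN="" to execute

repro_once() {
    # CONNECT_ARGS is deliberately unquoted so it splits into separate options.
    # The original mail does not list them; supply your own target details.
    $RUN nvme connect ${CONNECT_ARGS:-}
    $RUN nvme reset "$DEV"
    $RUN nvme disconnect-all
    $RUN sleep 10
    # Ask kmemleak to scan now instead of waiting for its periodic scan.
    $RUN sh -c "echo scan > $KMEMLEAK"
    $RUN sleep 60
    $RUN cat "$KMEMLEAK"
}
```

Calling repro_once with the defaults prints the seven commands in order, which is a cheap way to check the sequence before running it on a real host.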
Thanks, I was able to reproduce it with the above commands.
It is still not clear where the leak is, but I do see some asymmetric
code in the error flows that we need to fix, plus the keep-alive timing
movement.
It will take some time for me to debug this.
Can you reproduce it with the tcp transport as well?
Maybe add some debug prints to catch the exact flow in which it happens?
-Max.
On 2/21/2022 1:37 PM, Yi Zhang wrote:
Hello
The kmemleak report below was triggered when I ran nvme connect/reset/disconnect
operations on the latest 5.17.0-rc5; please check it.
# cat /sys/kernel/debug/kmemleak
unreferenced object 0xffff8883e398bc00 (size 192):
comm "nvme", pid 2632, jiffies 4295317772 (age 2951.476s)
hex dump (first 32 bytes):
80 50 84 a3 ff ff ff ff 70 d4 12 67 81 88 ff ff .P......p..g....
01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
backtrace:
[<00000000ecf84f29>] kmem_cache_alloc_trace+0x10e/0x220
[<0000000099bbcbaa>] blk_iolatency_init+0x4e/0x380
[<00000000e7a59176>] blkcg_init_queue+0x12e/0x610
[<00000000aade682c>] blk_alloc_queue+0x400/0x840
[<000000007ed43824>] blk_mq_init_queue_data+0x6a/0x100
[<00000000cbff6d39>] nvme_rdma_setup_ctrl+0x4ca/0x15f0 [nvme_rdma]
[<00000000a309d26c>] nvme_rdma_create_ctrl+0x7e5/0xa9f [nvme_rdma]
[<000000007d8b5cca>] nvmf_dev_write+0x44e/0xa39 [nvme_fabrics]
[<0000000031d8624b>] vfs_write+0x17e/0x9a0
[<00000000471d7945>] ksys_write+0xf1/0x1c0
[<00000000a963bc79>] do_syscall_64+0x3a/0x80
[<0000000005154fc2>] entry_SYSCALL_64_after_hwframe+0x44/0xae
unreferenced object 0xffff8883e398a700 (size 192):
comm "nvme", pid 2632, jiffies 4295317782 (age 2951.466s)
hex dump (first 32 bytes):
80 50 84 a3 ff ff ff ff 60 c8 12 67 81 88 ff ff .P......`..g....
01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
backtrace:
[<00000000ecf84f29>] kmem_cache_alloc_trace+0x10e/0x220
[<0000000099bbcbaa>] blk_iolatency_init+0x4e/0x380
[<00000000e7a59176>] blkcg_init_queue+0x12e/0x610
[<00000000aade682c>] blk_alloc_queue+0x400/0x840
[<000000007ed43824>] blk_mq_init_queue_data+0x6a/0x100
[<000000004f80b965>] nvme_rdma_setup_ctrl+0xf37/0x15f0 [nvme_rdma]
[<00000000a309d26c>] nvme_rdma_create_ctrl+0x7e5/0xa9f [nvme_rdma]
[<000000007d8b5cca>] nvmf_dev_write+0x44e/0xa39 [nvme_fabrics]
[<0000000031d8624b>] vfs_write+0x17e/0x9a0
[<00000000471d7945>] ksys_write+0xf1/0x1c0
[<00000000a963bc79>] do_syscall_64+0x3a/0x80
[<0000000005154fc2>] entry_SYSCALL_64_after_hwframe+0x44/0xae
unreferenced object 0xffff8894253d9d00 (size 192):
comm "nvme", pid 2632, jiffies 4295331915 (age 2937.333s)
hex dump (first 32 bytes):
80 50 84 a3 ff ff ff ff 80 e0 12 67 81 88 ff ff .P.........g....
01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
backtrace:
[<00000000ecf84f29>] kmem_cache_alloc_trace+0x10e/0x220
[<0000000099bbcbaa>] blk_iolatency_init+0x4e/0x380
[<00000000e7a59176>] blkcg_init_queue+0x12e/0x610
[<00000000aade682c>] blk_alloc_queue+0x400/0x840
[<000000007ed43824>] blk_mq_init_queue_data+0x6a/0x100
[<000000009f9abba5>] nvme_rdma_setup_ctrl.cold.70+0x5ee/0xb01 [nvme_rdma]
[<00000000a309d26c>] nvme_rdma_create_ctrl+0x7e5/0xa9f [nvme_rdma]
[<000000007d8b5cca>] nvmf_dev_write+0x44e/0xa39 [nvme_fabrics]
[<0000000031d8624b>] vfs_write+0x17e/0x9a0
[<00000000471d7945>] ksys_write+0xf1/0x1c0
[<00000000a963bc79>] do_syscall_64+0x3a/0x80
[<0000000005154fc2>] entry_SYSCALL_64_after_hwframe+0x44/0xae
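The three objects above share the same allocation path and differ only in the offset inside nvme_rdma_setup_ctrl, which suggests three distinct call sites leading to blk_mq_init_queue_data. A report like this can be condensed to one line per object to make that easier to see. The helper below is an illustrative sketch, and summarize_kmemleak is a made-up name. It assumes the report format shown above, where module frames end in "[module_name]".

```shell
#!/bin/sh
# summarize_kmemleak: read a kmemleak report on stdin and print, for each
# leaked object, its size and the first backtrace frame that lives in a module.
summarize_kmemleak() {
    awk '
        /^unreferenced object/ {
            size = $5
            sub(/\):$/, "", size)   # "192):" becomes "192"
            seen = 0
            next
        }
        # Backtrace frames start with "[<addr>]"; module frames end in "[name]".
        /^[ \t]*\[<[0-9a-f]+>\].*\[[A-Za-z0-9_]+\]$/ && !seen {
            print "size=" size, $2, $3
            seen = 1
        }
    '
}
```

Typical use would be "cat /sys/kernel/debug/kmemleak | summarize_kmemleak", optionally followed by "sort | uniq -c" to count objects per call site.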