# nvme connect to target
# nvme reset /dev/nvme0
# nvme disconnect-all
# sleep 10
# echo scan > /sys/kernel/debug/kmemleak
# sleep 60
# cat /sys/kernel/debug/kmemleak
Thanks, I was able to repro it with the above commands.
Still not clear where the leak is, but I do see some non-symmetric
code in the error flows that we need to fix, plus the keep-alive timing
movement.
It will take some time for me to debug this.
Can you reproduce it with the tcp transport as well?
Yes, it reproduces with nvme/tcp as well; here is the log:
unreferenced object 0xffff8881675f7000 (size 192):
comm "nvme", pid 3711, jiffies 4296033311 (age 2272.976s)
hex dump (first 32 bytes):
20 59 04 92 ff ff ff ff 00 00 da 13 81 88 ff ff Y..............
01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
backtrace:
[<00000000adbc7c81>] kmem_cache_alloc_trace+0x10e/0x220
[<00000000c04d85be>] blk_iolatency_init+0x4e/0x380
[<00000000897ffdaf>] blkcg_init_queue+0x12e/0x610
[<000000002653e58d>] blk_alloc_queue+0x400/0x840
[<00000000fcb99f3c>] blk_mq_init_queue_data+0x6a/0x100
[<00000000486936b6>] nvme_tcp_setup_ctrl+0x70c/0xbe0 [nvme_tcp]
[<000000000bb29b26>] nvme_tcp_create_ctrl+0x953/0xbb4 [nvme_tcp]
[<00000000ca3d4e54>] nvmf_dev_write+0x44e/0xa39 [nvme_fabrics]
[<0000000056b79a25>] vfs_write+0x17e/0x9a0
[<00000000a5af6c18>] ksys_write+0xf1/0x1c0
[<00000000c035c128>] do_syscall_64+0x3a/0x80
[<000000000e5ea863>] entry_SYSCALL_64_after_hwframe+0x44/0xae
unreferenced object 0xffff8881675f7600 (size 192):
comm "nvme", pid 3711, jiffies 4296033320 (age 2272.967s)
hex dump (first 32 bytes):
20 59 04 92 ff ff ff ff 00 00 22 92 81 88 ff ff Y........".....
01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
backtrace:
[<00000000adbc7c81>] kmem_cache_alloc_trace+0x10e/0x220
[<00000000c04d85be>] blk_iolatency_init+0x4e/0x380
[<00000000897ffdaf>] blkcg_init_queue+0x12e/0x610
[<000000002653e58d>] blk_alloc_queue+0x400/0x840
[<00000000fcb99f3c>] blk_mq_init_queue_data+0x6a/0x100
[<000000006ca5f9f6>] nvme_tcp_setup_ctrl+0x772/0xbe0 [nvme_tcp]
[<000000000bb29b26>] nvme_tcp_create_ctrl+0x953/0xbb4 [nvme_tcp]
[<00000000ca3d4e54>] nvmf_dev_write+0x44e/0xa39 [nvme_fabrics]
[<0000000056b79a25>] vfs_write+0x17e/0x9a0
[<00000000a5af6c18>] ksys_write+0xf1/0x1c0
[<00000000c035c128>] do_syscall_64+0x3a/0x80
[<000000000e5ea863>] entry_SYSCALL_64_after_hwframe+0x44/0xae
unreferenced object 0xffff8891fb6a3600 (size 192):
comm "nvme", pid 3711, jiffies 4296033511 (age 2272.776s)
hex dump (first 32 bytes):
20 59 04 92 ff ff ff ff 00 00 5c 1d 81 88 ff ff Y........\.....
01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
backtrace:
[<00000000adbc7c81>] kmem_cache_alloc_trace+0x10e/0x220
[<00000000c04d85be>] blk_iolatency_init+0x4e/0x380
[<00000000897ffdaf>] blkcg_init_queue+0x12e/0x610
[<000000002653e58d>] blk_alloc_queue+0x400/0x840
[<00000000fcb99f3c>] blk_mq_init_queue_data+0x6a/0x100
[<000000004a3bf20e>] nvme_tcp_setup_ctrl.cold.57+0x868/0xa5d [nvme_tcp]
[<000000000bb29b26>] nvme_tcp_create_ctrl+0x953/0xbb4 [nvme_tcp]
[<00000000ca3d4e54>] nvmf_dev_write+0x44e/0xa39 [nvme_fabrics]
[<0000000056b79a25>] vfs_write+0x17e/0x9a0
[<00000000a5af6c18>] ksys_write+0xf1/0x1c0
[<00000000c035c128>] do_syscall_64+0x3a/0x80
[<000000000e5ea863>] entry_SYSCALL_64_after_hwframe+0x44/0xae
Looks like there is some asymmetry in blk_iolatency: it is initialized
when allocating a request queue, but exited only when deleting a gendisk.
In nvme we have request queues that will never have a gendisk corresponding
to them (like the admin queue), so their iolatency state is never freed.
Does this patch eliminate the issue?
--
diff --git a/block/blk-core.c b/block/blk-core.c
index 94bf37f8e61d..6ccc02a41f25 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -323,6 +323,7 @@ void blk_cleanup_queue(struct request_queue *q)
 	blk_queue_flag_set(QUEUE_FLAG_DEAD, q);
+	rq_qos_exit(q);
 	blk_sync_queue(q);
 	if (queue_is_mq(q)) {
 		blk_mq_cancel_work_sync(q);
diff --git a/block/genhd.c b/block/genhd.c
index 54f60ded2ee6..10ff0606c100 100644
--- a/block/genhd.c
+++ b/block/genhd.c
@@ -626,7 +626,6 @@ void del_gendisk(struct gendisk *disk)
 	blk_mq_freeze_queue_wait(q);
-	rq_qos_exit(q);
 	blk_sync_queue(q);
 	blk_flush_integrity();
 	/*
--