Hello Ming and Christoph

Issue with Infinibox storage
----------------------------
I actually discovered 2 issues here.

Issue 1

Kernels 5.15 to 5.18 inclusive recognize the discard support on the
Infinibox device, but they fail in the nvme_setup_discard function call.

[ 339.591118] ------------[ cut here ]------------
[ 339.591134] WARNING: CPU: 3 PID: 32 at drivers/nvme/host/core.c:868 nvme_setup_discard+0x16e/0x1e0 [nvme_core]
[ 339.591349] CPU: 3 PID: 32 Comm: kworker/3:0H Not tainted 5.15.0 #1
[ 339.591404] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 11/12/2020
[ 339.591423] Workqueue: kblockd blk_mq_run_work_fn
[ 339.591458] RIP: 0010:nvme_setup_discard+0x16e/0x1e0 [nvme_core]
[ 339.591475] Code: 38 48 8b b8 48 0b 00 00 48 2b 3d 2d 69 43 d3 48 c1 ff 06 48 c1 e7 0c 48 03 3d 2e 69 43 d3 48 89 f8 48 85 f6 0f 85 dd fe ff ff <0f> 0b ba 00 00 00 80 48 01 d7 72 52 48 c7 c2 00 00 00 80 48 2b 15
[ 339.591505] RSP: 0018:ffffbacb0052fcf8 EFLAGS: 00010212
[ 339.591516] RAX: ffff93798b67e000 RBX: ffff937994565780 RCX: ffff937a0b67e000
[ 339.591529] RDX: 0000000000000020 RSI: 0000000000000000 RDI: ffff93798b67e000
[ 339.591541] RBP: ffff93799452f1b0 R08: ffff93798b67e000 R09: 00000000014000c0
[ 339.591553] R10: 0000000000000800 R11: 0000000000000000 R12: ffff9379a0df1000
[ 339.591566] R13: 0000000000000001 R14: ffffbacb0052fde0 R15: ffff9379a0df1000
[ 339.591578] FS: 0000000000000000(0000) GS:ffff9379b9ec0000(0000) knlGS:0000000000000000
[ 339.591602] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 339.591617] CR2: 00007f4b7792f000 CR3: 000000010dcf2003 CR4: 0000000000770ee0
[ 339.591641] PKRU: 55555554
[ 339.591648] Call Trace:
[ 339.591656]  nvme_setup_cmd+0xac/0x650 [nvme_core]
[ 339.591673]  nvme_tcp_queue_rq+0x6a/0x390 [nvme_tcp]
[ 339.591685]  blk_mq_dispatch_rq_list+0x139/0x810
[ 339.591698]  ? blk_mq_flush_busy_ctxs+0xf9/0x120
[ 339.591708]  __blk_mq_sched_dispatch_requests+0x135/0x140
[ 339.591720]  blk_mq_sched_dispatch_requests+0x30/0x60
[ 339.591746]  __blk_mq_run_hw_queue+0x2b/0x60
[ 339.591757]  process_one_work+0x1cb/0x370
[ 339.592339]  worker_thread+0x30/0x380
[ 339.593200]  ? process_one_work+0x370/0x370
[ 339.593990]  kthread+0x118/0x140
[ 339.594710]  ? set_kthread_struct+0x40/0x40
[ 339.595267]  ret_from_fork+0x1f/0x30
[ 339.596077] ---[ end trace 547450bc9931a628 ]---
[ 339.596806] blk_update_request: I/O error, dev nvme1c1n1, sector 20971712 op 0x3:(DISCARD) flags 0x2004000 phys_seg 1 prio class 0
[ 339.741735] blk_update_request: I/O error, dev nvme1c1n1, sector 21037248 op 0x3:(DISCARD) flags 0x2004000 phys_seg 1 prio class 0
[ 339.743952] blk_update_request: I/O error, dev nvme1c1n1, sector 21102784 op 0x3:(DISCARD) flags 0x2004000 phys_seg 1 prio class 0
[ 339.745480] blk_update_request: I/O error, dev nvme1c1n1, sector 21168320 op 0x3:(DISCARD) flags 0x2004000 phys_seg 1 prio class 0
[ 339.746425] blk_update_request: I/O error, dev nvme1c1n1, sector 21233856 op 0x3:(DISCARD) flags 0x2004000 phys_seg 1 prio class 0
[ 339.747150] blk_update_request: I/O error, dev nvme1c1n1, sector 21299392 op 0x3:(DISCARD) flags 0x2004000 phys_seg 1 prio class 0
[ 339.747948] blk_update_request: I/O error, dev nvme1c1n1, sector 21364928 op 0x3:(DISCARD) flags 0x2000000 phys_seg 1 prio class 0

Issue 2

I ran into this while trying to narrow down Issue 1. Kernels 5.19 and
higher (6.3 included) no longer support discard on the Infinibox device
and log the message below, so I cannot run the test for the discard
issue on them.

[ 35.989809] nvme nvme1: new ctrl: NQN "nqn.2020-01.com.infinidat:36000-subsystem-696", addr 192.168.1.2:4420
[ 64.810437] XFS (nvme1n1): mounting with "discard" option, but the device does not support discard
[ 64.812298] XFS (nvme1n1): Mounting V5 Filesystem 6763a33f-18cc-4a26-894b-8b0f8d79a98a

I then bisected between 5.18 and 5.19 to this commit:

1a86924e4f464757546d7f7bdc469be237918395 is the first bad commit
commit 1a86924e4f464757546d7f7bdc469be237918395
Author: Tom Yan <tom.ty89@xxxxxxxxx>
Date:   Fri Apr 29 12:52:43 2022 +0800

    nvme: fix interpretation of DMRSL

    DMRSLl is in the unit of logical blocks, while max_discard_sectors is
    in the unit of "linux sector".

    Signed-off-by: Tom Yan <tom.ty89@xxxxxxxxx>
    Signed-off-by: Christoph Hellwig <hch@xxxxxx>

 drivers/nvme/host/core.c | 6 ++++--
 drivers/nvme/host/nvme.h | 1 +
 2 files changed, 5 insertions(+), 2 deletions(-)

Note that Infinidat mentioned the following in the case they logged
with us. They say they fully adhere to TP4040 MDTS.

Towards NVMe-oF 2.0 specification, TP4040 - Max Data Transfer for
non-IO Commands (MDTS) was released with additional fields to control
these parameters. These parameters are supported in kernel versions
5.15 and above.

**** Our storage target will reply with 0 for bit 2 of the ONCS,
indicating UNMAP is supported based on the DMRL, DMRSL, and DMSL values.
(older kernels will interpret these values as UNMAP NOT SUPPORTED)

Please let me know your thoughts on both issues.

Regards
Laurence Oberman
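
P.S. For reference, here is a minimal sketch of the unit conversion the
bisected commit message describes: DMRSL is counted in logical blocks,
while max_discard_sectors is counted in 512-byte Linux sectors. The
helper name dmrsl_to_sectors and its signature are my own, purely for
illustration. This is not the kernel's code, and what the driver does
when the converted value does not fit is outside this sketch.

```c
#include <stdbool.h>
#include <stdint.h>

#define SECTOR_SHIFT 9 /* a Linux sector is 512 bytes */

/*
 * Hypothetical helper (not kernel code): convert a DMRSL value, given in
 * logical blocks, into 512-byte sectors for a namespace whose logical
 * block size is (1 << lba_shift) bytes, e.g. 9 for 512 B or 12 for 4 KiB.
 *
 * Returns false if lba_shift is out of range or if the converted value
 * does not fit in 32 bits; *sectors is left untouched in that case.
 */
static bool dmrsl_to_sectors(uint32_t dmrsl, unsigned int lba_shift,
                             uint32_t *sectors)
{
    if (lba_shift < SECTOR_SHIFT || lba_shift > 31)
        return false;

    /* Widen first so the shift cannot overflow before the range check. */
    uint64_t converted = (uint64_t)dmrsl << (lba_shift - SECTOR_SHIFT);

    if (converted > UINT32_MAX)
        return false;

    *sectors = (uint32_t)converted;
    return true;
}
```

With 4096-byte logical blocks (lba_shift 12), a DMRSL of 8 corresponds
to 64 sectors, whereas a very large DMRSL such as 0xffffffff no longer
fits in a 32-bit sector count once converted.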