Issue with discard with NVME and Infinibox Storage

Laurence Oberman <loberman@xxxxxxxxxx> · Mon, 03 Apr 2023 13:35:22 -0400

Hello Ming and Christoph

Issue with Infinibox storage
----------------------------
I have actually found two issues here.

Issue 1
Kernels 5.15 through 5.18 inclusive recognize discard support on the
Infinibox device, but discards fail with a warning in
nvme_setup_discard():

[  339.591118] ------------[ cut here ]------------
[  339.591134] WARNING: CPU: 3 PID: 32 at drivers/nvme/host/core.c:868 nvme_setup_discard+0x16e/0x1e0 [nvme_core]

[  339.591349] CPU: 3 PID: 32 Comm: kworker/3:0H Not tainted 5.15.0 #1
[  339.591404] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 11/12/2020
[  339.591423] Workqueue: kblockd blk_mq_run_work_fn
[  339.591458] RIP: 0010:nvme_setup_discard+0x16e/0x1e0 [nvme_core]
[  339.591475] Code: 38 48 8b b8 48 0b 00 00 48 2b 3d 2d 69 43 d3 48 c1 ff 06 48 c1 e7 0c 48 03 3d 2e 69 43 d3 48 89 f8 48 85 f6 0f 85 dd fe ff ff <0f> 0b ba 00 00 00 80 48 01 d7 72 52 48 c7 c2 00 00 00 80 48 2b 15
[  339.591505] RSP: 0018:ffffbacb0052fcf8 EFLAGS: 00010212
[  339.591516] RAX: ffff93798b67e000 RBX: ffff937994565780 RCX: ffff937a0b67e000
[  339.591529] RDX: 0000000000000020 RSI: 0000000000000000 RDI: ffff93798b67e000
[  339.591541] RBP: ffff93799452f1b0 R08: ffff93798b67e000 R09: 00000000014000c0
[  339.591553] R10: 0000000000000800 R11: 0000000000000000 R12: ffff9379a0df1000
[  339.591566] R13: 0000000000000001 R14: ffffbacb0052fde0 R15: ffff9379a0df1000
[  339.591578] FS:  0000000000000000(0000) GS:ffff9379b9ec0000(0000) knlGS:0000000000000000
[  339.591602] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  339.591617] CR2: 00007f4b7792f000 CR3: 000000010dcf2003 CR4: 0000000000770ee0
[  339.591641] PKRU: 55555554
[  339.591648] Call Trace:
[  339.591656]  nvme_setup_cmd+0xac/0x650 [nvme_core]
[  339.591673]  nvme_tcp_queue_rq+0x6a/0x390 [nvme_tcp]
[  339.591685]  blk_mq_dispatch_rq_list+0x139/0x810
[  339.591698]  ? blk_mq_flush_busy_ctxs+0xf9/0x120
[  339.591708]  __blk_mq_sched_dispatch_requests+0x135/0x140
[  339.591720]  blk_mq_sched_dispatch_requests+0x30/0x60
[  339.591746]  __blk_mq_run_hw_queue+0x2b/0x60
[  339.591757]  process_one_work+0x1cb/0x370
[  339.592339]  worker_thread+0x30/0x380
[  339.593200]  ? process_one_work+0x370/0x370
[  339.593990]  kthread+0x118/0x140
[  339.594710]  ? set_kthread_struct+0x40/0x40
[  339.595267]  ret_from_fork+0x1f/0x30
[  339.596077] ---[ end trace 547450bc9931a628 ]---
[  339.596806] blk_update_request: I/O error, dev nvme1c1n1, sector 20971712 op 0x3:(DISCARD) flags 0x2004000 phys_seg 1 prio class 0
[  339.741735] blk_update_request: I/O error, dev nvme1c1n1, sector 21037248 op 0x3:(DISCARD) flags 0x2004000 phys_seg 1 prio class 0
[  339.743952] blk_update_request: I/O error, dev nvme1c1n1, sector 21102784 op 0x3:(DISCARD) flags 0x2004000 phys_seg 1 prio class 0
[  339.745480] blk_update_request: I/O error, dev nvme1c1n1, sector 21168320 op 0x3:(DISCARD) flags 0x2004000 phys_seg 1 prio class 0
[  339.746425] blk_update_request: I/O error, dev nvme1c1n1, sector 21233856 op 0x3:(DISCARD) flags 0x2004000 phys_seg 1 prio class 0
[  339.747150] blk_update_request: I/O error, dev nvme1c1n1, sector 21299392 op 0x3:(DISCARD) flags 0x2004000 phys_seg 1 prio class 0
[  339.747948] blk_update_request: I/O error, dev nvme1c1n1, sector 21364928 op 0x3:(DISCARD) flags 0x2000000 phys_seg 1 prio class 0
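
As a quick sanity check on the errors above, the following sketch computes
the spacing between the failing start sectors. It only uses the seven sector
numbers from the log and assumes the usual 512-byte block-layer sector; the
helper names are made up for illustration. It says nothing about why the
discards fail, only how far apart the failed requests start.

```python
SECTOR_SIZE = 512  # bytes per Linux block-layer sector

# Start sectors copied from the seven DISCARD I/O error lines above.
failed_sectors = [
    20971712, 21037248, 21102784, 21168320,
    21233856, 21299392, 21364928,
]


def strides(sectors):
    """Return the distance in sectors between consecutive entries."""
    return [b - a for a, b in zip(sectors, sectors[1:])]


def stride_mib(sectors):
    """Return the common stride in MiB, or None if the strides differ."""
    unique = set(strides(sectors))
    if len(unique) != 1:
        return None
    return unique.pop() * SECTOR_SIZE / (1024 * 1024)


if __name__ == "__main__":
    print(strides(failed_sectors))
    print(stride_mib(failed_sectors))
```

Each failed discard starts 65536 sectors (32 MiB) after the previous one.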

Issue 2
While trying to narrow this down, I found that kernels 5.19 and later
(6.3 included) no longer report discard support on the Infinibox device
and log the message below, so I cannot run the test for the discard
issue on those kernels:

[   35.989809] nvme nvme1: new ctrl: NQN "nqn.2020-01.com.infinidat:36000-subsystem-696", addr 192.168.1.2:4420
[   64.810437] XFS (nvme1n1): mounting with "discard" option, but the device does not support discard
[   64.812298] XFS (nvme1n1): Mounting V5 Filesystem 6763a33f-18cc-4a26-894b-8b0f8d79a98a

I then bisected between 5.18 and 5.19, which led to this commit:

1a86924e4f464757546d7f7bdc469be237918395 is the first bad commit
commit 1a86924e4f464757546d7f7bdc469be237918395
Author: Tom Yan <tom.ty89@xxxxxxxxx>
Date:   Fri Apr 29 12:52:43 2022 +0800

    nvme: fix interpretation of DMRSL

    DMRSL is in the unit of logical blocks, while max_discard_sectors is
    in the unit of "linux sector".

    Signed-off-by: Tom Yan <tom.ty89@xxxxxxxxx>
    Signed-off-by: Christoph Hellwig <hch@xxxxxx>

 drivers/nvme/host/core.c | 6 ++++--
 drivers/nvme/host/nvme.h | 1 +
 2 files changed, 5 insertions(+), 2 deletions(-)
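
The unit difference that the commit message describes can be illustrated
with a small sketch. The shift mirrors the arithmetic of the
nvme_lba_to_sect()/nvme_sect_to_lba() helpers in drivers/nvme/host/nvme.h;
it is not the patch itself. The DMRSL value of 2048 and the 4096-byte
logical block size are made-up example values, not values reported by the
Infinibox.

```python
SECTOR_SHIFT = 9  # a Linux block-layer sector is always 512 bytes


def lba_to_sect(lba: int, lba_shift: int) -> int:
    """Convert a count of logical blocks to 512-byte sectors."""
    return lba << (lba_shift - SECTOR_SHIFT)


def sect_to_lba(sector: int, lba_shift: int) -> int:
    """Convert a count of 512-byte sectors to logical blocks."""
    return sector >> (lba_shift - SECTOR_SHIFT)


if __name__ == "__main__":
    dmrsl = 2048  # hypothetical DMRSL, in logical blocks
    # 512-byte logical blocks (lba_shift = 9): both units coincide.
    print(lba_to_sect(dmrsl, 9))
    # 4096-byte logical blocks (lba_shift = 12): 8 sectors per block.
    print(lba_to_sect(dmrsl, 12))
```

With 512-byte logical blocks the two units are identical, so treating DMRSL
as a sector count only changes the limit on namespaces with larger blocks.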

Note that Infinidat mentioned the following in the case they logged with
us. They say they fully adhere to TP4040 MDTS.
Towards NVMe-oF 2.0 specification, TP4040 - Max Data Transfer for
non-IO Commands (MDTS) was released with additional fields to control
these parameters.
These parameters are supported in kernel versions 5.15 and above.  ****

Our storage target will reply with 0 for bit 2 of the ONCS, indicating
UNMAP is supported based on the DMRL, DMRSL, and DMSL values. 
(older kernels will interpret these values as UNMAP NOT SUPPORTED)
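
For reference, the ONCS bit that the vendor statement refers to can be
tested as in the sketch below. Bit 2 of ONCS is the Dataset Management
(DSM) bit, which the kernel names NVME_CTRL_ONCS_DSM. The ONCS values used
here are made-up examples, not values read from the Infinibox, and the
sketch shows only the bit test, not how any particular kernel version
combines it with DMRL, DMRSL and DMSL.

```python
NVME_CTRL_ONCS_DSM = 1 << 2  # ONCS bit 2: Dataset Management supported


def oncs_advertises_dsm(oncs: int) -> bool:
    """Return True if the ONCS field has the Dataset Management bit set."""
    return bool(oncs & NVME_CTRL_ONCS_DSM)


if __name__ == "__main__":
    # Hypothetical ONCS values for illustration only.
    for oncs in (0x0000, 0x0004, 0x005B):
        print(hex(oncs), oncs_advertises_dsm(oncs))
```

A target that clears this bit relies on the host deriving discard support
from the DMRL, DMRSL and DMSL fields instead, which is what the vendor
describes.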

Please let me know your thoughts on both issues.

Regards
Laurence Oberman