On 12/11/21 5:01 AM, Yi Zhang wrote:
On Fri, Jun 25, 2021 at 12:14 AM Yi Zhang <yi.zhang@xxxxxxxxxx> wrote:
On Thu, Jun 24, 2021 at 5:32 AM Sagi Grimberg <sagi@xxxxxxxxxxx> wrote:
Hello
Gentle ping here, this issue still exists on the latest 5.13-rc7
# time nvme reset /dev/nvme0
real 0m12.636s
user 0m0.002s
sys 0m0.005s
# time nvme reset /dev/nvme0
real 0m12.641s
user 0m0.000s
sys 0m0.007s
Strange that even normal resets take so long...
What device are you using?
Hi Sagi
Here is the device info:
Mellanox Technologies MT27700 Family [ConnectX-4]
# time nvme reset /dev/nvme0
real 1m16.133s
user 0m0.000s
sys 0m0.007s
There seems to be a spurious command timeout here, but maybe that is
because the queues take so long to connect that the target's
keep-alive timer expires.
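To spell this out: the target arms a per-controller keep-alive timer for
KATO seconds, and if it fires before the host has sent a keep-alive
command, the controller is fatally torn down, taking the half-connected
queues with it. Roughly this expiry path (a sketch only, using the
drivers/nvme/target field names, not the exact code):
--
/*
 * Sketch of the target-side keep-alive expiry (illustrative, not the
 * exact nvmet code): ka_work is a per-controller delayed work armed
 * for KATO seconds; if it fires because no keep-alive arrived in time,
 * the controller is torn down, killing any queues still connecting.
 */
static void ka_timer_expired_sketch(struct work_struct *work)
{
        struct nvmet_ctrl *ctrl = container_of(to_delayed_work(work),
                                               struct nvmet_ctrl, ka_work);

        pr_err("ctrl %d keep-alive timer (%d seconds) expired!\n",
               ctrl->cntlid, ctrl->kato);

        /* Fatal error: disconnects all queues of this controller. */
        nvmet_ctrl_fatal_error(ctrl);
}
--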
Does this patch help?
The issue still exists; let me know if you need more testing for it. :)
Hi Sagi
Ping, this issue can still be reproduced on the latest
linux-block/for-next. Do you have a chance to recheck it? Thanks.
Can you check if it happens with the below patch:
--
diff --git a/drivers/nvme/target/fabrics-cmd.c b/drivers/nvme/target/fabrics-cmd.c
index f91a56180d3d..6e5aadfb07a0 100644
--- a/drivers/nvme/target/fabrics-cmd.c
+++ b/drivers/nvme/target/fabrics-cmd.c
@@ -191,6 +191,14 @@ static u16 nvmet_install_queue(struct nvmet_ctrl *ctrl, struct nvmet_req *req)
 		}
 	}
 
+	/*
+	 * Controller establishment flow may take some time, and the host may not
+	 * send us keep-alive during this period, hence reset the
+	 * traffic based keep-alive timer so we don't trigger a
+	 * controller teardown as a result of a keep-alive expiration.
+	 */
+	ctrl->reset_tbkas = true;
+
 	return 0;
 
 err:
--
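The idea is that the keep-alive work consumes the flag: if reset_tbkas
was set since the last expiry (by traffic, or with this patch by a
queue installation), the work re-arms itself instead of declaring a
fatal error. A simplified sketch of that consumer side (illustrative,
not necessarily the exact upstream nvmet_keep_alive_timer()):
--
/*
 * Sketch of how the flag set in nvmet_install_queue() is consumed by
 * the keep-alive work (simplified): a set reset_tbkas is treated as
 * implicit traffic, so the timer is re-armed rather than tearing the
 * controller down.
 */
static void ka_timer_sketch(struct work_struct *work)
{
        struct nvmet_ctrl *ctrl = container_of(to_delayed_work(work),
                                               struct nvmet_ctrl, ka_work);
        bool reset_tbkas = ctrl->reset_tbkas;

        ctrl->reset_tbkas = false;
        if (reset_tbkas) {
                /* A queue was just installed (or traffic was seen): re-arm. */
                schedule_delayed_work(&ctrl->ka_work, ctrl->kato * HZ);
                return;
        }

        pr_err("ctrl %d keep-alive timer (%d seconds) expired!\n",
               ctrl->cntlid, ctrl->kato);
        nvmet_ctrl_fatal_error(ctrl);
}
--
With nvmet_install_queue() setting the flag for every queue it installs,
a slow multi-queue connect keeps pushing the expiry out instead of
triggering a controller teardown halfway through.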