The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable@xxxxxxxxxxxxxxx>.

To reproduce the conflict and resubmit, you may use the following commands:

git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x 29b434d1e49252b3ad56ad3197e47fafff5356a1
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable@xxxxxxxxxxxxxxx>' --in-reply-to '2023081243-sleet-native-6d03@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^..

Possible dependencies:

29b434d1e492 ("nvme-rdma: fix potential unbalanced freeze & unfreeze")
9f27bd701d18 ("nvme: rename the queue quiescing helpers")
91c11d5f3254 ("nvme-rdma: stop auth work after tearing down queues in error recovery")
1f1a4f89562d ("nvme-tcp: stop auth work after tearing down queues in error recovery")
eac3ef262941 ("nvme-pci: split the initial probe from the rest path")
a6ee7f19ebfd ("nvme-pci: call nvme_pci_configure_admin_queue from nvme_pci_enable")
3f30a79c2e2c ("nvme-pci: set constant paramters in nvme_pci_alloc_ctrl")
2e87570be9d2 ("nvme-pci: factor out a nvme_pci_alloc_dev helper")
081a7d958ce4 ("nvme-pci: factor the iod mempool creation into a helper")
94cc781f69f4 ("nvme: move OPAL setup from PCIe to core")
cd50f9b24726 ("nvme: split nvme_kill_queues")
6bcd5089ee13 ("nvme: don't unquiesce the admin queue in nvme_kill_queues")
0ffc7e98bfaa ("nvme-pci: refactor the tagset handling in nvme_reset_work")
71b26083d59c ("block: set the disk capacity to 0 in blk_mark_disk_dead")
6dfba1c09c10 ("nvme-fc: use the tagset alloc/free helpers")
1864ea46155c ("nvme-fc: store the generic nvme_ctrl in set->driver_data")
cefa1032f111 ("nvme-rdma: use the tagset alloc/free helpers")
2d60738c8f80 ("nvme-rdma: store the generic nvme_ctrl in set->driver_data")
fe60e8c53411 ("nvme: add common helpers to allocate and free tagsets")
61ce339f19fa ("nvme-pci: set min_align_mask before calculating max_hw_sectors")

thanks,

greg k-h

------------------ original commit in Linus's tree ------------------

>From 29b434d1e49252b3ad56ad3197e47fafff5356a1 Mon Sep 17 00:00:00 2001
From: Ming Lei <ming.lei@xxxxxxxxxx>
Date: Tue, 11 Jul 2023 17:40:41 +0800
Subject: [PATCH] nvme-rdma: fix potential unbalanced freeze & unfreeze

Move start_freeze into nvme_rdma_configure_io_queues(), and there is
at least two benefits:

1) fix unbalanced freeze and unfreeze, since re-connection work may
fail or be broken by removal

2) IO during error recovery can be failfast quickly because nvme fabrics
unquiesces queues after teardown.

One side-effect is that !mpath request may timeout during connecting
because of queue topo change, but that looks not one big deal:

1) same problem exists with current code base

2) compared with !mpath, mpath use case is dominant

Fixes: 9f98772ba307 ("nvme-rdma: fix controller reset hang during traffic")
Cc: stable@xxxxxxxxxxxxxxx
Signed-off-by: Ming Lei <ming.lei@xxxxxxxxxx>
Tested-by: Yi Zhang <yi.zhang@xxxxxxxxxx>
Reviewed-by: Sagi Grimberg <sagi@xxxxxxxxxxx>
Signed-off-by: Keith Busch <kbusch@xxxxxxxxxx>

diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c
index d433b2ec07a6..337a624a537c 100644
--- a/drivers/nvme/host/rdma.c
+++ b/drivers/nvme/host/rdma.c
@@ -883,6 +883,7 @@ static int nvme_rdma_configure_io_queues(struct nvme_rdma_ctrl *ctrl, bool new)
 		goto out_cleanup_tagset;
 
 	if (!new) {
+		nvme_start_freeze(&ctrl->ctrl);
 		nvme_unquiesce_io_queues(&ctrl->ctrl);
 		if (!nvme_wait_freeze_timeout(&ctrl->ctrl, NVME_IO_TIMEOUT)) {
 			/*
@@ -891,6 +892,7 @@ static int nvme_rdma_configure_io_queues(struct nvme_rdma_ctrl *ctrl, bool new)
 			 * to be safe.
 			 */
 			ret = -ENODEV;
+			nvme_unfreeze(&ctrl->ctrl);
 			goto out_wait_freeze_timed_out;
 		}
 		blk_mq_update_nr_hw_queues(ctrl->ctrl.tagset,
@@ -940,7 +942,6 @@ static void nvme_rdma_teardown_io_queues(struct nvme_rdma_ctrl *ctrl,
 		bool remove)
 {
 	if (ctrl->ctrl.queue_count > 1) {
-		nvme_start_freeze(&ctrl->ctrl);
 		nvme_quiesce_io_queues(&ctrl->ctrl);
 		nvme_sync_io_queues(&ctrl->ctrl);
 		nvme_rdma_stop_io_queues(ctrl);
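
Note for backporters: the imbalance described in the commit message can be
illustrated with a small userspace model. This is only a sketch under
stated assumptions, not kernel code. The names toy_ctrl, toy_start_freeze,
toy_unfreeze, old_recovery and new_recovery are hypothetical, and the two
boolean parameters stand in for "reconnect reached the I/O queue setup" and
"nvme_wait_freeze_timeout() succeeded". Each function returns the freeze
depth left behind after one recovery attempt; anything other than 0 means a
freeze was started without a matching unfreeze.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a controller; it only tracks freeze depth. */
struct toy_ctrl {
	int freeze_depth;	/* > 0 means the queues are still frozen */
};

static void toy_start_freeze(struct toy_ctrl *c)
{
	c->freeze_depth++;
}

static void toy_unfreeze(struct toy_ctrl *c)
{
	assert(c->freeze_depth > 0);	/* unfreeze without freeze is a bug */
	c->freeze_depth--;
}

/* Ordering before the patch: teardown starts the freeze. */
static int old_recovery(struct toy_ctrl *c, bool connect_ok, bool wait_ok)
{
	toy_start_freeze(c);		/* done in teardown */
	if (!connect_ok)
		return c->freeze_depth;	/* reconnect failed: still frozen */
	if (!wait_ok)
		return c->freeze_depth;	/* timed out: no unfreeze on this path */
	toy_unfreeze(c);
	return c->freeze_depth;
}

/* Ordering after the patch: configure starts and always ends the freeze. */
static int new_recovery(struct toy_ctrl *c, bool connect_ok, bool wait_ok)
{
	if (!connect_ok)
		return c->freeze_depth;	/* never frozen, nothing to undo */
	toy_start_freeze(c);		/* done in configure, if (!new) */
	if (!wait_ok) {
		toy_unfreeze(c);	/* added on the timeout path */
		return c->freeze_depth;
	}
	toy_unfreeze(c);
	return c->freeze_depth;
}
```

In this model, old_recovery() leaves a depth of 1 whenever the reconnect
fails or the freeze wait times out, while new_recovery() returns to 0 on
every path, which is the balance the patch is after.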