This is a note to let you know that I've just added the patch titled nvme-rdma: fix potential unbalanced freeze & unfreeze to the 6.4-stable tree which can be found at: http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary The filename of the patch is: nvme-rdma-fix-potential-unbalanced-freeze-unfreeze.patch and it can be found in the queue-6.4 subdirectory. If you, or anyone else, feels it should not be added to the stable tree, please let <stable@xxxxxxxxxxxxxxx> know about it. >From 29b434d1e49252b3ad56ad3197e47fafff5356a1 Mon Sep 17 00:00:00 2001 From: Ming Lei <ming.lei@xxxxxxxxxx> Date: Tue, 11 Jul 2023 17:40:41 +0800 Subject: nvme-rdma: fix potential unbalanced freeze & unfreeze From: Ming Lei <ming.lei@xxxxxxxxxx> commit 29b434d1e49252b3ad56ad3197e47fafff5356a1 upstream. Move start_freeze into nvme_rdma_configure_io_queues(), and there is at least two benefits: 1) fix unbalanced freeze and unfreeze, since re-connection work may fail or be broken by removal 2) IO during error recovery can be failfast quickly because nvme fabrics unquiesces queues after teardown. One side-effect is that !mpath request may timeout during connecting because of queue topo change, but that looks not one big deal: 1) same problem exists with current code base 2) compared with !mpath, mpath use case is dominant Fixes: 9f98772ba307 ("nvme-rdma: fix controller reset hang during traffic") Cc: stable@xxxxxxxxxxxxxxx Signed-off-by: Ming Lei <ming.lei@xxxxxxxxxx> Tested-by: Yi Zhang <yi.zhang@xxxxxxxxxx> Reviewed-by: Sagi Grimberg <sagi@xxxxxxxxxxx> Signed-off-by: Keith Busch <kbusch@xxxxxxxxxx> Signed-off-by: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx> --- drivers/nvme/host/rdma.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) --- a/drivers/nvme/host/rdma.c +++ b/drivers/nvme/host/rdma.c @@ -918,6 +918,7 @@ static int nvme_rdma_configure_io_queues goto out_cleanup_tagset; if (!new) { + nvme_start_freeze(&ctrl->ctrl); nvme_unquiesce_io_queues(&ctrl->ctrl); if (!nvme_wait_freeze_timeout(&ctrl->ctrl, NVME_IO_TIMEOUT)) { /* @@ -926,6 +927,7 @@ static int nvme_rdma_configure_io_queues * to be safe. */ ret = -ENODEV; + nvme_unfreeze(&ctrl->ctrl); goto out_wait_freeze_timed_out; } blk_mq_update_nr_hw_queues(ctrl->ctrl.tagset, @@ -975,7 +977,6 @@ static void nvme_rdma_teardown_io_queues bool remove) { if (ctrl->ctrl.queue_count > 1) { - nvme_start_freeze(&ctrl->ctrl); nvme_quiesce_io_queues(&ctrl->ctrl); nvme_sync_io_queues(&ctrl->ctrl); nvme_rdma_stop_io_queues(ctrl); Patches currently in stable-queue which might be from ming.lei@xxxxxxxxxx are queue-6.4/nvme-tcp-fix-potential-unbalanced-freeze-unfreeze.patch queue-6.4/nvme-fix-possible-hang-when-removing-a-controller-during-error-recovery.patch queue-6.4/nvme-rdma-fix-potential-unbalanced-freeze-unfreeze.patch