This is a note to let you know that I've just added the patch titled RDMA/irdma: Fix drain SQ hang with no completion to the 5.15-stable tree which can be found at: http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary The filename of the patch is: rdma-irdma-fix-drain-sq-hang-with-no-completion.patch and it can be found in the queue-5.15 subdirectory. If you, or anyone else, feels it should not be added to the stable tree, please let <stable@xxxxxxxxxxxxxxx> know about it. >From ead54ced6321099978d30d62dc49c282a6e70574 Mon Sep 17 00:00:00 2001 From: Shiraz Saleem <shiraz.saleem@xxxxxxxxx> Date: Wed, 24 Aug 2022 10:43:59 -0500 Subject: RDMA/irdma: Fix drain SQ hang with no completion From: Shiraz Saleem <shiraz.saleem@xxxxxxxxx> commit ead54ced6321099978d30d62dc49c282a6e70574 upstream. SW generated completions for outstanding WRs posted on SQ after QP is in error target the wrong CQ. This causes the ib_drain_sq to hang with no completion. Fix this to generate completions on the right CQ. [ 863.969340] INFO: task kworker/u52:2:671 blocked for more than 122 seconds. [ 863.979224] Not tainted 5.14.0-130.el9.x86_64 #1 [ 863.986588] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 863.996997] task:kworker/u52:2 state:D stack: 0 pid: 671 ppid: 2 flags:0x00004000 [ 864.007272] Workqueue: xprtiod xprt_autoclose [sunrpc] [ 864.014056] Call Trace: [ 864.017575] __schedule+0x206/0x580 [ 864.022296] schedule+0x43/0xa0 [ 864.026736] schedule_timeout+0x115/0x150 [ 864.032185] __wait_for_common+0x93/0x1d0 [ 864.037717] ? usleep_range_state+0x90/0x90 [ 864.043368] __ib_drain_sq+0xf6/0x170 [ib_core] [ 864.049371] ? __rdma_block_iter_next+0x80/0x80 [ib_core] [ 864.056240] ib_drain_sq+0x66/0x70 [ib_core] [ 864.062003] rpcrdma_xprt_disconnect+0x82/0x3b0 [rpcrdma] [ 864.069365] ? xprt_prepare_transmit+0x5d/0xc0 [sunrpc] [ 864.076386] xprt_rdma_close+0xe/0x30 [rpcrdma] [ 864.082593] xprt_autoclose+0x52/0x100 [sunrpc] [ 864.088718] process_one_work+0x1e8/0x3c0 [ 864.094170] worker_thread+0x50/0x3b0 [ 864.099109] ? rescuer_thread+0x370/0x370 [ 864.104473] kthread+0x149/0x170 [ 864.109022] ? set_kthread_struct+0x40/0x40 [ 864.114713] ret_from_fork+0x22/0x30 Fixes: 81091d7696ae ("RDMA/irdma: Add SW mechanism to generate completions on error") Link: https://lore.kernel.org/r/20220824154358.117-1-shiraz.saleem@xxxxxxxxx Reported-by: Kamal Heib <kamalheib1@xxxxxxxxx> Signed-off-by: Shiraz Saleem <shiraz.saleem@xxxxxxxxx> Signed-off-by: Leon Romanovsky <leon@xxxxxxxxxx> Signed-off-by: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx> --- drivers/infiniband/hw/irdma/utils.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) --- a/drivers/infiniband/hw/irdma/utils.c +++ b/drivers/infiniband/hw/irdma/utils.c @@ -2660,7 +2660,7 @@ void irdma_generate_flush_completions(st spin_unlock_irqrestore(&iwqp->lock, flags2); spin_unlock_irqrestore(&iwqp->iwscq->lock, flags1); if (compl_generated) - irdma_comp_handler(iwqp->iwrcq); + irdma_comp_handler(iwqp->iwscq); } else { spin_unlock_irqrestore(&iwqp->iwscq->lock, flags1); mod_delayed_work(iwqp->iwdev->cleanup_wq, &iwqp->dwork_flush, Patches currently in stable-queue which might be from shiraz.saleem@xxxxxxxxx are queue-5.15/rdma-irdma-fix-drain-sq-hang-with-no-completion.patch queue-5.15/rdma-irdma-do-not-generate-sw-completions-for-nops.patch queue-5.15/rdma-irdma-fix-local-invalidate-fencing.patch queue-5.15/rdma-irdma-add-sw-mechanism-to-generate-completions-.patch queue-5.15/rdma-irdma-prevent-qp-use-after-free.patch