Patch "RDMA/rxe: Fix deadlock in rxe_do_local_ops()" has been added to the 5.18-stable tree

This is a note to let you know that I've just added the patch titled

    RDMA/rxe: Fix deadlock in rxe_do_local_ops()

to the 5.18-stable tree which can be found at:
    http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary

The filename of the patch is:
     rdma-rxe-fix-deadlock-in-rxe_do_local_ops.patch
and it can be found in the queue-5.18 subdirectory.

If you, or anyone else, feel it should not be added to the stable tree,
please let <stable@xxxxxxxxxxxxxxx> know about it.



commit ec6d39b1ca3dc2faf71d2fc05202ad7229df85f3
Author: Bob Pearson <rpearsonhpe@xxxxxxxxx>
Date:   Mon May 23 17:32:52 2022 -0500

    RDMA/rxe: Fix deadlock in rxe_do_local_ops()
    
    [ Upstream commit 7cb33d1bc1ac8e51fd88928f96674d392f8e07c4 ]
    
    When a local operation (invalidate mr, reg mr, bind mw) is finished there
    will be no ack packet coming from a responder to cause the wqe to be
    completed. This may happen anyway if a subsequent wqe performs
    IO. Currently if the wqe is signalled the completer tasklet is scheduled
    immediately but not otherwise.
    
    This leads to a deadlock if the next wqe has the fence bit set in send
    flags and the operation is not signalled. This patch removes the condition
    that the wqe must be signalled in order to schedule the completer tasklet
    which is the simplest fix for this deadlock and is fairly low cost. This
    is the analog for local operations of always setting the ackreq bit in all
    last or only request packets even if the operation is not signalled.
    
    Link: https://lore.kernel.org/r/20220523223251.15350-1-rpearsonhpe@xxxxxxxxx
    Reported-by: Jenny Hack <jhack@xxxxxxx>
    Fixes: c1a411268a4b ("RDMA/rxe: Move local ops to subroutine")
    Signed-off-by: Bob Pearson <rpearsonhpe@xxxxxxxxx>
    Signed-off-by: Jason Gunthorpe <jgg@xxxxxxxxxx>
    Signed-off-by: Sasha Levin <sashal@xxxxxxxxxx>

diff --git a/drivers/infiniband/sw/rxe/rxe_req.c b/drivers/infiniband/sw/rxe/rxe_req.c
index 8a1cff80a68e..d574c47099b8 100644
--- a/drivers/infiniband/sw/rxe/rxe_req.c
+++ b/drivers/infiniband/sw/rxe/rxe_req.c
@@ -586,9 +586,11 @@ static int rxe_do_local_ops(struct rxe_qp *qp, struct rxe_send_wqe *wqe)
 	wqe->status = IB_WC_SUCCESS;
 	qp->req.wqe_index = queue_next_index(qp->sq.queue, qp->req.wqe_index);
 
-	if ((wqe->wr.send_flags & IB_SEND_SIGNALED) ||
-	    qp->sq_sig_type == IB_SIGNAL_ALL_WR)
-		rxe_run_task(&qp->comp.task, 1);
+	/* There is no ack coming for local work requests
+	 * which can lead to a deadlock. So go ahead and complete
+	 * it now.
+	 */
+	rxe_run_task(&qp->comp.task, 1);
 
 	return 0;
 }
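
The deadlock described in the commit message can be illustrated with a small
user-space toy model. This is only a sketch and is not rxe code: the names
toy_wqe and toy_run are hypothetical, and the fence rule is simplified to "a
fenced wqe waits until every earlier wqe has been completed". The always_kick
flag stands in for the patched behaviour of scheduling the completer after
every local operation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only; every name here is hypothetical, none is from rxe. */
struct toy_wqe {
	bool local;	/* local op (reg mr, invalidate, bind mw): no ack arrives */
	bool signaled;	/* stands in for IB_SEND_SIGNALED */
	bool fence;	/* stands in for IB_SEND_FENCE */
};

/*
 * Step a toy requester and completer until neither makes progress.
 * Returns how many wqes the requester managed to process.
 *
 * always_kick == false models the old code: a local op schedules the
 * completer only when it is signaled.
 * always_kick == true models the patched code: a local op always
 * schedules the completer.
 */
static int toy_run(const struct toy_wqe *q, int n, bool always_kick)
{
	int issued = 0;		/* wqes processed by the requester */
	int completed = 0;	/* wqes retired by the completer */
	bool kick = false;	/* completer has been scheduled */

	assert(q != NULL && n >= 0);

	for (;;) {
		bool progress = false;

		/* Requester: a fenced wqe must wait for earlier completions. */
		if (issued < n) {
			const struct toy_wqe *w = &q[issued];

			if (!(w->fence && completed < issued)) {
				issued++;
				progress = true;
				if (!w->local)
					kick = true;	/* ack will run the completer */
				else if (always_kick || w->signaled)
					kick = true;	/* completer scheduled directly */
			}
		}

		/* Completer: runs only when scheduled, retires all issued wqes. */
		if (kick) {
			kick = false;
			if (completed < issued) {
				completed = issued;
				progress = true;
			}
		}

		if (!progress)
			break;
	}

	return issued;
}
```

For a queue holding an unsignaled local op followed by a fenced, non-local
send, toy_run(q, 2, false) returns 1 in this model: the fenced wqe is never
processed because nothing schedules the completer. toy_run(q, 2, true)
returns 2.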


