This is a note to let you know that I've just added the patch titled RDMA/bnxt_re: Fix the handling of control path response data to the 6.5-stable tree which can be found at: http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary The filename of the patch is: rdma-bnxt_re-fix-the-handling-of-control-path-response-data.patch and it can be found in the queue-6.5 subdirectory. If you, or anyone else, feels it should not be added to the stable tree, please let <stable@xxxxxxxxxxxxxxx> know about it. >From 9fc5f9a92fe6897dbed7b9295b234cb7e3cc9d11 Mon Sep 17 00:00:00 2001 From: Selvin Xavier <selvin.xavier@xxxxxxxxxxxx> Date: Wed, 20 Sep 2023 01:41:19 -0700 Subject: RDMA/bnxt_re: Fix the handling of control path response data From: Selvin Xavier <selvin.xavier@xxxxxxxxxxxx> commit 9fc5f9a92fe6897dbed7b9295b234cb7e3cc9d11 upstream. Flag that indicate control path command completion should be cleared only after copying the command response data. As soon as the is_in_used flag is clear, the waiting thread can proceed with wrong response data. This wrong data is causing multiple issues like wrong lkey used in data traffic and wrong AH Id etc. Use a memory barrier to ensure that the response data is copied and visible to the process waiting on a different cpu core before clearing the is_in_used flag. Clear the is_in_used after copying the command response. Fixes: bcfee4ce3e01 ("RDMA/bnxt_re: remove redundant cmdq_bitmap") Signed-off-by: Saravanan Vajravel <saravanan.vajravel@xxxxxxxxxxxx> Signed-off-by: Selvin Xavier <selvin.xavier@xxxxxxxxxxxx> Link: https://lore.kernel.org/r/1695199280-13520-2-git-send-email-selvin.xavier@xxxxxxxxxxxx Signed-off-by: Leon Romanovsky <leon@xxxxxxxxxx> Signed-off-by: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx> --- drivers/infiniband/hw/bnxt_re/qplib_rcfw.c | 11 +++++++++-- 1 file changed, 9 insertions(+), 2 deletions(-) --- a/drivers/infiniband/hw/bnxt_re/qplib_rcfw.c +++ b/drivers/infiniband/hw/bnxt_re/qplib_rcfw.c @@ -664,7 +664,6 @@ static int bnxt_qplib_process_qp_event(s blocked = cookie & RCFW_CMD_IS_BLOCKING; cookie &= RCFW_MAX_COOKIE_VALUE; crsqe = &rcfw->crsqe_tbl[cookie]; - crsqe->is_in_used = false; if (WARN_ONCE(test_bit(FIRMWARE_STALL_DETECTED, &rcfw->cmdq.flags), @@ -680,8 +679,14 @@ static int bnxt_qplib_process_qp_event(s atomic_dec(&rcfw->timeout_send); if (crsqe->is_waiter_alive) { - if (crsqe->resp) + if (crsqe->resp) { memcpy(crsqe->resp, qp_event, sizeof(*qp_event)); + /* Insert write memory barrier to ensure that + * response data is copied before clearing the + * flags + */ + smp_wmb(); + } if (!blocked) wait_cmds++; } @@ -693,6 +698,8 @@ static int bnxt_qplib_process_qp_event(s if (!is_waiter_alive) crsqe->resp = NULL; + crsqe->is_in_used = false; + hwq->cons += req_size; /* This is a case to handle below scenario - Patches currently in stable-queue which might be from selvin.xavier@xxxxxxxxxxxx are queue-6.5/rdma-bnxt_re-fix-the-handling-of-control-path-response-data.patch