This is a note to let you know that I've just added the patch titled RDMA/bnxt_re: Avoid CPU lockups due fifo occupancy check loop to the 6.11-stable tree which can be found at: http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary The filename of the patch is: rdma-bnxt_re-avoid-cpu-lockups-due-fifo-occupancy-ch.patch and it can be found in the queue-6.11 subdirectory. If you, or anyone else, feels it should not be added to the stable tree, please let <stable@xxxxxxxxxxxxxxx> know about it. commit 43ffdd84a83dcbe8f037153a2156d4ee8001f8c0 Author: Selvin Xavier <selvin.xavier@xxxxxxxxxxxx> Date: Tue Oct 8 00:41:38 2024 -0700 RDMA/bnxt_re: Avoid CPU lockups due fifo occupancy check loop [ Upstream commit 8be3e5b0c96beeefe9d5486b96575d104d3e7d17 ] Driver waits indefinitely for the fifo occupancy to go below a threshold as soon as the pacing interrupt is received. This can cause soft lockup on one of the processors, if the rate of DB is very high. Add a loop count for FPGA and exit the __wait_for_fifo_occupancy_below_th if the loop is taking more time. Pacing will be continuing until the occupancy is below the threshold. This is ensured by the checks in bnxt_re_pacing_timer_exp and further scheduling the work for pacing based on the fifo occupancy. Fixes: 2ad4e6303a6d ("RDMA/bnxt_re: Implement doorbell pacing algorithm") Link: https://patch.msgid.link/r/1728373302-19530-7-git-send-email-selvin.xavier@xxxxxxxxxxxx Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@xxxxxxxxxxxx> Reviewed-by: Chandramohan Akula <chandramohan.akula@xxxxxxxxxxxx> Signed-off-by: Selvin Xavier <selvin.xavier@xxxxxxxxxxxx> Signed-off-by: Jason Gunthorpe <jgg@xxxxxxxxxx> Signed-off-by: Sasha Levin <sashal@xxxxxxxxxx> diff --git a/drivers/infiniband/hw/bnxt_re/main.c b/drivers/infiniband/hw/bnxt_re/main.c index e06adb2dfe6f9..c905a51aabfba 100644 --- a/drivers/infiniband/hw/bnxt_re/main.c +++ b/drivers/infiniband/hw/bnxt_re/main.c @@ -518,6 +518,7 @@ static bool is_dbr_fifo_full(struct bnxt_re_dev *rdev) static void __wait_for_fifo_occupancy_below_th(struct bnxt_re_dev *rdev) { struct bnxt_qplib_db_pacing_data *pacing_data = rdev->qplib_res.pacing_data; + u32 retry_fifo_check = 1000; u32 fifo_occup; /* loop shouldn't run infintely as the occupancy usually goes @@ -531,6 +532,14 @@ static void __wait_for_fifo_occupancy_below_th(struct bnxt_re_dev *rdev) if (fifo_occup < pacing_data->pacing_th) break; + if (!retry_fifo_check--) { + dev_info_once(rdev_to_dev(rdev), + "%s: fifo_occup = 0x%xfifo_max_depth = 0x%x pacing_th = 0x%x\n", + __func__, fifo_occup, pacing_data->fifo_max_depth, + pacing_data->pacing_th); + break; + } + } }