On Sat, Mar 04, 2023 at 11:45:26AM -0600, Bob Pearson wrote:
> This patch series corrects qp reference counting issues
> related to deferred execution of tasklets. These issues were
> discovered in attempting to resolve soft lockups of the rxe
> driver observed by Daisuke Matsuda in a version of the driver
> using work queues where the workqueue implementation was based
> on the current tasklet based driver. An attempt to find the
> root cause of those lockups lead to an error in the tasklet
> implementation that has been present since the driver went
> upstream. This patch series corrects that error.
>
> With this patch series applied the rxe driver is more stable and
> has run the test cases reported by Matsuda for over 24 hours without
> errors.
>
> The series also corrects some errors in qp reference counting
> related to qp cleanup.
>
> This series depends on the RDMA/rxe: Add error logging to rxe"
> series as a prerequisite.
>
> Link: https://lore.kernel.org/linux-rdma/TYCPR01MB845522FD536170D75068DD41E5099@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx/
> Signed-off-by: Bob Pearson <rpearsonhpe@xxxxxxxxx>
>
> v3:
> Fixed an error in patch 4/8 "RDMA/rxe: Cleanup error state handling in
> rxe_comp.c". Didn't set wqe.status to IB_WC_WR_FLUSH_ERR when
> flushing send queue. This broke blktests which calls modify qp to
> set qp to IB_QPS_ERR and waits for the flushed cqe's.
>
> v2:
> This version of this series split off the changes to rxe debug code
> which have been submitted as "RDMA/rxe: Add error logging to rxe".
> One unrelated patch was dropped and other patches earlier included
> in a series to convert from tasklets to workqueues were moved into
> this series because they are relevant both for the tasklet version
> and the workqueue version of the driver.
>
> Bob Pearson (8):
>   RDMA/rxe: Convert tasklet args to queue pairs
>   RDMA/rxe: Cleanup reset state handling in rxe_resp.c
>   RDMA/rxe: Cleanup error state handling in rxe_comp.c
>   RDMA/rxe: Remove qp reference counting in tasks
>   RDMA/rxe: Remove __rxe_do_task()
>   RDMA/rxe: Make tasks schedule each other
>   RDMA/rxe: Rewrite rxe_task.c

Applied to for-next

>   RDMA/rxe: Warn if refcnt zero in rxe_put

This one I dropped

Thanks,
Jason