On Tue, Nov 12, 2024 at 03:49:56PM +0200, Mohammad Heib wrote:
> If bnxt FW behaves unexpectedly because of FW bug or unexpected behavior it
> can send completions for old cookies that have already been handled by the
> bnxt driver. If that old cookie was associated with an old calling context
> the driver will try to access that caller memory again because the driver
> never clean the is_waiter_alive flag after the caller successfully complete
> waiting, and this access will cause the following kernel panic:
>
> Call Trace:
>  <IRQ>
>  ? __die+0x20/0x70
>  ? page_fault_oops+0x75/0x170
>  ? exc_page_fault+0xaa/0x140
>  ? asm_exc_page_fault+0x22/0x30
>  ? bnxt_qplib_process_qp_event.isra.0+0x20c/0x3a0 [bnxt_re]
>  ? srso_return_thunk+0x5/0x5f
>  ? __wake_up_common+0x78/0xa0
>  ? srso_return_thunk+0x5/0x5f
>  bnxt_qplib_service_creq+0x18d/0x250 [bnxt_re]
>  tasklet_action_common+0xac/0x210
>  handle_softirqs+0xd3/0x2b0
>  __irq_exit_rcu+0x9b/0xc0
>  common_interrupt+0x7f/0xa0
>  </IRQ>
>  <TASK>
>
> To avoid the above unexpected behavior clear the is_waiter_alive flag
> every time the caller finishes waiting for a completion.
>
> Fixes: 691eb7c6110f ("RDMA/bnxt_re: handle command completions after driver detect a timedout")
> Signed-off-by: Mohammad Heib <mheib@xxxxxxxxxx>
> ---
>  drivers/infiniband/hw/bnxt_re/qplib_rcfw.c | 16 ++++++++--------
>  1 file changed, 8 insertions(+), 8 deletions(-)

Selvin?
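
For anyone following along, here is a rough userspace sketch of the failure mode described in the commit message. It is not the bnxt_re code: struct cmd_slot, struct waiter, begin_wait(), end_wait() and handle_completion() are hypothetical names, and the cookie-to-slot mapping is an assumption made for illustration. The only part taken from the patch description is the idea that the is_waiter_alive flag must be cleared once the caller has finished waiting, so a late completion for an old cookie is dropped instead of touching caller memory.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NUM_SLOTS 4 /* hypothetical table size, not taken from the driver */

/* Stand-in for the caller-owned memory a completion writes into. */
struct waiter {
	int result;
};

/* Hypothetical per-cookie bookkeeping entry. */
struct cmd_slot {
	bool is_waiter_alive;
	struct waiter *waiter;
};

static struct cmd_slot slots[NUM_SLOTS];

/* Caller starts waiting: publish its memory and mark it alive. */
static void begin_wait(unsigned int cookie, struct waiter *w)
{
	struct cmd_slot *s = &slots[cookie % NUM_SLOTS];

	assert(w != NULL);
	s->waiter = w;
	s->is_waiter_alive = true;
}

/*
 * Caller is done waiting (success or timeout): clear the flag so that a
 * late or duplicate completion for this cookie cannot reach its memory.
 */
static void end_wait(unsigned int cookie)
{
	struct cmd_slot *s = &slots[cookie % NUM_SLOTS];

	s->is_waiter_alive = false;
	s->waiter = NULL;
}

/*
 * Completion path. Returns true if the result was delivered to a live
 * waiter, false if the completion was dropped as stale.
 */
static bool handle_completion(unsigned int cookie, int result)
{
	struct cmd_slot *s = &slots[cookie % NUM_SLOTS];

	if (!s->is_waiter_alive || s->waiter == NULL)
		return false;

	s->waiter->result = result;
	return true;
}
```

In this model, without the reset in end_wait() a second completion for the same cookie would still see is_waiter_alive set and write through a pointer to memory the caller may already have released, which is the kind of access the trace above points at.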