On Wed, Sep 30, 2020 at 10:20:07AM +0300, Leon Romanovsky wrote:
> From: Jason Gunthorpe <jgg@xxxxxxxxxx>
>
> This three thread race can result in the work being run once the callback
> becomes NULL:
>
>        CPU1                 CPU2                   CPU3
>  netevent_callback()
>                      process_one_req()       rdma_addr_cancel()
>  [..]
>      spin_lock_bh()
>        set_timeout()
>      spin_unlock_bh()
>
>                                              spin_lock_bh()
>                                              list_del_init(&req->list);
>                                              spin_unlock_bh()
>
>                      req->callback = NULL
>                      spin_lock_bh()
>                        if (!list_empty(&req->list))
>                          // Skipped!
>                          // cancel_delayed_work(&req->work);
>                      spin_unlock_bh()
>
>                      process_one_req() // again
>                        req->callback() // BOOM
>                                              cancel_delayed_work_sync()
>
> The solution is to always cancel the work once it is completed so any
> in between set_timeout() does not result in it running again.
>
> Fixes: 44e75052bc2a ("RDMA/rdma_cm: Make rdma_addr_cancel into a fence")
> Reported-by: Dan Aloni <dan@xxxxxxxxxxxx>
> Signed-off-by: Jason Gunthorpe <jgg@xxxxxxxxxx>
> Signed-off-by: Leon Romanovsky <leonro@xxxxxxxxxx>
> ---
>  drivers/infiniband/core/addr.c | 11 +++++------
>  1 file changed, 5 insertions(+), 6 deletions(-)

Applied to for-next

Jason