On Wed, Mar 17, 2021 at 01:15:42AM -0700, Selvin Xavier wrote:
> When L2 driver detects a device crash or device undergone
> reset, it invokes a stop callback to recover from error.
> Current RoCE driver doesn't recover the device. So move
> the device to error state and dispatch fatal events to all qps
> Release the MSIx vectors to avoid a crash when L2 driver
> disables the MSIx.
> Also, check for the device state to avoid posting further
> commands to the HW.
>
> Signed-off-by: Naresh Kumar PBS <nareshkumar.pbs@xxxxxxxxxxxx>
> Signed-off-by: Devesh Sharma <devesh.sharma@xxxxxxxxxxxx>
> Signed-off-by: Selvin Xavier <selvin.xavier@xxxxxxxxxxxx>
> ---
> v1->v2:
> Fix the build warning
> Reported-by: kernel test robot <lkp@xxxxxxxxx>

Applied to for-next

> +       bnxt_re_dev_stop(rdev);
> +       bnxt_re_stop_irq(rdev);
> +       /* Move the device states to detached and avoid sending any more
> +        * commands to HW
> +        */
> +       set_bit(BNXT_RE_FLAG_ERR_DEVICE_DETACHED, &rdev->flags);
> +       set_bit(ERR_DEVICE_DETACHED, &rdev->rcfw.cmdq.flags);

But I'm skeptical that all this set_bit stuff without any locks in the
driver is sane

Jason