On Wed, Jul 31, 2019 at 09:01:24PM +0300, Leon Romanovsky wrote:
> On Wed, Jul 31, 2019 at 05:22:19PM +0000, Jason Gunthorpe wrote:
> > On Wed, Jul 31, 2019 at 08:09:44PM +0300, Leon Romanovsky wrote:
> > > On Wed, Jul 31, 2019 at 05:00:59PM +0000, Jason Gunthorpe wrote:
> > > > On Wed, Jul 31, 2019 at 12:22:44PM -0400, Doug Ledford wrote:
> > > > > > diff --git a/drivers/infiniband/hw/mlx5/main.c
> > > > > > b/drivers/infiniband/hw/mlx5/main.c
> > > > > > index c2a5780cb394..e12a4404096b 100644
> > > > > > +++ b/drivers/infiniband/hw/mlx5/main.c
> > > > > > @@ -5802,13 +5802,12 @@ static void mlx5_ib_unbind_slave_port(struct
> > > > > > mlx5_ib_dev *ibdev,
> > > > > >  		return;
> > > > > >  	}
> > > > > >
> > > > > > -	if (mpi->mdev_events.notifier_call)
> > > > > > -		mlx5_notifier_unregister(mpi->mdev, &mpi->mdev_events);
> > > > > > -	mpi->mdev_events.notifier_call = NULL;
> > > > > > -
> > > > > >  	mpi->ibdev = NULL;
> > > > > >
> > > > > >  	spin_unlock(&port->mp.mpi_lock);
> > > > > > +	if (mpi->mdev_events.notifier_call)
> > > > > > +		mlx5_notifier_unregister(mpi->mdev, &mpi->mdev_events);
> > > > > > +	mpi->mdev_events.notifier_call = NULL;
> > > > >
> > > > > I can see where this fixes the problem at hand, but this gives the
> > > > > appearance of creating a new race. Doing a check/unregister/set-null
> > > > > series outside of any locks is a red flag to someone investigating the
> > > > > code. You should at least make note of the fact that calling unregister
> > > > > more than once is safe. If you're fine with it, I can add a comment and
> > > > > take the patch, or you can resubmit.
> > > >
> > > > Mucking about notifier_call like that is gross anyhow, maybe better to
> > > > delete it entirely.
> > >
> > > What do you propose to delete?
> >
> > The 'mpi->mdev_events.notifier_call = NULL;' and 'if
> > (mpi->mdev_events.notifier_call)'
> >
> > Once it leaves the lock it stops doing anything useful.
> >
> > If you need it, then we can't drop the lock, if you don't, it is just
> > dead code, delete it.
>
> This specific notifier_call is protected outside
> of mlx5_ib_unbind_slave_port() by mlx5_ib_multiport_mutex and NULL check
> is needed to ensure single call to mlx5_notifier_unregister, because
> calls to mlx5_ib_unbind_slave_port() will be serialized.

If this routine is now relying on locking that is not obvious in the
function itself then add a lockdep too.

Jason
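
To illustrate what a lockdep annotation buys here: in the driver this would
presumably be a single lockdep_assert_held(&mlx5_ib_multiport_mutex) at the
top of mlx5_ib_unbind_slave_port(). The sketch below is only a userspace
analogue of that idea, not driver code. tracked_mutex, assert_held(),
mp_info, dummy_event() and unbind_slave_port() are hypothetical names made
up for the example, and a counter stands in for the real
mlx5_notifier_unregister() call.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical userspace stand-in for a kernel mutex with lockdep-style
 * ownership tracking. None of these names exist in the mlx5 driver.
 */
struct tracked_mutex {
	pthread_mutex_t lock;
	pthread_t owner;	/* meaningful only while held is true */
	bool held;
};

static struct tracked_mutex multiport_mutex = {
	.lock = PTHREAD_MUTEX_INITIALIZER,
};

static int lock_warnings;	/* counts failed assertions, like a WARN */

static void tracked_lock(struct tracked_mutex *m)
{
	pthread_mutex_lock(&m->lock);
	m->owner = pthread_self();
	m->held = true;
}

static void tracked_unlock(struct tracked_mutex *m)
{
	m->held = false;
	pthread_mutex_unlock(&m->lock);
}

/*
 * Rough analogue of lockdep_assert_held(): complain, then carry on.
 * Reading held/owner without the lock is only reliable for the thread
 * that owns it, which is the case this check cares about.
 */
static void assert_held(struct tracked_mutex *m)
{
	if (!m->held || !pthread_equal(m->owner, pthread_self()))
		lock_warnings++;
}

typedef int (*notifier_fn)(void *data);

struct mp_info {
	notifier_fn notifier_call;	/* non-NULL while registered */
	int unregister_calls;		/* instrumentation for the sketch */
};

/* Placeholder callback so notifier_call can be non-NULL. */
static int dummy_event(void *data)
{
	(void)data;
	return 0;
}

static void unbind_slave_port(struct mp_info *mpi)
{
	/* Makes the dependency on the outer mutex visible in the function. */
	assert_held(&multiport_mutex);

	/* The NULL check gives "unregister once" only if callers serialize. */
	if (mpi->notifier_call)
		mpi->unregister_calls++;	/* stands in for the unregister */
	mpi->notifier_call = NULL;
}
```

With the assertion in place, a caller that skips the outer mutex gets
flagged at the point where the check/unregister/set-NULL sequence depends on
it, instead of the requirement living only in the caller.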