> -----Original Message----- > From: linux-rdma-owner@xxxxxxxxxxxxxxx [mailto:linux-rdma-owner@xxxxxxxxxxxxxxx] On > Behalf Of Doug Ledford > Sent: Wednesday, December 10, 2014 11:47 AM > To: linux-rdma@xxxxxxxxxxxxxxx; roland@xxxxxxxxxx > Cc: Doug Ledford > Subject: [PATCH 04/10] IPoIB: fix mcast_dev_flush/mcast_restart_task race > > Our mcast_dev_flush routine and our mcast_restart_task can race against > each other. In particular, they both hold the priv->lock while > manipulating the rbtree and while removing mcast entries from the > multicast_list and while adding entries to the remove_list, but they > also both drop their locks prior to doing the actual removes. The > mcast_dev_flush routine is run entirely under the rtnl lock and so has > at least some locking. The actual race condition is like this: > > Thread 1 Thread 2 > ifconfig ib0 up > start multicast join for broadcast > multicast join completes for broadcast > start to add more multicast joins > call mcast_restart_task to add new entries > ifconfig ib0 down > mcast_dev_flush > mcast_leave(mcast A) > mcast_leave(mcast A) > > As mcast_leave calls ib_sa_multicast_leave, and as member in > core/multicast.c is ref counted, we run into an unbalanced refcount > issue. To avoid stomping on each others removes, take the rtnl lock > specifically when we are deleting the entries from the remove list. > > Signed-off-by: Doug Ledford <dledford@xxxxxxxxxx> Acked-by: Alex Estrin <alex.estrin@xxxxxxxxx> > --- > drivers/infiniband/ulp/ipoib/ipoib_multicast.c | 37 ++++++++++++++++++++++---- > 1 file changed, 32 insertions(+), 5 deletions(-) > > diff --git a/drivers/infiniband/ulp/ipoib/ipoib_multicast.c > b/drivers/infiniband/ulp/ipoib/ipoib_multicast.c > index a52c9f3f7e4..41325960e4e 100644 > --- a/drivers/infiniband/ulp/ipoib/ipoib_multicast.c > +++ b/drivers/infiniband/ulp/ipoib/ipoib_multicast.c > @@ -802,7 +802,10 @@ void ipoib_mcast_dev_flush(struct net_device *dev) > > spin_unlock_irqrestore(&priv->lock, flags); > > - /* seperate between the wait to the leave*/ > + /* > + * make sure the in-flight joins have finished before we attempt > + * to leave > + */ > list_for_each_entry_safe(mcast, tmcast, &remove_list, list) > if (test_bit(IPOIB_MCAST_FLAG_BUSY, &mcast->flags)) > wait_for_completion(&mcast->done); > @@ -923,14 +926,38 @@ void ipoib_mcast_restart_task(struct work_struct *work) > netif_addr_unlock(dev); > local_irq_restore(flags); > > - /* We have to cancel outside of the spinlock */ > + /* > + * make sure the in-flight joins have finished before we attempt > + * to leave > + */ > + list_for_each_entry_safe(mcast, tmcast, &remove_list, list) > + if (test_bit(IPOIB_MCAST_FLAG_BUSY, &mcast->flags)) > + wait_for_completion(&mcast->done); > + > + /* > + * We have to cancel outside of the spinlock, but we have to > + * take the rtnl lock or else we race with the removal of > + * entries from the remove list in mcast_dev_flush as part > + * of ipoib_stop() which will call mcast_stop_thread with > + * flush == 1 while holding the rtnl lock, and the > + * flush_workqueue won't complete until this restart_mcast_task > + * completes. So do like the carrier on task and attempt to > + * take the rtnl lock, but if we can't before the ADMIN_UP flag > + * goes away, then just return and know that the remove list will > + * get flushed later by mcast_dev_flush. > + */ > + while (!rtnl_trylock()) { > + if (!test_bit(IPOIB_FLAG_ADMIN_UP, &priv->flags)) > + return; > + else > + msleep(20); > + } > list_for_each_entry_safe(mcast, tmcast, &remove_list, list) { > ipoib_mcast_leave(mcast->dev, mcast); > ipoib_mcast_free(mcast); > } > - > - if (test_bit(IPOIB_FLAG_ADMIN_UP, &priv->flags)) > - ipoib_mcast_start_thread(dev); > + ipoib_mcast_start_thread(dev); > + rtnl_unlock(); > } > > #ifdef CONFIG_INFINIBAND_IPOIB_DEBUG > -- > 2.1.0 > > -- > To unsubscribe from this list: send the line "unsubscribe linux-rdma" in > the body of a message to majordomo@xxxxxxxxxxxxxxx > More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html