RE: [PATCH rdma-next 7/7] RDMA/core: Allow detaching gid attribute netdevice for RoCE

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi Leon,

> -----Original Message-----
> From: Leon Romanovsky
> Sent: Wednesday, April 10, 2019 4:29 AM
> To: Doug Ledford <dledford@xxxxxxxxxx>; Jason Gunthorpe
> <jgg@xxxxxxxxxxxx>
> Cc: RDMA mailing list <linux-rdma@xxxxxxxxxxxxxxx>; Huy Nguyen
> <huyn@xxxxxxxxxxxx>; Martin Wilck <mwilck@xxxxxxxx>; Parav Pandit
> <parav@xxxxxxxxxxxx>
> Subject: Re: [PATCH rdma-next 7/7] RDMA/core: Allow detaching gid
> attribute netdevice for RoCE
> 
> On Wed, Apr 10, 2019 at 11:15:24AM +0300, Leon Romanovsky wrote:
> > From: Parav Pandit <parav@xxxxxxxxxxxx>
> >
> > When there is active traffic through a GID, a QP/AH holds reference to
> > this GID entry. RoCE GID entry holds reference to its attached
> > netdevice. Due to this when netdevice is deleted by admin user, its
> > refcount is not dropped.
> >
> > Therefore, while deleting RoCE GID, wait for all GID attribute's
> > netdev users to finish accessing netdev in rcu context.
> > Once all users done accessing it, release the netdev refcount.
> >
> > Signed-off-by: Huy Nguyen <huyn@xxxxxxxxxxxx>
> > Signed-off-by: Parav Pandit <parav@xxxxxxxxxxxx>
> > Signed-off-by: Leon Romanovsky <leonro@xxxxxxxxxxxx>
> > ---
> >  drivers/infiniband/core/cache.c | 75
> > +++++++++++++++++++++++++++------  drivers/infiniband/core/sysfs.c | 13
> ++++--
> >  include/rdma/ib_verbs.h         |  2 +-
> >  3 files changed, 73 insertions(+), 17 deletions(-)
> >
> > diff --git a/drivers/infiniband/core/cache.c
> > b/drivers/infiniband/core/cache.c index d1626502a0d5..0a6f67f5b34f
> > 100644
> > --- a/drivers/infiniband/core/cache.c
> > +++ b/drivers/infiniband/core/cache.c
> > @@ -78,11 +78,22 @@ enum gid_table_entry_state {
> >  	GID_TABLE_ENTRY_PENDING_DEL	= 3,
> >  };
> >
> > +struct roce_gid_ndev_storage {
> > +	struct rcu_head rcu_head;
> > +	struct net_device *ndev;
> > +};
> > +
> >  struct ib_gid_table_entry {
> >  	struct kref			kref;
> >  	struct work_struct		del_work;
> >  	struct ib_gid_attr		attr;
> >  	void				*context;
> > +	/* Store the ndev pointer to release reference later on in
> > +	 * call_rcu context because by that time gid_table_entry
> > +	 * and attr might be already freed. So keep a copy of it.
> > +	 * ndev_storage is freed by rcu callback.
> > +	 */
> > +	struct roce_gid_ndev_storage	*ndev_storage;
> >  	enum gid_table_entry_state	state;
> >  };
> >
> > @@ -206,6 +217,20 @@ static void schedule_free_gid(struct kref *kref)
> >  	queue_work(ib_wq, &entry->del_work);  }
> >
> > +static void put_gid_ndev(struct rcu_head *head) {
> > +	struct roce_gid_ndev_storage *storage =
> > +		container_of(head, struct roce_gid_ndev_storage, rcu_head);
> > +
> > +	WARN_ON(!storage->ndev);
> > +	/* At this point its safe to release netdev reference,
> > +	 * as all callers working on gid_attr->ndev are done
> > +	 * using this netdev.
> > +	 */
> > +	dev_put(storage->ndev);
> > +	kfree(storage);
> > +}
> > +
> >  static void free_gid_entry_locked(struct ib_gid_table_entry *entry)
> > {
> >  	struct ib_device *device = entry->attr.device; @@ -228,8 +253,8 @@
> > static void free_gid_entry_locked(struct ib_gid_table_entry *entry)
> >  	/* Now this index is ready to be allocated */
> >  	write_unlock_irq(&table->rwlock);
> >
> > -	if (entry->attr.ndev)
> > -		dev_put(entry->attr.ndev);
> > +	if (entry->ndev_storage)
> > +		call_rcu(&entry->ndev_storage->rcu_head, put_gid_ndev);
> >  	kfree(entry);
> >  }
> >
> > @@ -266,14 +291,27 @@ static struct ib_gid_table_entry *
> > alloc_gid_entry(const struct ib_gid_attr *attr)  {
> >  	struct ib_gid_table_entry *entry;
> > +	struct net_device *ndev;
> >
> >  	entry = kzalloc(sizeof(*entry), GFP_KERNEL);
> >  	if (!entry)
> >  		return NULL;
> > +	/* rcu_dereference is not needed. It is only to avoid
> > +	 * smatch complain.
> > +	 */
> > +	ndev = rcu_dereference(attr->ndev);
> 
> I don't know why I missed it, but it is wrong.
> It should be rcu_access_pointer().
> 
We are dereferencing a pointer using dev_hold() under if (ndev).
rcu_access_pointer() comment snippet section below, 

 * Return the value of the specified RCU-protected pointer, but omit the
 * lockdep checks for being in an RCU read-side critical section.  This is
 * useful when the value of this pointer is accessed, but the pointer is
 * not dereferenced,
   ^^^^^^^^^^^

So we should not use rcu_access_pointer() due to dev_hold?





[Index of Archives]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Photo]     [Yosemite News]     [Yosemite Photos]     [Linux Kernel]     [Linux SCSI]     [XFree86]

  Powered by Linux