RE: [PATCH rdma-next 01/11] RDMA/core Introduce and use rdma_find_ndev_for_src_ip_rcu

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 




> -----Original Message-----
> From: Jason Gunthorpe
> Sent: Wednesday, September 5, 2018 4:52 PM
> To: Ruhl, Michael J <michael.j.ruhl@xxxxxxxxx>
> Cc: Leon Romanovsky <leon@xxxxxxxxxx>; Doug Ledford
> <dledford@xxxxxxxxxx>; Leon Romanovsky <leonro@xxxxxxxxxxxx>; RDMA
> mailing list <linux-rdma@xxxxxxxxxxxxxxx>; Daniel Jurgens
> <danielj@xxxxxxxxxxxx>; Parav Pandit <parav@xxxxxxxxxxxx>; Yuval Shaia
> <yuval.shaia@xxxxxxxxxx>
> Subject: Re: [PATCH rdma-next 01/11] RDMA/core Introduce and use
> rdma_find_ndev_for_src_ip_rcu
> 
> On Wed, Sep 05, 2018 at 09:36:43PM +0000, Ruhl, Michael J wrote:
> > >From: linux-rdma-owner@xxxxxxxxxxxxxxx [mailto:linux-rdma-
> > >owner@xxxxxxxxxxxxxxx] On Behalf Of Leon Romanovsky
> > >Sent: Wednesday, September 5, 2018 5:54 AM
> > >To: Doug Ledford <dledford@xxxxxxxxxx>; Jason Gunthorpe
> > ><jgg@xxxxxxxxxxxx>
> > >Cc: Leon Romanovsky <leonro@xxxxxxxxxxxx>; RDMA mailing list <linux-
> > >rdma@xxxxxxxxxxxxxxx>; Daniel Jurgens <danielj@xxxxxxxxxxxx>; Parav
> > >Pandit <parav@xxxxxxxxxxxx>; Yuval Shaia <yuval.shaia@xxxxxxxxxx>
> > >Subject: [PATCH rdma-next 01/11] RDMA/core Introduce and use
> > >rdma_find_ndev_for_src_ip_rcu
> > >
> > >From: Parav Pandit <parav@xxxxxxxxxxxx>
> > >
> > >Commit in fixes tag introduced two issues.
> > >1. When address family is other than IPv4 or v6, rdma_translate_ip()
> > >returns success which is incorrect.
> > >2. When address familty is AF_INET6, and if the source address is not
> > >found, it returns success, which is also incorrect.
> > >
> > >Therefore, introduce and use rdma_find_ndev_for_src_ip_rcu() helper
> > >function which returns correct success or error status and is also
> > >useful for future code refactor in addr_resolve().
> > >
> > >Fixes: e08ce2e82b2f ("RDMA/core: Make function rdma_copy_addr return
> > >void")
> > >Signed-off-by: Parav Pandit <parav@xxxxxxxxxxxx>
> > >Reviewed-by: Daniel Jurgens <danielj@xxxxxxxxxxxx>
> > >Signed-off-by: Leon Romanovsky <leonro@xxxxxxxxxxxx>
> > >drivers/infiniband/core/addr.c | 61
> > >++++++++++++++++++++++++-------------
> > > 1 file changed, 35 insertions(+), 26 deletions(-)
> > >
> > >diff --git a/drivers/infiniband/core/addr.c
> > >b/drivers/infiniband/core/addr.c index 94ff38731be8..50ab50f1908b
> > >100644
> > >+++ b/drivers/infiniband/core/addr.c
> > >@@ -232,6 +232,36 @@ void rdma_copy_addr(struct rdma_dev_addr
> > >*dev_addr,  }  EXPORT_SYMBOL(rdma_copy_addr);
> > >
> > >+static struct net_device *
> > >+rdma_find_ndev_for_src_ip_rcu(struct net *net, const struct sockaddr
> > >*src_in)
> > >+{
> > >+	struct net_device *dev = NULL;
> > >+	int ret = -EADDRNOTAVAIL;
> > >+
> > >+	switch (src_in->sa_family) {
> > >+	case AF_INET:
> > >+		dev = __ip_dev_find(net,
> > >+				    ((const struct sockaddr_in *)src_in)-
> > >>sin_addr.s_addr,
> > >+				    false);
> >
> > __ip_dev_find() does an rcu_read_lock()/rcu_read_unlock() sequence.
> >
> > Is taking it before calling  rdma_find_ndev_for_src_ip_rcu() the right thing to
> do?
> 
> rcu is safe to nest..
> 
> But I can't quite remember why we needed to hold the rcu there across
> rdma_copy_addr, as we do have the dev_hold at that point..  Maybe that use of
> RCU is overly pessimistic?
> 
> Parav?
> 
I was replying but will continue in this thread now.

> Is taking it before calling  rdma_find_ndev_for_src_ip_rcu() the right thing to do?
Likely because rcu_read_lock can be nested as per [1].
If we do rcu_read_lock() inside rdma_find_ndev_for_src_ip_rcu(), we need to dev_hold and dev_put() and handle code differently between ipv4 and v6.
However, dev_hold() and dev_put() doesn't guarantee that netdev fields won't change such as ifindex. It just ensures that netdev exist.

So assuming rcu_read_lock is inside rdma_find_ndev_for_src_ip_rcu, 
rdma_copy_src_l2_addr() again need to take rcu lock unlock while copying ifindex and in between netdev is free to change things.

So while using nested rcu seems sub-optimal, it does offers benefit 
(a) to avoid such racy conditions (though very corner cases), 
(b) common code for ipv4, v6,
(c) avoids dev_hold, dev_put.

[1] https://www.kernel.org/doc/Documentation/RCU/whatisRCU.txt
Snippet: 
It is possible to nest rcu_read_lock(), since reader-writer locks may
be recursively acquired.




[Index of Archives]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Photo]     [Yosemite News]     [Yosemite Photos]     [Linux Kernel]     [Linux SCSI]     [XFree86]

  Powered by Linux