RE: [PATCH rdma-next] IB/core: Avoid deadlock during netlink message handling

> -----Original Message-----
> From: Leon Romanovsky <leon@xxxxxxxxxx>
> Sent: Thursday, October 24, 2019 2:20 PM
> To: Jason Gunthorpe <jgg@xxxxxxxx>
> Cc: Parav Pandit <parav@xxxxxxxxxxxx>; Doug Ledford
> <dledford@xxxxxxxxxx>; RDMA mailing list <linux-rdma@xxxxxxxxxxxxxxx>
> Subject: Re: [PATCH rdma-next] IB/core: Avoid deadlock during netlink message
> handling
> 
> On Thu, Oct 24, 2019 at 03:36:39PM -0300, Jason Gunthorpe wrote:
> > On Thu, Oct 24, 2019 at 06:28:35PM +0000, Parav Pandit wrote:
> > >
> > >
> > > > From: Leon Romanovsky <leon@xxxxxxxxxx>
> > > > Sent: Thursday, October 24, 2019 11:13 AM
> > > > To: Jason Gunthorpe <jgg@xxxxxxxx>
> > > > Cc: Doug Ledford <dledford@xxxxxxxxxx>; Parav Pandit
> > > > <parav@xxxxxxxxxxxx>; RDMA mailing list
> > > > <linux-rdma@xxxxxxxxxxxxxxx>
> > > > Subject: Re: [PATCH rdma-next] IB/core: Avoid deadlock during
> > > > netlink message handling
> > > >
> > > > On Thu, Oct 24, 2019 at 01:08:10PM -0300, Jason Gunthorpe wrote:
> > > > > On Thu, Oct 24, 2019 at 07:02:52PM +0300, Leon Romanovsky wrote:
> > > > > > On Thu, Oct 24, 2019 at 10:50:17AM -0300, Jason Gunthorpe wrote:
> > > > > > > > On Thu, Oct 24, 2019 at 04:26:07PM +0300, Leon Romanovsky wrote:
> > > > > > > > > On Thu, Oct 24, 2019 at 10:17:43AM -0300, Jason Gunthorpe wrote:
> > > > > > > > > > On Tue, Oct 15, 2019 at 11:07:33AM +0300, Leon Romanovsky wrote:
> > > > > > > > >
> > > > > > > > > > diff --git a/drivers/infiniband/core/netlink.c
> > > > > > > > > > b/drivers/infiniband/core/netlink.c
> > > > > > > > > > index 81dbd5f41bed..a3507b8be569 100644
> > > > > > > > > > +++ b/drivers/infiniband/core/netlink.c
> > > > > > > > > > @@ -42,9 +42,12 @@
> > > > > > > > > >  #include <linux/module.h>  #include "core_priv.h"
> > > > > > > > > >
> > > > > > > > > > -static DEFINE_MUTEX(rdma_nl_mutex);  static struct {
> > > > > > > > > > -	const struct rdma_nl_cbs   *cb_table;
> > > > > > > > > > +	const struct rdma_nl_cbs __rcu *cb_table;
> > > > > > > > > > > +	/* Synchronizes between ongoing netlink commands and netlink client
> > > > > > > > > > +	 * unregistration.
> > > > > > > > > > +	 */
> > > > > > > > > > +	struct srcu_struct unreg_srcu;
> > > > > > > > >
> > > > > > > > > A srcu in every index is serious overkill for this. Lets
> > > > > > > > > just us a
> > > > > > > > > rwsem:
> > > > > > > >
> > > > > > > > I liked previous variant more than rwsem, but it is Parav's patch.
> > > > > > >
> > > > > > > Why? srcu is a huge data structure and slow on unregister
> > > > > >
> > > > > > The unregister time is not so important for those IB/core modules.
> > > > > > > I liked SRCU because it doesn't have *_ONCE() macros and smb_* calls.
> > > > >
> > > > > It does, they are just hidden under other macros..
> >
> > > Its better that they are hidden. So that we don't need open code
> > > them.
> >
> > I wouldn't call swapping one function call for another 'open coding'
> >
> > > Also with srcu, we don't need lock annotations in get_cb_table()
> > > which releases and acquires semaphore.
> >
> > You don't need lock annoations for that.
> >
> > > Additionally lock nesting makes overall more complex.
> >
> > SRCU nesting is just as complicated! Don't think SRCU magically hides
> > that issue, it is still proposing to nest SRCU read side sections.
> >
> > > Given that there are only 3 indices, out of which only 2 are outside
> > > of the ib_core module and unlikely to be unloaded, I also prefer
> > > srcu version.
> >
> > Why? It isn't faster, it uses more memory, it still has the same
> > complex concurrency arrangement..
> 
> Jason,
> 
> It doesn't worth arguing, both Parav and I prefer SRCU variant, you prefer
> rwsem, so go for it, take rwsem, it is not important.
> 
Jason's memory size point made me curious about the srcu_struct size.
On my x86 5.x kernel, srcu_struct costs 70+ KB, likely due to some debug info in my kernel.
That is probably a good reason to switch to rwsem in this case (an rwsem is 80 bytes).

One small comment correction is needed:

-	rdma_nl_types[index].cb_table = cb_table;
-	mutex_unlock(&rdma_nl_mutex);
+	/* Pairs with the READ_ONCE in is_nl_valid() */
+	smp_store_release(&rdma_nl_types[index].cb_table, cb_table);

It should be "/* Pairs with the READ_ONCE in get_cb_table() */".
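
For anyone following the pairing, here is a minimal userspace sketch of the idea, not the kernel code. It uses C11 atomics as a stand-in: a release store plays the role of smp_store_release(), and an acquire load plays the role of the READ_ONCE in get_cb_table() (acquire is stronger than what READ_ONCE gives, but it is the simplest portable equivalent). All demo_* names and DEMO_NUM_CLIENTS are made up for illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for struct rdma_nl_cbs. */
struct demo_cbs {
	int (*doit)(int arg);
};

#define DEMO_NUM_CLIENTS 3	/* assumed table size, for the sketch only */

/* One atomically published callback-table pointer per client index. */
static struct {
	_Atomic(const struct demo_cbs *) cb_table;
} demo_types[DEMO_NUM_CLIENTS];

/*
 * Writer side: publish the table. The release ordering guarantees that
 * the table contents are visible before the pointer is.
 */
static void demo_register(unsigned int index, const struct demo_cbs *cb_table)
{
	assert(index < DEMO_NUM_CLIENTS);
	atomic_store_explicit(&demo_types[index].cb_table, cb_table,
			      memory_order_release);
}

/*
 * Reader side: this load is the one the comment on the store should
 * name. Returns NULL if the index is out of range or nothing has been
 * registered yet.
 */
static const struct demo_cbs *demo_get_cb_table(unsigned int index)
{
	if (index >= DEMO_NUM_CLIENTS)
		return NULL;
	return atomic_load_explicit(&demo_types[index].cb_table,
				    memory_order_acquire);
}

/* Example callback and table used to exercise the two helpers. */
static int demo_double(int arg)
{
	return 2 * arg;
}

static const struct demo_cbs demo_table[] = {
	{ .doit = demo_double },
};
```

The point of the comment fix is simply that the "pairs with" note on the store must name the function that actually does the matching load, otherwise the next reader goes looking in the wrong place.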

> Thanks
> 
> >
> > Jason



