> -----Original Message-----
> From: Leon Romanovsky <leon@xxxxxxxxxx>
> Sent: Thursday, October 24, 2019 2:20 PM
> To: Jason Gunthorpe <jgg@xxxxxxxx>
> Cc: Parav Pandit <parav@xxxxxxxxxxxx>; Doug Ledford
> <dledford@xxxxxxxxxx>; RDMA mailing list <linux-rdma@xxxxxxxxxxxxxxx>
> Subject: Re: [PATCH rdma-next] IB/core: Avoid deadlock during netlink message
> handling
>
> On Thu, Oct 24, 2019 at 03:36:39PM -0300, Jason Gunthorpe wrote:
> > On Thu, Oct 24, 2019 at 06:28:35PM +0000, Parav Pandit wrote:
> > >
> > >
> > > > From: Leon Romanovsky <leon@xxxxxxxxxx>
> > > > Sent: Thursday, October 24, 2019 11:13 AM
> > > > To: Jason Gunthorpe <jgg@xxxxxxxx>
> > > > Cc: Doug Ledford <dledford@xxxxxxxxxx>; Parav Pandit
> > > > <parav@xxxxxxxxxxxx>; RDMA mailing list
> > > > <linux-rdma@xxxxxxxxxxxxxxx>
> > > > Subject: Re: [PATCH rdma-next] IB/core: Avoid deadlock during
> > > > netlink message handling
> > > >
> > > > On Thu, Oct 24, 2019 at 01:08:10PM -0300, Jason Gunthorpe wrote:
> > > > > On Thu, Oct 24, 2019 at 07:02:52PM +0300, Leon Romanovsky wrote:
> > > > > > On Thu, Oct 24, 2019 at 10:50:17AM -0300, Jason Gunthorpe wrote:
> > > > > > > On Thu, Oct 24, 2019 at 04:26:07PM +0300, Leon Romanovsky wrote:
> > > > > > > > On Thu, Oct 24, 2019 at 10:17:43AM -0300, Jason Gunthorpe wrote:
> > > > > > > > > On Tue, Oct 15, 2019 at 11:07:33AM +0300, Leon Romanovsky wrote:
> > > > > > > > >
> > > > > > > > > > diff --git a/drivers/infiniband/core/netlink.c
> > > > > > > > > > b/drivers/infiniband/core/netlink.c
> > > > > > > > > > index 81dbd5f41bed..a3507b8be569 100644
> > > > > > > > > > +++ b/drivers/infiniband/core/netlink.c
> > > > > > > > > > @@ -42,9 +42,12 @@
> > > > > > > > > >  #include <linux/module.h>
> > > > > > > > > >  #include "core_priv.h"
> > > > > > > > > >
> > > > > > > > > > -static DEFINE_MUTEX(rdma_nl_mutex);
> > > > > > > > > >  static struct {
> > > > > > > > > > -	const struct rdma_nl_cbs *cb_table;
> > > > > > > > > > +	const struct rdma_nl_cbs __rcu *cb_table;
> > > > > > > > > > +	/* Synchronizes between ongoing netlink commands and netlink client
> > > > > > > > > > +	 * unregistration.
> > > > > > > > > > +	 */
> > > > > > > > > > +	struct srcu_struct unreg_srcu;
> > > > > > > > >
> > > > > > > > > A srcu in every index is serious overkill for this. Lets
> > > > > > > > > just us a
> > > > > > > > > rwsem:
> > > > > > > >
> > > > > > > > I liked previous variant more than rwsem, but it is Parav's patch.
> > > > > > >
> > > > > > > Why? srcu is a huge data structure and slow on unregister
> > > > > >
> > > > > > The unregister time is not so important for those IB/core modules.
> > > > > > I liked SRCU because it doesn't have *_ONCE() macros and smb_* calls.
> > > > >
> > > > > It does, they are just hidden under other macros..
> > > Its better that they are hidden. So that we don't need open code
> > > them.
> >
> > I wouldn't call swapping one function call for another 'open coding'
> >
> > > Also with srcu, we don't need lock annotations in get_cb_table()
> > > which releases and acquires semaphore.
> >
> > You don't need lock annoations for that.
> >
> > > Additionally lock nesting makes overall more complex.
> >
> > SRCU nesting is just as complicated! Don't think SRCU magically hides
> > that issue, it is still proposing to nest SRCU read side sections.
> >
> > > Given that there are only 3 indices, out of which only 2 are outside
> > > of the ib_core module and unlikely to be unloaded, I also prefer
> > > srcu version.
> >
> > Why? It isn't faster, it uses more memory, it still has the same
> > complex concurrency arrangement..
>
> Jason,
>
> It doesn't worth arguing, both Parav and I prefer SRCU variant, you prefer
> rwsem, so go for it, take rwsem, it is not important.
>

Jason's memory size point made me curious about the srcu_struct size.
On my x86 5.x kernel, srcu_struct costs 70+ Kbytes, likely due to some
debug info in my kernel. That is probably a good reason to shift to
rwsem in this case (rwsem is 80 bytes).

One small comment correction is needed:

-	rdma_nl_types[index].cb_table = cb_table;
-	mutex_unlock(&rdma_nl_mutex);
+	/* Pairs with the READ_ONCE in is_nl_valid() */
+	smp_store_release(&rdma_nl_types[index].cb_table, cb_table);

It should be "/* Pairs with the READ_ONCE in get_cb_table() */".

> Thanks
>
> >
> > Jason
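
P.S. For anyone following the comment fix above, here is a minimal
userspace sketch of the store/load pairing that the comment describes.
It is not the patch code: it uses C11 atomics as a stand-in for the
kernel's smp_store_release() and READ_ONCE(), and the names struct cbs,
cb_tables, register_cbs() and get_cbs() are hypothetical. The acquire
load is somewhat stronger than the dependency ordering READ_ONCE()
relies on in the kernel; it is used here only because it is the
portable C11 equivalent.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for struct rdma_nl_cbs; not the kernel definition. */
struct cbs {
	int (*doit)(int arg);
};

#define NUM_CLIENTS 3

/* One published callback-table pointer per client index, initially NULL. */
static _Atomic(const struct cbs *) cb_tables[NUM_CLIENTS];

/*
 * Writer side: publish the table. The release store orders all earlier
 * initialisation of *table before the pointer becomes visible to readers
 * (userspace analogue of smp_store_release()).
 */
void register_cbs(unsigned int index, const struct cbs *table)
{
	atomic_store_explicit(&cb_tables[index], table, memory_order_release);
}

/*
 * Reader side: load the pointer exactly once and use only that snapshot.
 * This is the load that the writer's release store pairs with.
 */
const struct cbs *get_cbs(unsigned int index)
{
	if (index >= NUM_CLIENTS)
		return NULL;
	return atomic_load_explicit(&cb_tables[index], memory_order_acquire);
}

/* Trivial callback used to exercise the table. */
static int echo(int arg)
{
	return arg;
}

static const struct cbs echo_cbs = { .doit = echo };

/* Small self-check: returns 0 if the published table is seen by the reader. */
int run_demo(void)
{
	const struct cbs *t;

	register_cbs(1, &echo_cbs);
	t = get_cbs(1);
	assert(t == &echo_cbs);
	return t->doit(0);
}
```

The point of the comment fix is only that the release store should name
the function that actually performs the paired load, so a reader of the
code can find both halves of the ordering.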