On Fri, Sep 07, 2018 at 12:14:58PM -0600, Jason Gunthorpe wrote: > On Fri, Sep 07, 2018 at 08:40:10PM +0300, Leon Romanovsky wrote: > > On Thu, Sep 06, 2018 at 09:55:19PM -0600, Jason Gunthorpe wrote: > > > On Wed, Sep 05, 2018 at 05:21:37PM -0600, Jason Gunthorpe wrote: > > > > > > > Instead hold on to the actual mm directly inside the umme via mmgrab() > > > > and mmdrop(), just like mmu_notifiers already does internally. > > > > > > I coded up a series to do this, and more: > > > > > > https://github.com/jgunthorpe/linux/commits/tgid_removal > > > > > > I'll try to test it later, but it is the general idea.. ucontext->tgid > > > is an abomination and needs to be deleted. > > > > > > Have to do some testing on it.. > > > > I tried the series with my repro for use-after-free bug in ODP plus reverted > > commit "50704e039ab1 RDMA/umem: Restore lockdep check while downgrading lock" > > just to be sure and got the following splat. I have similar lockdep > > warning without reverting too. > > Ah, neat, I was just about to look at the ODP flow.. > > Hm.. The fact the old kernel and test doesn't trigger this probably > explains the bug you are chasing. > > ib_umem_odp_release calls mmu_notifier_unregister under the > umem_rwsem. If the notifier is still live unregister immediately calls > ib_umem_notifier_release. The first thing the release function does is > to also grab the rwsem! > > So this was never, ever, working properly. I think the same, to be one the safe side tested official rdma/for-next commit 2c910cb75e1f without any unaccepted patches yet. The same lockdep warning. > > The only way to avoid this splat is to either never call unregister > (woops!) or to only call unregister in cases where the mm has already > been mput (which is not possible to guarentee at this point) We need to ensure that ib_umem_notifier_release is the only one who can access invalidated umem, it will remove the need of "down_read(&context->umem_rwsem);" > > Question is, is something wrong in my patch that a extra mmget() is > floating about (preventing __exit_mmap), or did dropping off the tgid > maddness do it.. > > Jason
Attachment:
signature.asc
Description: PGP signature