On Fri, Sep 07, 2018 at 08:40:10PM +0300, Leon Romanovsky wrote:
> On Thu, Sep 06, 2018 at 09:55:19PM -0600, Jason Gunthorpe wrote:
> > On Wed, Sep 05, 2018 at 05:21:37PM -0600, Jason Gunthorpe wrote:
> >
> > > Instead hold on to the actual mm directly inside the umme via mmgrab()
> > > and mmdrop(), just like mmu_notifiers already does internally.
> >
> > I coded up a series to do this, and more:
> >
> > https://github.com/jgunthorpe/linux/commits/tgid_removal
> >
> > I'll try to test it later, but it is the general idea.. ucontext->tgid
> > is an abomination and needs to be deleted.
> >
> > Have to do some testing on it..
>
> I tried the series with my repro for use-after-free bug in ODP plus reverted
> commit "50704e039ab1 RDMA/umem: Restore lockdep check while downgrading lock"
> just to be sure and got the following splat. I have similar lockdep
> warning without reverting too.

Something changed in my test system: I'm getting the same warning even
without your patches, with both the "lockdep" patch and this ODP patch
reverted.

Thanks

>
> [ 109.860433]
> [ 109.860911] ============================================
> [ 109.861475] WARNING: possible recursive locking detected
> [ 109.862364] 4.19.0-rc2+ #150 Not tainted
> [ 109.864026] --------------------------------------------
> [ 109.871264] a.out/508 is trying to acquire lock:
> [ 109.873640] 000000000149a260 (&ucontext->umem_rwsem){++++}, at: ib_umem_notifier_release+0x38/0x70 [i]
> [ 109.891172]
> [ 109.891172] but task is already holding lock:
> [ 109.893709] 000000000149a260 (&ucontext->umem_rwsem){++++}, at: ib_umem_odp_release+0x616/0xd10 [ib_c]
> [ 109.910607]
> [ 109.910607] other info that might help us debug this:
> [ 109.921194] Possible unsafe locking scenario:
> [ 109.921194]
> [ 109.921893] CPU0
> [ 109.921997] ----
> [ 109.927359] lock(&ucontext->umem_rwsem);
> [ 109.932314] lock(&ucontext->umem_rwsem);
> [ 109.934924]
> [ 109.934924] *** DEADLOCK ***
> [ 109.934924]
> [ 109.945013] May be due to missing lock nesting notation
> [ 109.945013]
> [ 109.947434] 4 locks held by a.out/508:
> [ 109.955403] #0: 000000007536cac0 (&file->ucontext_lock){+.+.}, at: uverbs_destroy_ufile_hw+0x2a2/0x2]
> [ 109.970150] #1: 00000000f2288191 (&file->hw_destroy_rwsem){++++}, at: uverbs_destroy_ufile_hw+0xb1/0]
> [ 109.980486] #2: 000000000149a260 (&ucontext->umem_rwsem){++++}, at: ib_umem_odp_release+0x616/0xd10 ]
> [ 109.981960] #3: 00000000e1547f54 (srcu){....}, at: mmu_notifier_unregister+0x103/0x340
> [ 109.993992]
> [ 109.993992] stack backtrace:
> [ 109.994369] CPU: 9 PID: 508 Comm: a.out Not tainted 4.19.0-rc2+ #150
> [ 110.003865] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS
> ?-20180531_142017-buildhw-08.phx4
> [ 110.015203] Call Trace:
> [ 110.015351] dump_stack+0xf0/0x19b
> [ 110.015864] ? show_regs_print_info+0x5/0x5
> [ 110.016027] ? print_lock+0x39/0x81
> [ 110.016507] __lock_acquire+0xa97/0x2130
> [ 110.016947] ? mark_held_locks+0xa0/0xa0
> [ 110.026221] ? __lock_acquire+0x6e9/0x2130
> [ 110.026397] ? save_trace+0x106/0x1c0
> [ 110.026564] ? mark_held_locks+0xa0/0xa0
> [ 110.027010] ? __lock_acquire+0x6e9/0x2130
> [ 110.029221] ? mark_held_locks+0xa0/0xa0
> [ 110.037254] ? print_irqtrace_events+0x110/0x110
> [ 110.037612] ? lock_release+0x780/0x780
> [ 110.048803] ? pvclock_read_flags+0x50/0x50
> [ 110.049549] ? ib_umem_odp_unmap_dma_pages+0x13d/0x480 [ib_core]
> [ 110.049978] ? ___might_sleep+0x11d/0x330
> [ 110.050484] ? kvm_sched_clock_read+0x14/0x30
> [ 110.062444] ? sched_clock_cpu+0xb2/0x220
> [ 110.062615] Failed to register mmu_notifier -4
>
> Thanks
>
> >
> > Jason