On Thu, Sep 06, 2018 at 09:55:19PM -0600, Jason Gunthorpe wrote:
> On Wed, Sep 05, 2018 at 05:21:37PM -0600, Jason Gunthorpe wrote:
>
> > Instead hold on to the actual mm directly inside the umme via mmgrab()
> > and mmdrop(), just like mmu_notifiers already does internally.
>
> I coded up a series to do this, and more:
>
> https://github.com/jgunthorpe/linux/commits/tgid_removal
>
> I'll try to test it later, but it is the general idea.. ucontext->tgid
> is an abomination and needs to be deleted.
>
> Have to do some testing on it..

I tried the series with my repro for use-after-free bug in ODP plus
reverted commit "50704e039ab1 RDMA/umem: Restore lockdep check while
downgrading lock" just to be sure and got the following splat.

I have similar lockdep warning without reverting too.

[ 109.860433]
[ 109.860911] ============================================
[ 109.861475] WARNING: possible recursive locking detected
[ 109.862364] 4.19.0-rc2+ #150 Not tainted
[ 109.864026] --------------------------------------------
[ 109.871264] a.out/508 is trying to acquire lock:
[ 109.873640] 000000000149a260 (&ucontext->umem_rwsem){++++}, at: ib_umem_notifier_release+0x38/0x70 [i]
[ 109.891172]
[ 109.891172] but task is already holding lock:
[ 109.893709] 000000000149a260 (&ucontext->umem_rwsem){++++}, at: ib_umem_odp_release+0x616/0xd10 [ib_c]
[ 109.910607]
[ 109.910607] other info that might help us debug this:
[ 109.921194]  Possible unsafe locking scenario:
[ 109.921194]
[ 109.921893]        CPU0
[ 109.921997]        ----
[ 109.927359]   lock(&ucontext->umem_rwsem);
[ 109.932314]   lock(&ucontext->umem_rwsem);
[ 109.934924]
[ 109.934924]  *** DEADLOCK ***
[ 109.934924]
[ 109.945013]  May be due to missing lock nesting notation
[ 109.945013]
[ 109.947434] 4 locks held by a.out/508:
[ 109.955403]  #0: 000000007536cac0 (&file->ucontext_lock){+.+.}, at: uverbs_destroy_ufile_hw+0x2a2/0x2]
[ 109.970150]  #1: 00000000f2288191 (&file->hw_destroy_rwsem){++++}, at: uverbs_destroy_ufile_hw+0xb1/0]
[ 109.980486]  #2: 000000000149a260 (&ucontext->umem_rwsem){++++}, at: ib_umem_odp_release+0x616/0xd10 ]
[ 109.981960]  #3: 00000000e1547f54 (srcu){....}, at: mmu_notifier_unregister+0x103/0x340
[ 109.993992]
[ 109.993992] stack backtrace:
[ 109.994369] CPU: 9 PID: 508 Comm: a.out Not tainted 4.19.0-rc2+ #150
[ 110.003865] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS ?-20180531_142017-buildhw-08.phx4
[ 110.015203] Call Trace:
[ 110.015351]  dump_stack+0xf0/0x19b
[ 110.015864]  ? show_regs_print_info+0x5/0x5
[ 110.016027]  ? print_lock+0x39/0x81
[ 110.016507]  __lock_acquire+0xa97/0x2130
[ 110.016947]  ? mark_held_locks+0xa0/0xa0
[ 110.026221]  ? __lock_acquire+0x6e9/0x2130
[ 110.026397]  ? save_trace+0x106/0x1c0
[ 110.026564]  ? mark_held_locks+0xa0/0xa0
[ 110.027010]  ? __lock_acquire+0x6e9/0x2130
[ 110.029221]  ? mark_held_locks+0xa0/0xa0
[ 110.037254]  ? print_irqtrace_events+0x110/0x110
[ 110.037612]  ? lock_release+0x780/0x780
[ 110.048803]  ? pvclock_read_flags+0x50/0x50
[ 110.049549]  ? ib_umem_odp_unmap_dma_pages+0x13d/0x480 [ib_core]
[ 110.049978]  ? ___might_sleep+0x11d/0x330
[ 110.050484]  ? kvm_sched_clock_read+0x14/0x30
[ 110.062444]  ? sched_clock_cpu+0xb2/0x220
[ 110.062615] Failed to register mmu_notifier -4

Thanks

>
> Jason