On Mon, Mar 26, 2018 at 09:08:33PM +0100, Chris Wilson wrote:
> Quoting Patchwork (2018-03-26 17:53:44)
> > Test gem_userptr_blits:
> >     Subgroup coherency-unsync:
> >         pass -> INCOMPLETE (shard-hsw)
>
> Forgot that obj->userptr.mn may not exist.
>
> >     Subgroup dmabuf-sync:
> >         pass -> DMESG-WARN (shard-hsw)
>
> But this is the tricky lockdep one, warning of the recursion from gup
> into mmu_invalidate_range, i.e.
>
>     down_read(&i915_mmu_notifier->sem);
>     down_read(&mm_struct->mmap_sem);
>     gup();
>     down_write(&i915_mmu_notifier->sem);
>
> That seems a genuine deadlock... So I wonder how we managed to get a
> lockdep splat and not a dead machine. Maybe gup never triggers the
> recursion for our set of flags? Hmm.

Coffee starting to kick in. If we gup a range it's likely the mm won't
kick out the same range, but something else. I guess we'd need a really
huge userptr bo which can't fit into core completely to actually have a
reliable chance at triggering this. Would probably deadlock the box :-/

I think Jerome's recommendation is the sequence counter stuff from kvm,
plus retrying forever on the gup side. That would convert the same
deadlock into a livelock, but well, can't have it all :-) And I think
once you've killed the task the gup worker hopefully realizes it's
wasting time and gives up.

For the kvm stuff: Look at #intel-gfx scrollback, we discussed all the
necessary bits. Plus Jerome showed some new helpers that would avoid the
hand-rolling.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
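
For illustration only, here is a rough userspace model of the "sequence
counter plus retry on the gup side" idea. It is a sketch under assumptions,
not the i915 or kvm code: every name in it (struct notifier_model,
invalidate_start(), invalidate_end(), pin_with_retry(), racing_pin()) is
made up for the example, a pthread mutex stands in for a short notifier
lock, and a callback stands in for gup. The point it tries to show is that
the slow pin runs without the notifier lock held, so the invalidate path
never has to wait on it; the price is that the pin side may loop until it
either sees a stable counter or is told to give up.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical model, not kernel API. */
struct notifier_model {
	pthread_mutex_t lock;  /* short lock, never held across the slow pin */
	unsigned long seq;     /* bumped each time an invalidation completes */
	unsigned int active;   /* invalidations currently in flight */
};

/* Invalidate side: only takes the short lock, so it cannot block on gup. */
static void invalidate_start(struct notifier_model *n)
{
	pthread_mutex_lock(&n->lock);
	n->active++;
	pthread_mutex_unlock(&n->lock);
}

static void invalidate_end(struct notifier_model *n)
{
	pthread_mutex_lock(&n->lock);
	assert(n->active > 0);
	n->active--;
	n->seq++;
	pthread_mutex_unlock(&n->lock);
}

/* Stand-in for gup: any slow operation that may race with an invalidation. */
typedef void (*slow_pin_fn)(struct notifier_model *n, void *arg);

/*
 * Pin side: sample the counter, do the slow pin unlocked, then check under
 * the lock whether an invalidation ran (or is still running) in between.
 * Returns the number of attempts used, or -1 if cancelled before a stable
 * result was reached (the "task got killed, worker gives up" case).
 */
static int pin_with_retry(struct notifier_model *n, slow_pin_fn pin,
			  void *arg, atomic_bool *cancelled)
{
	int attempts = 0;

	for (;;) {
		unsigned long snap;
		bool stale;

		if (atomic_load(cancelled))
			return -1;

		pthread_mutex_lock(&n->lock);
		snap = n->seq;
		pthread_mutex_unlock(&n->lock);

		pin(n, arg);  /* slow part, notifier lock NOT held */
		attempts++;

		pthread_mutex_lock(&n->lock);
		stale = n->active != 0 || n->seq != snap;
		/* A real user would publish the pinned pages here if !stale. */
		pthread_mutex_unlock(&n->lock);

		if (!stale)
			return attempts;
		/* Stale: a real user would drop the pages and go around again. */
	}
}

/* Demo pin: simulates one racing invalidation per call while *arg > 0. */
static void racing_pin(struct notifier_model *n, void *arg)
{
	int *races_left = arg;

	if (*races_left > 0) {
		(*races_left)--;
		invalidate_start(n);
		invalidate_end(n);
	}
}
```

With two simulated races, pin_with_retry(&n, racing_pin, &races, &cancelled)
needs three attempts before the counter holds still. If invalidations kept
arriving it would spin forever, which is the livelock mentioned above, and
the cancelled flag is the only way out in this model.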