Quoting Mika Kuoppala (2019-02-12 11:12:05)
> Chris Wilson <chris@xxxxxxxxxxxxxxxxxx> writes:
>
> > We cannot nest i915_reset_trylock() as the inner may wait for the
> > I915_RESET_BACKOFF which in turn is waiting upon sync_srcu who is
> > waiting for our outermost lock. As we take the reset srcu around the
> > fence update, we have to defer taking it in i915_gem_fault() until after
> > we acquire the pin on the fence to avoid nesting. This is a little ugly,
> > but still works. If a reset occurs between i915_vma_pin_fence() and the
> > second reset lock, the reset will restore the fence register back to the
> > pinned value before the reset lock allows us to proceed (our mmap won't
> > be revoked as we haven't yet marked it as being a userfault as that
> > requires us to hold the reset lock), so the pagefault is still
> > serialised with the revocation in reset.
> >
> > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109605
> > Fixes: 2caffbf11762 ("drm/i915: Revoke mmaps and prevent access to fence registers across reset")
> > Signed-off-by: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx>
> > Cc: Mika Kuoppala <mika.kuoppala@xxxxxxxxx>
> > ---
> >  drivers/gpu/drm/i915/i915_gem.c | 16 ++++++++--------
> >  1 file changed, 8 insertions(+), 8 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> > index c8c355bec091..ae1467a74a08 100644
> > --- a/drivers/gpu/drm/i915/i915_gem.c
> > +++ b/drivers/gpu/drm/i915/i915_gem.c
> > @@ -1923,16 +1923,16 @@ vm_fault_t i915_gem_fault(struct vm_fault *vmf)
> >  	if (ret)
> >  		goto err_unpin;
> >
> > +	ret = i915_vma_pin_fence(vma);
> > +	if (ret)
> > +		goto err_unpin;
> > +
>
> As this obviousness slipped past us, would it
> be worthwhile, in retrospect, to build a debug in
> i915_reset_trylock to be vocal about it failing
> to make progress?

If we stick a timeout in there, we just send that back to userspace.
Deadlock resolved just with a sporadic delay.
It is interruptible so it's not a complete loss, and more obvious if it
stalls? That's my thinking for not sending along the quick conversion to
wait_event_interruptible_timeout().

What I think we can do is stick a might_lock() so we get the lockdep
splat before the wait?
-Chris
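
For anyone following along, the nesting hazard and the might_lock() idea can be sketched with a toy, single-threaded userspace model in C. This is only an illustration under loud assumptions, not i915 code: struct toy_reset, toy_reset_trylock(), toy_reset_unlock(), toy_reset_can_complete(), toy_held_depth and the check_nesting flag are all hypothetical names. The flag stands in for a might_lock()-style annotation, which complains about nesting on every call instead of only when a reset happens to be pending.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only; every name here is hypothetical, none is i915 API. */
struct toy_reset {
	int readers;   /* stands in for active SRCU read-side sections */
	bool backoff;  /* stands in for I915_RESET_BACKOFF being set */
};

/* How many times the (single) modelled thread holds the reset lock. */
static int toy_held_depth;

enum toy_result { TOY_OK, TOY_WOULD_WAIT, TOY_NESTED };

/*
 * check_nesting models the might_lock() suggestion: report a nested
 * acquisition up front, even when no reset is pending and the call
 * would otherwise succeed without waiting.
 */
static enum toy_result toy_reset_trylock(struct toy_reset *r, bool check_nesting)
{
	if (check_nesting && toy_held_depth > 0)
		return TOY_NESTED;

	/* The real code would sleep here until the backoff clears. */
	if (r->backoff)
		return TOY_WOULD_WAIT;

	r->readers++;
	toy_held_depth++;
	return TOY_OK;
}

static void toy_reset_unlock(struct toy_reset *r)
{
	assert(toy_held_depth > 0);
	r->readers--;
	toy_held_depth--;
}

/* The reset (sync_srcu analogue) can only finish once all readers left. */
static bool toy_reset_can_complete(const struct toy_reset *r)
{
	return r->readers == 0;
}
```

Taking the lock once and then calling toy_reset_trylock() again with check_nesting set returns TOY_NESTED even though no backoff is pending. With backoff set while the outer lock is still held, the inner call reports TOY_WOULD_WAIT and toy_reset_can_complete() stays false, which is the deadlock shape described in the commit message.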