Re: [PATCH 1/1] drm/i915: Fix ref->mutex deadlock in i915_active_wait()

Sultan Alsawaf <sultan@xxxxxxxxxxxxxxx> · Tue, 14 Apr 2020 07:52:13 -0700

On Tue, Apr 14, 2020 at 09:13:28AM +0100, Chris Wilson wrote:
> Quoting Sultan Alsawaf (2020-04-07 07:26:22)
> > From: Sultan Alsawaf <sultan@xxxxxxxxxxxxxxx>
> > 
> > The following deadlock exists in i915_active_wait() due to a double lock
> > on ref->mutex (call chain listed in order from top to bottom):
> >  i915_active_wait();
> >  mutex_lock_interruptible(&ref->mutex); <-- ref->mutex first acquired
> >  i915_active_request_retire();
> >  node_retire();
> >  active_retire();
> >  mutex_lock_nested(&ref->mutex, SINGLE_DEPTH_NESTING); <-- DEADLOCK
> > 
> > Fix the deadlock by skipping the second ref->mutex lock when
> > active_retire() is called through i915_active_request_retire().
> > 
> > Fixes: 12c255b5dad1 ("drm/i915: Provide an i915_active.acquire callback")
> > Cc: <stable@xxxxxxxxxxxxxxx> # 5.4.x
> > Signed-off-by: Sultan Alsawaf <sultan@xxxxxxxxxxxxxxx>
> 
> Incorrect. 
> 
> You missed that it cannot retire from inside the wait due to the active
> reference held on the i915_active for the wait.
> 
> The only point it can enter retire from inside i915_active_wait() is via
> the terminal __active_retire() which releases the mutex in doing so.
> -Chris

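The terminal __active_retire() and the rbtree_postorder_for_each_entry_safe()
loop retire different objects, so the terminal __active_retire() is not the
only point at which retirement can be entered from inside i915_active_wait().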
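
For illustration only, here is a minimal userspace sketch of the locking
pattern described in the patch: a retire path that takes a mutex the waiting
path already holds on the same thread. It is not the i915 code. The names
fake_active, fake_active_init(), fake_retire() and fake_wait() are
hypothetical, and a POSIX error-checking mutex stands in for ref->mutex so
that the relock is reported as EDEADLK instead of hanging. As far as I know,
the subclass passed to mutex_lock_nested() is only a lockdep annotation and
does not make a kernel mutex recursive.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Hypothetical stand-in for an object guarded by its own mutex. */
struct fake_active {
	pthread_mutex_t mutex;
};

/*
 * Use an error-checking mutex so that a relock by the owning thread
 * returns EDEADLK instead of blocking forever.
 */
static int fake_active_init(struct fake_active *ref)
{
	pthread_mutexattr_t attr;
	int err;

	err = pthread_mutexattr_init(&attr);
	if (err)
		return err;

	err = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
	if (!err)
		err = pthread_mutex_init(&ref->mutex, &attr);

	pthread_mutexattr_destroy(&attr);
	return err;
}

/* Retire path: takes the object's mutex, as the second lock in the chain. */
static int fake_retire(struct fake_active *ref)
{
	int err = pthread_mutex_lock(&ref->mutex);

	if (err)
		return err; /* EDEADLK when the caller already holds it */

	/* ... retirement work would go here ... */
	pthread_mutex_unlock(&ref->mutex);
	return 0;
}

/*
 * Wait path: holds ref->mutex while retiring 'retired'. Whether 'retired'
 * can ever be the same object as 'ref' is the question under discussion.
 */
static int fake_wait(struct fake_active *ref, struct fake_active *retired)
{
	int err = pthread_mutex_lock(&ref->mutex);

	if (err)
		return err;

	err = fake_retire(retired);
	pthread_mutex_unlock(&ref->mutex);
	return err;
}
```

In this sketch fake_wait(&a, &b) returns 0 for two distinct objects, while
fake_wait(&a, &a) returns EDEADLK, so a double lock only arises if the retire
path can reach the same object whose mutex the waiter holds.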

Sultan