On Wed, 2021-11-17 at 19:49 +0100, Thomas Hellström wrote: > > On 11/17/21 15:20, Matthew Auld wrote: > > In intel_context_do_pin_ww, when calling into the pre_pin > > hook(which is > > passed the ww context) it could in theory return -EDEADLK(which is > > very > > likely with debug kernels), once we start adding more ww locking in > > there, > > like in the next patch. If so then we need to be mindful of having > > to > > restart the do_pin at this point. > > > > If this is the kernel_context, or some other early in-kernel > > context > > where we have yet to setup the default_state, then we always > > inhibit the > > context restore, and instead rely on the delayed active_release to > > set > > the CONTEXT_VALID_BIT for us(if we even care), which should > > indicate > > that we have context switched away, and that our newly saved > > context > > state should now be valid. However, since we currently grab the > > active > > reference before the potential ww dance, we can end up setting the > > CONTEXT_VALID_BIT much too early, if we need to backoff, and then > > upon > > re-trying the do_pin, we could potentially cause the hardware to > > incorrectly load some garbage context state when later context > > switching > > to that context, but at the very least this will trigger the > > GEM_BUG_ON() in __engine_unpark. For now let's just move any ww > > dance > > stuff prior to arming the active reference. > > > > For normal user contexts this shouldn't be a concern, since we > > should > > already have the default_state ready when initialising the lrc > > state, > > and so there should be no concern with active_release somehow > > prematurely setting the CONTEXT_VALID_BIT. > > > > v2(Thomas): > > - Also re-order the union unwind Oh should this be s/union/onion/ ? > > > > Signed-off-by: Matthew Auld <matthew.auld@xxxxxxxxx> > > Cc: Thomas Hellström <thomas.hellstrom@xxxxxxxxxxxxxxx> > > Cc: Maarten Lankhorst <maarten.lankhorst@xxxxxxxxxxxxxxx> > > Reviewed-by: Thomas Hellström <thomas.hellstrom@xxxxxxxxxxxxxxx> > > > > --- > > drivers/gpu/drm/i915/gt/intel_context.c | 12 ++++++------ > > 1 file changed, 6 insertions(+), 6 deletions(-) > > > > diff --git a/drivers/gpu/drm/i915/gt/intel_context.c > > b/drivers/gpu/drm/i915/gt/intel_context.c > > index 5634d14052bc..4c296de1d67d 100644 > > --- a/drivers/gpu/drm/i915/gt/intel_context.c > > +++ b/drivers/gpu/drm/i915/gt/intel_context.c > > @@ -228,17 +228,17 @@ int __intel_context_do_pin_ww(struct > > intel_context *ce, > > if (err) > > return err; > > > > - err = i915_active_acquire(&ce->active); > > + err = ce->ops->pre_pin(ce, ww, &vaddr); > > if (err) > > goto err_ctx_unpin; > > > > - err = ce->ops->pre_pin(ce, ww, &vaddr); > > + err = i915_active_acquire(&ce->active); > > if (err) > > - goto err_release; > > + goto err_post_unpin; > > > > err = mutex_lock_interruptible(&ce->pin_mutex); > > if (err) > > - goto err_post_unpin; > > + goto err_release; > > > > intel_engine_pm_might_get(ce->engine); > > > > @@ -273,11 +273,11 @@ int __intel_context_do_pin_ww(struct > > intel_context *ce, > > > > err_unlock: > > mutex_unlock(&ce->pin_mutex); > > +err_release: > > + i915_active_release(&ce->active); > > err_post_unpin: > > if (!handoff) > > ce->ops->post_unpin(ce); > > -err_release: > > - i915_active_release(&ce->active); > > err_ctx_unpin: > > intel_context_post_unpin(ce); > >