Hi John, On Fri, Mar 29, 2024 at 04:53:05PM -0700, John.C.Harrison@xxxxxxxxx wrote: > From: John Harrison <John.C.Harrison@xxxxxxxxx> > > The previous fix for the circlular lock splat about the busyness > worker wasn't quite complete. Even though the reset-in-progress flag > is cleared at the start of intel_uc_reset_finish, the entire function > is still inside the reset mutex lock. Not sure why the patch appeared > to fix the issue both locally and in CI. However, it is now back > again. > > There is a further complication the wedge code path within > intel_gt_reset() jumps around so much it results in nested > reset_prepare/_finish calls. That is, the call sequence is: > intel_gt_reset > | reset_prepare > | __intel_gt_set_wedged > | | reset_prepare > | | reset_finish > | reset_finish > > The nested finish means that even if the clear of the in-progress flag > was moved to the end of _finish, it would still be clear for the > entire second call. Surprisingly, this does not seem to be causing any > other problems at present. > > As an aside, a wedge on fini does not call the finish functions at > all. The reset_in_progress flag is left set (twice). > > So instead of trying to cancel the worker anywhere at all in the reset > path, just add a cancel to intel_guc_submission_fini instead. Note > that it is not a problem if the worker is still active during a reset. > Either it will run before the reset path starts locking things and > will simply block the reset code for a tiny amount of time. Or it will > run after the locks have been acquired and will early exit due to the > try-lock. > > Also, do not use the reset-in-progress flag to decide whether a > synchronous cancel is safe (from a lockdep perspective) or not. > Instead, use the actual reset mutex state (both the genuine one and > the custom rolled BACKOFF one). > > Fixes: 0e00a8814eec ("drm/i915/guc: Avoid circular locking issue on busyness flush") > Signed-off-by: John Harrison <John.C.Harrison@xxxxxxxxx> > Cc: Zhanjun Dong <zhanjun.dong@xxxxxxxxx> > Cc: John Harrison <John.C.Harrison@xxxxxxxxx> > Cc: Andi Shyti <andi.shyti@xxxxxxxxxxxxxxx> > Cc: Daniel Vetter <daniel@xxxxxxxx> > Cc: Daniel Vetter <daniel.vetter@xxxxxxxx> > Cc: Rodrigo Vivi <rodrigo.vivi@xxxxxxxxx> > Cc: Nirmoy Das <nirmoy.das@xxxxxxxxx> > Cc: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxx> > Cc: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@xxxxxxxxx> > Cc: Andrzej Hajda <andrzej.hajda@xxxxxxxxx> > Cc: Matt Roper <matthew.d.roper@xxxxxxxxx> > Cc: Jonathan Cavitt <jonathan.cavitt@xxxxxxxxx> > Cc: Prathap Kumar Valsan <prathap.kumar.valsan@xxxxxxxxx> > Cc: Alan Previn <alan.previn.teres.alexis@xxxxxxxxx> > Cc: Madhumitha Tolakanahalli Pradeep <madhumitha.tolakanahalli.pradeep@xxxxxxxxx> > Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@xxxxxxxxx> > Cc: Ashutosh Dixit <ashutosh.dixit@xxxxxxxxx> > Cc: Dnyaneshwar Bhadane <dnyaneshwar.bhadane@xxxxxxxxx> Looks good: Reviewed-by: Andi Shyti <andi.shyti@xxxxxxxxxxxxxxx> Thanks, Andi