On Tue, Jan 17, 2023 at 01:36:26PM -0800, John.C.Harrison@xxxxxxxxx wrote:
> From: John Harrison <John.C.Harrison@xxxxxxxxx>
>
> When GuC support was added to error capture, the locking around the
> request object was broken. Fix it up.
>
> The context based search manages the spinlocking around the search
> internally. So it needs to grab the reference count internally as
> well. The execlist only request based search relies on external
> locking, so it needs an external reference count. So no change to that
> code itself but the context version does change.
>
> The only other caller is the code for dumping engine state to debugfs.
> That code wasn't previously getting an explicit reference at all as it
> does everything while holding the execlist specific spinlock. So that
> needs updating as well as that spinlock doesn't help when using GuC
> submission. Rather than trying to conditionally get/put depending on
> submission model, just change it to always do the get/put.
>
> In addition, intel_guc_find_hung_context() was not acquiring the
> correct spinlock before searching the request list. So fix that up too.

> Fixes: dc0dad365c5e ("drm/i915/guc: Fix for error capture after full GPU reset
> with GuC")

Must be one line.

> Fixes: 573ba126aef3 ("drm/i915/guc: Capture error state on context reset")
> Cc: Matthew Brost <matthew.brost@xxxxxxxxx>
> Cc: John Harrison <John.C.Harrison@xxxxxxxxx>
> Cc: Jani Nikula <jani.nikula@xxxxxxxxxxxxxxx>
> Cc: Joonas Lahtinen <joonas.lahtinen@xxxxxxxxxxxxxxx>
> Cc: Rodrigo Vivi <rodrigo.vivi@xxxxxxxxx>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxxxxxxxx>
> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@xxxxxxxxx>
> Cc: Andrzej Hajda <andrzej.hajda@xxxxxxxxx>
> Cc: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx>
> Cc: Matthew Auld <matthew.auld@xxxxxxxxx>
> Cc: Matt Roper <matthew.d.roper@xxxxxxxxx>
> Cc: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@xxxxxxxxx>
> Cc: Michael Cheng <michael.cheng@xxxxxxxxx>
> Cc: Lucas De Marchi <lucas.demarchi@xxxxxxxxx>
> Cc: Tejas Upadhyay <tejaskumarx.surendrakumar.upadhyay@xxxxxxxxx>
> Cc: Andy Shevchenko <andriy.shevchenko@xxxxxxxxxxxxxxx>
> Cc: Aravind Iddamsetty <aravind.iddamsetty@xxxxxxxxx>
> Cc: Alan Previn <alan.previn.teres.alexis@xxxxxxxxx>
> Cc: Bruce Chang <yu.bruce.chang@xxxxxxxxx>
> Cc: intel-gfx@xxxxxxxxxxxxxxxxxxxxx

Is it possible to use the --to and --cc parameters of git send-email
instead of a noisy Cc list in the commit message?

...

> +	if (hung_rq)
> +		i915_request_put(hung_rq);

In the Linux kernel the idiom is that resource-freeing APIs should be
NULL-aware (or ERR_PTR-aware, or both). Does i915 follow that? If so, the
test should be inside i915_request_put() rather than in any of the callers.

...

> @@ -4847,6 +4857,7 @@ void intel_guc_find_hung_context(struct intel_engine_cs *engine)
>  				xa_lock(&guc->context_lookup);
>  				goto done;
>  			}
> +
>  next:
>  			intel_context_put(ce);
>  			xa_lock(&guc->context_lookup);

Stray change.

-- 
With Best Regards,
Andy Shevchenko
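
To illustrate the NULL-aware release idiom mentioned above, here is a
minimal userspace sketch. All names in it (struct obj, obj_create(),
obj_get(), obj_put()) are hypothetical stand-ins, not i915 or kernel APIs,
and it makes no claim about what i915_request_put() currently does.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical refcounted object; a stand-in for a request, not i915 code. */
struct obj {
	int refcount;
	int *freed_flag;	/* set to 1 on free, for demonstration only */
};

/* Allocate an object holding one initial reference. */
static struct obj *obj_create(int *freed_flag)
{
	struct obj *o = malloc(sizeof(*o));

	if (!o)
		return NULL;
	o->refcount = 1;
	o->freed_flag = freed_flag;
	return o;
}

/* Take a reference; passing NULL is allowed and returns NULL. */
static struct obj *obj_get(struct obj *o)
{
	if (o)
		o->refcount++;
	return o;
}

/*
 * Drop a reference. The NULL check lives here, once, so that callers
 * do not each need their own "if (ptr)" guard before calling.
 */
static void obj_put(struct obj *o)
{
	if (!o)
		return;

	assert(o->refcount > 0);
	if (--o->refcount == 0) {
		if (o->freed_flag)
			*o->freed_flag = 1;
		free(o);
	}
}
```

With a release function of this shape, a caller can write obj_put(ptr)
unconditionally, whether or not the earlier search found anything.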