On Mon, Nov 04, 2024 at 01:41:03PM -0800, Zhanjun Dong wrote:
> GuC to host communication is interrupt driven, the handling has 3
> parts: interrupt context, tasklet and request queue worker.
> During GuC reset prepare, interrupt is disabled before destroy
> contexts steps start. The IRQ and worker are flushed to finish
> any outstanding in-progress message handling. But, the tasklet
> flush is missing, it might causes 2 race conditions:
> 1. Tasklet runs after IRQ flushed, add request to queue after worker
> flush started, causes unexpected G2H message request processing,
> meanwhile, reset prepare code already get the context destroyed.
> This will causes error reported about bad context state.
> (https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/11349 and
> https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/12303)
> 2. Tasklet runs after intel_guc_submission_reset_prepare,
> ct_try_receive_message start to run, while intel_uc_reset_prepare
> already finished guc sanitize and set ct->enable to false. This will
> causes warning on incorrect ct->enable state.
> (https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/12439)
>
> Add the missing tasklet flush to flush all 3 parts.
>

Tvrtko,

Zhanjun later found out that this patch deserves Fixes and Cc: stable
tags. I wonder if it would be possible to manually pick this to
drm-intel-fixes and, while at it, add:

Fixes: eb5e7da736f3 ("drm/i915/guc: Reset implementation for new GuC interface")
Cc: stable@xxxxxxxxxxxxxxx # v6.1+

Thoughts on including the tags while cherry-picking for the fixes?
If okay, could you please do this, since you are in charge of this round
of the drm-fixes?

The merged commit is:
b939a08bc378 ("drm/i915/guc: Flush ct receive tasklet during reset preparation")

Thanks,
Rodrigo.

> Signed-off-by: Zhanjun Dong <zhanjun.dong@xxxxxxxxx>
> Reviewed-by: Alan Previn <alan.previn.teres.alexis@xxxxxxxxx>
> ---
>  drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c | 4 ++++
>  1 file changed, 4 insertions(+)
>
> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> index 9ede6f240d79..353a9167c9a4 100644
> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> @@ -1688,6 +1688,10 @@ void intel_guc_submission_reset_prepare(struct intel_guc *guc)
>  	spin_lock_irq(guc_to_gt(guc)->irq_lock);
>  	spin_unlock_irq(guc_to_gt(guc)->irq_lock);
>
> +	/* Flush tasklet */
> +	tasklet_disable(&guc->ct.receive_tasklet);
> +	tasklet_enable(&guc->ct.receive_tasklet);
> +
>  	guc_flush_submissions(guc);
>  	guc_flush_destroyed_contexts(guc);
>  	flush_work(&guc->ct.requests.worker);
> --
> 2.34.1
>
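
For anyone reading along for the stable backport, race 1 from the commit
message can be illustrated with a small user-space model. This is only a
sketch and not i915 code: struct g2h_model and every model_* function are
hypothetical names invented here, and each "flush" simply moves messages to
the next stage. The in_tasklet counter stands for a tasklet that is still
executing, which is the case the tasklet_disable()/tasklet_enable() pair
waits for.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the three G2H handling stages; not i915 code. */
struct g2h_model {
	int in_irq;              /* messages held by the interrupt handler */
	int in_tasklet;          /* messages a running tasklet still holds */
	int in_worker;           /* requests queued for the worker */
	int handled;             /* requests the worker has processed */
	int late_requests;       /* requests processed after destroy */
	bool contexts_destroyed; /* set once reset prepare is done */
};

/* Interrupt stage hands everything it holds to the tasklet stage. */
void model_flush_irq(struct g2h_model *m)
{
	m->in_tasklet += m->in_irq;
	m->in_irq = 0;
}

/* Tasklet stage finishes and queues its requests for the worker. */
void model_flush_tasklet(struct g2h_model *m)
{
	m->in_worker += m->in_tasklet;
	m->in_tasklet = 0;
}

/* Worker drains its queue; anything drained after destroy is "late". */
void model_flush_worker(struct g2h_model *m)
{
	if (m->contexts_destroyed)
		m->late_requests += m->in_worker;
	m->handled += m->in_worker;
	m->in_worker = 0;
}

/*
 * Model of reset prepare. With flush_tasklet == false the middle stage
 * is skipped, as before the patch. Returns the number of requests that
 * were processed after the contexts were destroyed.
 */
int model_reset_prepare(struct g2h_model *m, bool flush_tasklet)
{
	assert(m != NULL);

	model_flush_irq(m);
	if (flush_tasklet)
		model_flush_tasklet(m);
	model_flush_worker(m);
	m->contexts_destroyed = true;

	/* Whatever the tasklet still held completes only now. */
	model_flush_tasklet(m);
	model_flush_worker(m);

	return m->late_requests;
}
```

With one message pending in the interrupt stage, the model reports one late
request when the tasklet stage is skipped and none when all three stages are
flushed in order. It says nothing about timing on real hardware.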