Just some minor nits on header. Otherwise, LGTM:

Reviewed-by: Alan Previn <alan.previn.teres.alexis@xxxxxxxxx>

On Wed, 2024-10-30 at 15:38 -0700, Zhanjun Dong wrote:
> GuC to host communication is interrupt driven, the handling has 3
> parts: interrupt context, tasklet and request queue worker.
> During GuC reset prepare, interrupt is disabled before destroy
> contexts steps start. The IRQ and worker flushed to finish
alan: "and worker are flushed to finish"
> in progress message handling if there are. The tasklet flush is
alan: "any outstanding in-progress message handling. But, the tasklet flush..."
> missing, it might causes 2 race conditions:
> 1. Tasklet runs after IRQ flushed, add request to queue after worker
>    flush started, causes unexpected G2H message request processing,
>    meanwhile, reset prepare code already get the context destroyed.
>    This will causes error reported about bad context state.
> 2. Tasklet runs after intel_guc_submission_reset_prepare,
>    ct_try_receive_message start to run, while intel_uc_reset_prepare
>    already finished guc sanitize and set ct->enable to false. This will
>    causes warning on incorrect ct->enable state.
>
> Add the missing tasklet flush to flush all 3 parts.
>
> Signed-off-by: Zhanjun Dong <zhanjun.dong@xxxxxxxxx>
> Cc: John Harrison <John.C.Harrison@xxxxxxxxx>
> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@xxxxxxxxx>
> ---
>  drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c | 4 ++++
>  1 file changed, 4 insertions(+)
>
> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> index 9ede6f240d79..353a9167c9a4 100644
> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> @@ -1688,6 +1688,10 @@ void intel_guc_submission_reset_prepare(struct
> intel_guc *guc)
alan: i still feel like we should be just killing off the guc at this
point (via GT_RESTT) before any of the following reset prep sequences.
But as per offline conversation, we agreed that might be too intrusive
a change for i915 while new design ideas are being concentrated on Xe.
>  	spin_lock_irq(guc_to_gt(guc)->irq_lock);
>  	spin_unlock_irq(guc_to_gt(guc)->irq_lock);
>
> +	/* Flush tasklet */
> +	tasklet_disable(&guc->ct.receive_tasklet);
> +	tasklet_enable(&guc->ct.receive_tasklet);
> +
>  	guc_flush_submissions(guc);
>  	guc_flush_destroyed_contexts(guc);
>  	flush_work(&guc->ct.requests.worker);
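
For anyone following along, race 1 from the commit message can be walked through with a small userspace model. This is only a sketch under assumptions: it is single-threaded, every name in it (g2h_model, model_tasklet_run, model_worker_flush, model_reset_prepare, model_selfcheck) is hypothetical and not part of i915, and it treats the tasklet_disable()/tasklet_enable() pair simply as "wait for the in-flight tasklet to finish". It models one message that the receive tasklet is holding when reset prepare starts.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical single-threaded model; none of these names exist in i915. */
struct g2h_model {
	bool tasklet_in_flight; /* tasklet holds a message it has not queued yet */
	int queued_requests;    /* requests waiting for the worker */
	bool context_alive;     /* false once reset prepare destroyed the context */
	int bad_state_errors;   /* requests handled after the context was destroyed */
};

/* Tasklet body: hand the in-flight message over to the request queue. */
void model_tasklet_run(struct g2h_model *m)
{
	if (m->tasklet_in_flight) {
		m->tasklet_in_flight = false;
		m->queued_requests++;
	}
}

/* Worker body: drain the queue, counting requests that hit a dead context. */
void model_worker_flush(struct g2h_model *m)
{
	while (m->queued_requests > 0) {
		m->queued_requests--;
		if (!m->context_alive)
			m->bad_state_errors++;
	}
}

/* One reset prepare; returns the number of bad-context errors seen. */
int model_reset_prepare(bool flush_tasklet)
{
	struct g2h_model m = {
		.tasklet_in_flight = true,
		.queued_requests = 0,
		.context_alive = true,
		.bad_state_errors = 0,
	};

	/* The IRQ flush is assumed to have happened already. */
	if (flush_tasklet)
		model_tasklet_run(&m); /* stands in for the disable/enable pair */

	model_worker_flush(&m);    /* stands in for the worker flush */
	m.context_alive = false;   /* reset prepare destroys the context */

	/* Without the flush, the tasklet finishes late and queues its request. */
	model_tasklet_run(&m);
	model_worker_flush(&m);

	return m.bad_state_errors;
}

/* Usage: compare the two orderings. */
void model_selfcheck(void)
{
	assert(model_reset_prepare(false) == 1); /* late request, dead context */
	assert(model_reset_prepare(true) == 0);  /* request handled in time */
}
```

In this model the only difference between the two runs is whether the tasklet is drained before the worker flush, which is the ordering the patch adds. It says nothing about race 2 or about real concurrency in the driver.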