On Tue, Aug 22, 2023 at 02:14:28PM +0000, Dong, Zhanjun wrote: > > > > -----Original Message----- > > From: Daniel Vetter <daniel@xxxxxxxx> > > Sent: August 22, 2023 9:51 AM > > To: Dong, Zhanjun <zhanjun.dong@xxxxxxxxx> > > Cc: intel-gfx@xxxxxxxxxxxxxxxxxxxxx; dri-devel@xxxxxxxxxxxxxxxxxxxxx; Harrison, > > John C <john.c.harrison@xxxxxxxxx>; Andi Shyti <andi.shyti@xxxxxxxxxxxxxxx>; > > Daniel Vetter <daniel@xxxxxxxx> > > Subject: Re: [PATCH v5] drm/i915: Avoid circular locking dependency when > > flush delayed work on gt reset > > > > On Fri, Aug 11, 2023 at 11:20:11AM -0700, Zhanjun Dong wrote: > > > This attempts to avoid circular locking dependency between flush delayed > > > work and intel_gt_reset. > > > When intel_gt_reset was called, task will hold a lock. > > > To cacel delayed work here, the _sync version will also acquire a lock, > > > which might trigger the possible cirular locking dependency warning. > > > When intel_gt_reset called, reset_in_progress flag will be set, add code > > > to check the flag, call async verion if reset is in progress. > > > > > > Signed-off-by: Zhanjun Dong <zhanjun.dong@xxxxxxxxx> > > > Cc: John Harrison <John.C.Harrison@xxxxxxxxx> > > > Cc: Andi Shyti <andi.shyti@xxxxxxxxxxxxxxx> > > > Cc: Daniel Vetter <daniel@xxxxxxxx> > > > --- > > > drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c | 11 ++++++++++- > > > 1 file changed, 10 insertions(+), 1 deletion(-) > > > > > > diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c > > b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c > > > index a0e3ef1c65d2..600388c849f7 100644 > > > --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c > > > +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c > > > @@ -1359,7 +1359,16 @@ static void guc_enable_busyness_worker(struct > > intel_guc *guc) > > > > > > static void guc_cancel_busyness_worker(struct intel_guc *guc) > > > { > > > - cancel_delayed_work_sync(&guc->timestamp.work); > > > + /* > > > + * When intel_gt_reset was called, task will hold a lock. > > > + * To cacel delayed work here, the _sync version will also acquire a lock, > > which might > > > + * trigger the possible cirular locking dependency warning. > > > > This is not even close to a locking bugfix. Consider this a formal nack, > > because the issue here is not even close to "needs more comments to > > explain what's going on". > > -Daniel > > The purpose of the comment here it is to explain locking issue condition > > > > > + * Check the reset_in_progress flag, call async verion if reset is in > > progress. > > > The comment here explains check with the flag to avoid locking condition. > The reset process is not considered to be complete in short time, other than that, do we missed anything? Either the _sync is not needed at all, in case you need to explain why. Which this patch doesn't. And if the _sync isn't needed, then it's probably not needed in all/most cases? Or the _sync is needed, and in that case you just replace a potential deadlock scenario with a potential race condition. In neither case should this patch here be merged. -Daniel > > > > + */ > > > + if (guc_to_gt(guc)->uc.reset_in_progress) > > > + cancel_delayed_work(&guc->timestamp.work); > > > + else > > > + cancel_delayed_work_sync(&guc->timestamp.work); > > > } > > > > > > static void __reset_guc_busyness_stats(struct intel_guc *guc) > > > -- > > > 2.34.1 > > > > > > > -- > > Daniel Vetter > > Software Engineer, Intel Corporation > > http://blog.ffwll.ch -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch