On Tue, Jan 21, 2025 at 03:58:03PM -0800, Umesh Nerlige Ramappa wrote:
> On Tue, Jan 21, 2025 at 06:10:34PM -0500, Rodrigo Vivi wrote:
> > On Tue, Jan 21, 2025 at 02:25:57PM -0800, Umesh Nerlige Ramappa wrote:
> >
> > drm/i915/pmu as tag please...
>
> will do
>
> >
> > > When running igt@gem_exec_balancer@individual for multiple iterations,
> > > it is seen that the delta busyness returned by the PMU is 0. The issue stems
> > > from a combination of 2 implementation-specific details:
> > >
> > > 1) gt_park is throttling __update_guc_busyness_stats() so that it does
> > > not hog PCI bandwidth for some use cases. (Ref: 59bcdb564b3ba)
> > >
> > > 2) busyness implementation always returns monotonically increasing
> > > counters. (Ref: cf907f6d29421)
> > >
> > > If an application queried an engine while it was active,
> > > engine->stats.guc.running is set to true. Following that, if all PM
> > > wakerefs are released, then the gt is parked. At this time the throttling
> > > of __update_guc_busyness_stats() may result in a missed update to the
> > > running state of the engine (due to (1) above). This means subsequent
> > > calls to guc_engine_busyness() will think that the engine is still
> > > running and they will keep updating the cached counter (stats->total).
> > > This results in an inflated cached counter.
> > >
> > > Later, when the application runs a workload and queries for busyness, we
> > > return the cached value since it is larger than the actual value (due to
> > > (2) above).
> > >
> > > All subsequent queries will return the same large (inflated) value, so
> > > the application sees a delta busyness of zero.
> > >
> > > In order to fix the issue:
> > > - reset the running state of engines in busyness_park outside of the
> > > throttling code.
> > > - when busyness is queried and a PM wakeref is not active, do not
> > > calculate active engine time since we do not expect engines to be
> > > active without a wakeref.
> > >
> > > Closes: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/13366
> > > Fixes: cf907f6d2942 ("i915/guc: Ensure busyness counter increases motonically")
> > > Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@xxxxxxxxx>
> > > ---
> > >  .../gpu/drm/i915/gt/uc/intel_guc_submission.c | 18 +++++++++++++++++-
> > >  1 file changed, 17 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> > > index 12f1ba7ca9c1..b7a831e1fc85 100644
> > > --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> > > +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> > > @@ -1372,7 +1372,7 @@ static ktime_t guc_engine_busyness(struct intel_engine_cs *engine, ktime_t *now)
> > >  	}
> > >
> > >  	total = intel_gt_clock_interval_to_ns(gt, stats->total_gt_clks);
> > > -	if (stats->running) {
> > > +	if (wakeref && stats->running) {
> >
> > do you really need to check this wakeref if you are now setting running to
> > false earlier?
>
> Maybe not. I did try it without this wakeref and that passes too.
>
> >
> > And if we do, what about moving this entire block to inside the
> > existing if (wakeref) ?!
>
> Yes, looks like I could move it inside the existing wakeref check block, but
> I think I will drop this check since running is being set to false in the
> gt_park code path.
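
Sounds good. Just to spell out the direction for v2 then, a minimal sketch of
how this block could end up looking, assuming the park path clears
stats->running as in the hunk further below (untested, essentially the
pre-patch code plus a comment):

	total = intel_gt_clock_interval_to_ns(gt, stats->total_gt_clks);
	if (stats->running) {
		/*
		 * No extra wakeref test needed here: intel_guc_busyness_park()
		 * now clears engine->stats.guc.running, so a query issued
		 * after the GT has parked no longer adds a stale in-flight
		 * delta to the cached total.
		 */
		u64 clk = guc->timestamp.gt_stamp - stats->start_gt_clk;

		total += intel_gt_clock_interval_to_ns(gt, clk);
	}
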
>
> >
> > >  		u64 clk = guc->timestamp.gt_stamp - stats->start_gt_clk;
> > >
> > >  		total += intel_gt_clock_interval_to_ns(gt, clk);
> > > @@ -1469,6 +1469,19 @@ static void __reset_guc_busyness_stats(struct intel_guc *guc)
> > >  	spin_unlock_irqrestore(&guc->timestamp.lock, flags);
> > >  }
> > >
> > > +static void __update_guc_busyness_running_state(struct intel_guc *guc)
> > > +{
> > > +	struct intel_gt *gt = guc_to_gt(guc);
> > > +	struct intel_engine_cs *engine;
> > > +	enum intel_engine_id id;
> > > +	unsigned long flags;
> > > +
> > > +	spin_lock_irqsave(&guc->timestamp.lock, flags);
> > > +	for_each_engine(engine, gt, id)
> > > +		engine->stats.guc.running = false;
> > > +	spin_unlock_irqrestore(&guc->timestamp.lock, flags);
> > > +}
> > > +
> > >  static void __update_guc_busyness_stats(struct intel_guc *guc)
> > >  {
> > >  	struct intel_gt *gt = guc_to_gt(guc);
> > > @@ -1619,6 +1632,9 @@ void intel_guc_busyness_park(struct intel_gt *gt)
> > >  	if (!guc_submission_initialized(guc))
> > >  		return;
> > >
> > > +	/* Assume no engines are running and set running state to false */
> > > +	__update_guc_busyness_running_state(guc);
> > > +
> >
> > Why not move the entire __reset_guc_busyness_stats earlier then?
>
> Not sure what you mean by __reset_guc_busyness_stats because that is only
> called when reset is in progress.
>
> Do you mean __update_guc_busyness_stats()?
>
> Only the running state needs to be updated for every gt_park. Updates to
> other stats can be throttled based on the jiffies code in
> intel_guc_busyness_park().

ah okay then, please ignore me in this last point...

>
> Thanks,
> Umesh
>
> >
> > >  	/*
> > >  	 * There is a race with suspend flow where the worker runs after suspend
> > >  	 * and causes an unclaimed register access warning. Cancel the worker
> > > --
> > > 2.34.1
> > >
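
For completeness, the park-time ordering as I read it with this patch applied:
only the clearing of the running state happens on every park, while the heavier
stats sampling stays behind the existing throttle. Sketch only; the guc lookup,
the worker cancel and the jiffies check are paraphrased in comments rather than
copied from the source:

	void intel_guc_busyness_park(struct intel_gt *gt)
	{
		struct intel_guc *guc = &gt->uc.guc;

		if (!guc_submission_initialized(guc))
			return;

		/*
		 * Always clear engine->stats.guc.running on park so that a
		 * later guc_engine_busyness() query cannot keep inflating the
		 * cached total while the GT is idle.
		 */
		__update_guc_busyness_running_state(guc);

		/*
		 * Existing behaviour (paraphrased): cancel the stats worker to
		 * avoid the suspend-time race, and return early if busyness
		 * was sampled recently, so __update_guc_busyness_stats() does
		 * not hog PCI bandwidth on every park.
		 */

		__update_guc_busyness_stats(guc);
	}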