Re: [PATCH] drm/i915/pmu: Connect engine busyness stats from GuC to pmu

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, Oct 07, 2021 at 09:17:34AM +0100, Tvrtko Ursulin wrote:

On 06/10/2021 21:45, Umesh Nerlige Ramappa wrote:
On Wed, Oct 06, 2021 at 10:11:58AM +0100, Tvrtko Ursulin wrote:

[snip]

@@ -762,12 +764,25 @@ submission_disabled(struct intel_guc *guc)
 static void disable_submission(struct intel_guc *guc)
 {
     struct i915_sched_engine * const sched_engine = guc->sched_engine;
+    struct intel_gt *gt = guc_to_gt(guc);
+    struct intel_engine_cs *engine;
+    enum intel_engine_id id;
+    unsigned long flags;
     if (__tasklet_is_enabled(&sched_engine->tasklet)) {
         GEM_BUG_ON(!guc->ct.enabled);
         __tasklet_disable_sync_once(&sched_engine->tasklet);
         sched_engine->tasklet.callback = NULL;
     }
+
+    cancel_delayed_work(&guc->timestamp.work);

I am not sure when disable_submission gets called so a question - could it be important to call cancel_delayed_work_sync here to ensure if the worker was running it had exited before proceeding?

disable_submission is called in the reset_prepare path for uc resets. I see this happening only with busy-hang test which does a global gt reset. The counterpart for this is the guc_init_engine_stats which is called post reset in the path to initialize GuC.

I tried cancel_delayed_work_sync both here and in park. Seems to work fine, so will change the calls to _sync versions.

From park is not allowed to sleep so can't do sync from there. It might have been my question which put you on a wrong path, sorry. Now I think question remains what happens if the ping worker happens to be sampling GuC state as GuC is being reset? Do you need some sort of a lock to protect that, or make sure worker skips if reset in progress?


If ping ran after the actual gt reset, we should be okay. If it ran after we reset prev_total and before gt reset, then we have bad busyness. At the same time, skipping ping risks timestamp overflow. I am thinking skip ping, but update all stats in the reset_prepare path. reset_prepare is running with pm runtime.

On a different note, during reset, we need to store now-start into the total_gt_clks also because we may lose that information in the next pmu query or ping (post reset). Maybe I will store active_clks instead of running in the stats to do that.

Thanks,
Umesh




[Index of Archives]     [Linux DRI Users]     [Linux Intel Graphics]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [XFree86]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Linux Kernel]     [Linux SCSI]     [XFree86]
  Powered by Linux