Quoting Tvrtko Ursulin (2018-02-07 09:20:27) > > On 06/02/2018 21:54, Imre Deak wrote: > > Hi Rafael, > > > > On Tue, Feb 06, 2018 at 09:11:02PM +0000, Chris Wilson wrote: > >> Quoting Tvrtko Ursulin (2018-02-06 18:33:11) > >>> From: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxx> > >>> > >>> We are not allowed to call intel_runtime_pm_get from the PMU counter read > >>> callback since the former can sleep, and the latter is running under IRQ > >>> context. > >>> > >>> To workaround this, we record the last known RC6 and while runtime > >>> suspended estimate its increase by querying the runtime PM core > >>> timestamps. > >>> > >>> Downside of this approach is that we can temporarily lose a chunk of RC6 > >>> time, from the last PMU read-out to runtime suspend entry, but that will > >>> eventually catch up, once device comes back online and in the presence of > >>> PMU queries. > >>> > >>> Also, we have to be careful not to overshoot the RC6 estimate, so once > >>> resumed after a period of approximation, we only update the counter once > >>> it catches up. With the observation that RC6 is increasing while the > >>> device is suspended, this should not pose a problem and can only cause > >>> slight inaccuracies due clock base differences. > >>> > >>> v2: Simplify by estimating on top of PM core counters. (Imre) > >>> > >>> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxx> > >>> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104943 > >>> Fixes: 6060b6aec03c ("drm/i915/pmu: Add RC6 residency metrics") > >>> Testcase: igt/perf_pmu/rc6-runtime-pm > >>> Cc: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxx> > >>> Cc: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx> > >>> Cc: Imre Deak <imre.deak@xxxxxxxxx> > >>> Cc: Jani Nikula <jani.nikula@xxxxxxxxxxxxxxx> > >>> Cc: Joonas Lahtinen <joonas.lahtinen@xxxxxxxxxxxxxxx> > >>> Cc: Rodrigo Vivi <rodrigo.vivi@xxxxxxxxx> > >>> Cc: David Airlie <airlied@xxxxxxxx> > >>> Cc: intel-gfx@xxxxxxxxxxxxxxxxxxxxx > >>> Cc: dri-devel@xxxxxxxxxxxxxxxxxxxxx > >>> --- > >>> drivers/gpu/drm/i915/i915_pmu.c | 93 ++++++++++++++++++++++++++++++++++------- > >>> drivers/gpu/drm/i915/i915_pmu.h | 6 +++ > >>> 2 files changed, 84 insertions(+), 15 deletions(-) > >>> > >>> diff --git a/drivers/gpu/drm/i915/i915_pmu.c b/drivers/gpu/drm/i915/i915_pmu.c > >>> index 1c440460255d..bfc402d47609 100644 > >>> --- a/drivers/gpu/drm/i915/i915_pmu.c > >>> +++ b/drivers/gpu/drm/i915/i915_pmu.c > >>> @@ -415,7 +415,81 @@ static int i915_pmu_event_init(struct perf_event *event) > >>> return 0; > >>> } > >>> > >>> -static u64 __i915_pmu_event_read(struct perf_event *event) > >>> +static u64 get_rc6(struct drm_i915_private *i915, bool locked) > >>> +{ > >>> + unsigned long flags; > >>> + u64 val; > >>> + > >>> + if (intel_runtime_pm_get_if_in_use(i915)) { > >>> + val = intel_rc6_residency_ns(i915, IS_VALLEYVIEW(i915) ? > >>> + VLV_GT_RENDER_RC6 : > >>> + GEN6_GT_GFX_RC6); > >>> + > >>> + if (HAS_RC6p(i915)) > >>> + val += intel_rc6_residency_ns(i915, GEN6_GT_GFX_RC6p); > >>> + > >>> + if (HAS_RC6pp(i915)) > >>> + val += intel_rc6_residency_ns(i915, GEN6_GT_GFX_RC6pp); > >>> + > >>> + intel_runtime_pm_put(i915); > >>> + > >>> + /* > >>> + * If we are coming back from being runtime suspended we must > >>> + * be careful not to report a larger value than returned > >>> + * previously. > >>> + */ > >>> + > >>> + if (!locked) > >>> + spin_lock_irqsave(&i915->pmu.lock, flags); > >>> + > >>> + if (val >= i915->pmu.sample[__I915_SAMPLE_RC6_ESTIMATED].cur) { > >>> + i915->pmu.sample[__I915_SAMPLE_RC6_ESTIMATED].cur = 0; > >>> + i915->pmu.sample[__I915_SAMPLE_RC6].cur = val; > >>> + } else { > >>> + val = i915->pmu.sample[__I915_SAMPLE_RC6_ESTIMATED].cur; > >>> + } > >>> + > >>> + if (!locked) > >>> + spin_unlock_irqrestore(&i915->pmu.lock, flags); > >>> + } else { > >>> + struct pci_dev *pdev = i915->drm.pdev; > >>> + struct device *kdev = &pdev->dev; > >>> + unsigned long flags2; > >>> + > >>> + /* > >>> + * We are runtime suspended. > >>> + * > >>> + * Report the delta from when the device was suspended to now, > >>> + * on top of the last known real value, as the approximated RC6 > >>> + * counter value. > >>> + */ > >>> + if (!locked) > >>> + spin_lock_irqsave(&i915->pmu.lock, flags); > >>> + > >>> + spin_lock_irqsave(&kdev->power.lock, flags2); > >>> + > >>> + if (!i915->pmu.sample[__I915_SAMPLE_RC6_ESTIMATED].cur) > >>> + i915->pmu.suspended_jiffies_last = > >>> + kdev->power.suspended_jiffies; > >>> + > >>> + val = kdev->power.suspended_jiffies - > >>> + i915->pmu.suspended_jiffies_last; > >>> + val += jiffies - kdev->power.accounting_timestamp; > >>> + > >>> + spin_unlock_irqrestore(&kdev->power.lock, flags2); > >>> + > >>> + val = jiffies_to_nsecs(val); > >>> + val += i915->pmu.sample[__I915_SAMPLE_RC6].cur; > >>> + i915->pmu.sample[__I915_SAMPLE_RC6_ESTIMATED].cur = val; > >>> + > >>> + if (!locked) > >>> + spin_unlock_irqrestore(&i915->pmu.lock, flags); > >>> + } > >>> + > >>> + return val; > >>> +} > >> > >> I feel slightly dirty, but the dance checks out. > > > > Would it be possible to add an RPM helper that provides the device's > > runtime suspend residency for the above purpose? This would be > > essentially what rtpm_suspended_time_show() provides. > > That would indeed be much better since fishing into internals like the > above is not very nice. > > However, it would also be good not to delay this fix for too long by > additional logistics, and keep it self-contained - easy to backport. Ok, I've failed to find any suggestion that is an improvement on the above. (Except suspended_jiffies_last needs only the pmu.lock, and then move the kdev->power interaction to its own function, say device_suspended_since()? But can't solve the locked? recursion.) Reviewed-by: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx> -Chris _______________________________________________ Intel-gfx mailing list Intel-gfx@xxxxxxxxxxxxxxxxxxxxx https://lists.freedesktop.org/mailman/listinfo/intel-gfx