Quoting Tvrtko Ursulin (2020-01-14 11:37:09) > > On 14/01/2020 11:06, Chris Wilson wrote: > > Quoting Chris Wilson (2020-01-14 10:56:47) > >> The rc6 residency starts ticking from 0 from BIOS POST, but the kernel > >> starts measuring the time from its boot. If we start measuruing > >> I915_PMU_RC6_RESIDENCY while the GT is idle, we start our sampling from > >> 0 and then upon first activity (park/unpark) add in all the rc6 > >> residency since boot. After the first park with the sampler engaged, the > >> sleep/active counters are aligned. > >> > >> v2: With a wakeref to be sure > >> > >> Fixes: df6a42053513 ("drm/i915/pmu: Ensure monotonic rc6") > >> Signed-off-by: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx> > >> Cc: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxx> > >> --- > >> drivers/gpu/drm/i915/i915_pmu.c | 12 ++++++++++++ > >> 1 file changed, 12 insertions(+) > >> > >> diff --git a/drivers/gpu/drm/i915/i915_pmu.c b/drivers/gpu/drm/i915/i915_pmu.c > >> index 28a82c849bac..ec0299490dd4 100644 > >> --- a/drivers/gpu/drm/i915/i915_pmu.c > >> +++ b/drivers/gpu/drm/i915/i915_pmu.c > >> @@ -637,8 +637,10 @@ static void i915_pmu_enable(struct perf_event *event) > >> container_of(event->pmu, typeof(*i915), pmu.base); > >> unsigned int bit = event_enabled_bit(event); > >> struct i915_pmu *pmu = &i915->pmu; > >> + intel_wakeref_t wakeref; > >> unsigned long flags; > >> > >> + wakeref = intel_runtime_pm_get(&i915->runtime_pm); > > I think it would be nicer to use with_intel_runtime_pm directly at the > __get_rc6 call site. That would show/localise where it is actually needed. We can't, as it gets called under the spinlock :( And I don't see a way around that, as we require the fixup to be applied while idle and so require the wakeref. > >> spin_lock_irqsave(&pmu->lock, flags); > >> > >> /* > >> @@ -648,6 +650,14 @@ static void i915_pmu_enable(struct perf_event *event) > >> BUILD_BUG_ON(ARRAY_SIZE(pmu->enable_count) != I915_PMU_MASK_BITS); > >> GEM_BUG_ON(bit >= ARRAY_SIZE(pmu->enable_count)); > >> GEM_BUG_ON(pmu->enable_count[bit] == ~0); > >> + > >> + if (pmu->enable_count[bit] == 0 && > >> + config_enabled_mask(I915_PMU_RC6_RESIDENCY) & BIT_ULL(bit)) { > >> + pmu->sample[__I915_SAMPLE_RC6_LAST_REPORTED].cur = 0; > > > > I can't decide if it's better to have discrete sampling appear > > monotonic, or to reset just in case we drifted far off. > > What do you mean? > > This looks correct to me as you implemented it. On enable it samples the > real RC6 and updates pmu->sleep_last. So regardless if the next even > read comes with device awake or suspended it will report monotonic and > without adding up any time outside the enabled window. u64 sample[2]; fd = perf_open(RC6); sample[0] = read(fd); close(fd); fd = perf_open(RC6); sample[1] = read(fd); close(fd); /* assume idle system */ assert(sample[1] > sample[0]); Do we want that? I don't think that's required by the perf API, as the counters are only valid while the event is enabled (iirc). > Drift can normally come when we overestimate because hw RC6 can be less > than our time between park and unpark. I don't see how to reset that and > stay monotonic. Or you are thinking it doesn't need to be monotonic? I was mostly thinking of bugs, e.g. across suspend. We also have a problem if we wait longer than 2*counter_wrap between perf_event_reads, and we probably should install a very lax timer for rc6 to ensure the __get_rc6() calls remain monotonic. There was a fdo bug for that :) -Chris _______________________________________________ Intel-gfx mailing list Intel-gfx@xxxxxxxxxxxxxxxxxxxxx https://lists.freedesktop.org/mailman/listinfo/intel-gfx