Quoting Tvrtko Ursulin (2018-06-04 13:52:39) > From: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxx> > > As Chris has discovered on his Ivybridge, and later automated test runs > have confirmed, on most of our platforms hrtimer faced with heavy GPU load > ca occasionally become sufficiently imprecise to affect PMU sampling s/ca/can/ > calculations. > > This means we cannot assume sampling frequency is what we asked for, but > we need to measure the interval ourselves. > > This patch is similar to Chris' original proposal for per-engine counters, > but instead of introducing a new set to work around the problem with > frequency sampling, it swaps around the way internal frequency accounting > is done. Instead of accumulating current frequency and dividing by > sampling frequency on readout, it accumulates frequency scaled by each > period. My ulterior motive failed to win favour ;) > Testcase: igt/perf_pmu/*busy* # snb, ivb, hsw > Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxx> > Suggested-by: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx> > Cc: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx> Reviewed-by: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx> > @@ -237,14 +251,21 @@ static enum hrtimer_restart i915_sample(struct hrtimer *hrtimer) > { > struct drm_i915_private *i915 = > container_of(hrtimer, struct drm_i915_private, pmu.timer); > + unsigned int period_ns; > + ktime_t now; > > if (!READ_ONCE(i915->pmu.timer_enabled)) > return HRTIMER_NORESTART; > > - engines_sample(i915); > - frequency_sample(i915); > + now = ktime_get(); > + period_ns = ktime_to_ns(ktime_sub(now, i915->pmu.timer_last)); > + i915->pmu.timer_last = now; Maybe worth a comment about the lag coming from the timer outweighs our concern with using a single timestamp over multiple samplers, with indeterminate delays due to forcewake. > + engines_sample(i915, period_ns); > + frequency_sample(i915, period_ns); > + > + hrtimer_forward(hrtimer, now, ns_to_ktime(PERIOD)); > > - hrtimer_forward_now(hrtimer, ns_to_ktime(PERIOD)); > return HRTIMER_RESTART; > } > > @@ -519,12 +540,12 @@ static u64 __i915_pmu_event_read(struct perf_event *event) > case I915_PMU_ACTUAL_FREQUENCY: > val = > div_u64(i915->pmu.sample[__I915_SAMPLE_FREQ_ACT].cur, > - FREQUENCY); > + 1000000 /* to MHz */); USEC_PER_SECS /* to MHz */ Storage in sample[] is MHz.usec, so here we cancel out the usec to leave MHz. -Chris _______________________________________________ Intel-gfx mailing list Intel-gfx@xxxxxxxxxxxxxxxxxxxxx https://lists.freedesktop.org/mailman/listinfo/intel-gfx