Quoting Tvrtko Ursulin (2018-05-30 11:57:39) > > On 25/05/2018 18:45, Chris Wilson wrote: > > Quoting Tvrtko Ursulin (2018-05-25 18:31:35) > >> > >> On 25/05/2018 18:11, Chris Wilson wrote: > >>> hrtimer is not reliable enough to assume fixed intervals, and so even > >>> coarse accuracy (in the face of kasan and similar heavy debugging) we > >>> need to measure the actual interval between sample. > >> > >> It doesn't even average out to something acceptable under such Kconfigs? > >> Horror.. precise but inaccurate. /O\ > >> > >>> While using a single timestamp to compute the interval does not allow > >>> very fine accuracy (consider the impact of a slow forcewake between > >>> different samples after the timestamp is read) is much better than > >>> assuming the interval. > >>> > >>> Testcase: igt/perf_pmu #ivb > >>> Signed-off-by: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx> > >>> Cc: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxx> > >>> --- > >>> drivers/gpu/drm/i915/i915_pmu.c | 20 +++++++++++++------- > >>> drivers/gpu/drm/i915/i915_pmu.h | 4 ++++ > >>> 2 files changed, 17 insertions(+), 7 deletions(-) > >>> > >>> diff --git a/drivers/gpu/drm/i915/i915_pmu.c b/drivers/gpu/drm/i915/i915_pmu.c > >>> index dc87797db500..f5087515eb43 100644 > >>> --- a/drivers/gpu/drm/i915/i915_pmu.c > >>> +++ b/drivers/gpu/drm/i915/i915_pmu.c > >>> @@ -127,6 +127,7 @@ static void __i915_pmu_maybe_start_timer(struct drm_i915_private *i915) > >>> { > >>> if (!i915->pmu.timer_enabled && pmu_needs_timer(i915, true)) { > >>> i915->pmu.timer_enabled = true; > >>> + i915->pmu.timestamp = ktime_get(); > >>> hrtimer_start_range_ns(&i915->pmu.timer, > >>> ns_to_ktime(PERIOD), 0, > >>> HRTIMER_MODE_REL_PINNED); > >>> @@ -160,7 +161,7 @@ update_sample(struct i915_pmu_sample *sample, u32 unit, u32 val) > >>> sample->cur += mul_u32_u32(val, unit); > >>> } > >>> > >>> -static void engines_sample(struct drm_i915_private *dev_priv) > >>> +static void engines_sample(struct drm_i915_private *dev_priv, u64 period) > >>> { > >>> struct intel_engine_cs *engine; > >>> enum intel_engine_id id; > >>> @@ -183,7 +184,7 @@ static void engines_sample(struct drm_i915_private *dev_priv) > >>> val = !i915_seqno_passed(current_seqno, last_seqno); > >>> > >>> update_sample(&engine->pmu.sample[I915_SAMPLE_BUSY], > >>> - PERIOD, val); > >>> + period, val); > >>> > >>> if (val && (engine->pmu.enable & > >>> (BIT(I915_SAMPLE_WAIT) | BIT(I915_SAMPLE_SEMA)))) { > >>> @@ -195,10 +196,10 @@ static void engines_sample(struct drm_i915_private *dev_priv) > >>> } > >>> > >>> update_sample(&engine->pmu.sample[I915_SAMPLE_WAIT], > >>> - PERIOD, !!(val & RING_WAIT)); > >>> + period, !!(val & RING_WAIT)); > >>> > >>> update_sample(&engine->pmu.sample[I915_SAMPLE_SEMA], > >>> - PERIOD, !!(val & RING_WAIT_SEMAPHORE)); > >>> + period, !!(val & RING_WAIT_SEMAPHORE)); > >>> } > >>> > >>> if (fw) > >>> @@ -207,7 +208,7 @@ static void engines_sample(struct drm_i915_private *dev_priv) > >>> intel_runtime_pm_put(dev_priv); > >>> } > >>> > >>> -static void frequency_sample(struct drm_i915_private *dev_priv) > >>> +static void frequency_sample(struct drm_i915_private *dev_priv, u64 period) > >> > >> Period is unused in this function. > >> > >> But more importantly that leads to a problem. When reading the counter > >> the frequencies accumulator is divided by FREQUENCY define, which is > >> inverse of PERIOD. If the error is big enough to mess up the engines > >> sampling, is it big enough to affect the frequencies as well? > > > > Yes, but fixing up frequencies I left for another patch, because it's > > going to be more involved (having to choose the divider more carefully) > > and would you believe it, but CI only complains about busy sampling ;) > > > > I passed in the period as a reminder. > > It might only remind someone to remove the unused variable so I am not > sure it is worth it. > > >> Improving that would need average frequency between two counter reads. > >> Which looks tricky to shoehorn into the pmu api. Maybe primitive running > >> average would do. > > > > My plan was to expose cycles (Frequency x time) to the user, and then > > they calculate frequency by the comparing their own samples. (Since we > > give them (time, samples) for each pmu read). > > Frequency times which time? On read? Don't exactly follow. :( On sample: sample[FREQ] += period * instantaneous_measurement The (perf_event_read) caller has to compute freq by d_cycles / d_time. i.e. we don't have a frequency sampler, but a cycles sampler. I think this is a big enough change that we'd have to declare new ABI (deprecate the old samplers, add new). [snip] > I am also thinking that with this approach we could start allowing timer > slack, since we are measuring it anyway. It would release back some of > the added cost of time queries. Well, not that they are significant. > Digression anyway. Interesting, yeah. > I want to understand why this is a problem on IVB and what is the > solution for frequency counters. I can only say that from the looks of it, ivb has very poor hrtimer resolution (I wonder if tsc/hpet are being used!). I think ivb was before art? I too do not know quite why ivb is so special, but it is the only one to have consistency failed in all drm-tip shard runs. -Chris _______________________________________________ Intel-gfx mailing list Intel-gfx@xxxxxxxxxxxxxxxxxxxxx https://lists.freedesktop.org/mailman/listinfo/intel-gfx