Quoting Tvrtko Ursulin (2018-05-25 18:31:35) > > On 25/05/2018 18:11, Chris Wilson wrote: > > hrtimer is not reliable enough to assume fixed intervals, and so even > > coarse accuracy (in the face of kasan and similar heavy debugging) we > > need to measure the actual interval between sample. > > It doesn't even average out to something acceptable under such Kconfigs? > Horror.. precise but inaccurate. /O\ > > > While using a single timestamp to compute the interval does not allow > > very fine accuracy (consider the impact of a slow forcewake between > > different samples after the timestamp is read) is much better than > > assuming the interval. > > > > Testcase: igt/perf_pmu #ivb > > Signed-off-by: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx> > > Cc: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxx> > > --- > > drivers/gpu/drm/i915/i915_pmu.c | 20 +++++++++++++------- > > drivers/gpu/drm/i915/i915_pmu.h | 4 ++++ > > 2 files changed, 17 insertions(+), 7 deletions(-) > > > > diff --git a/drivers/gpu/drm/i915/i915_pmu.c b/drivers/gpu/drm/i915/i915_pmu.c > > index dc87797db500..f5087515eb43 100644 > > --- a/drivers/gpu/drm/i915/i915_pmu.c > > +++ b/drivers/gpu/drm/i915/i915_pmu.c > > @@ -127,6 +127,7 @@ static void __i915_pmu_maybe_start_timer(struct drm_i915_private *i915) > > { > > if (!i915->pmu.timer_enabled && pmu_needs_timer(i915, true)) { > > i915->pmu.timer_enabled = true; > > + i915->pmu.timestamp = ktime_get(); > > hrtimer_start_range_ns(&i915->pmu.timer, > > ns_to_ktime(PERIOD), 0, > > HRTIMER_MODE_REL_PINNED); > > @@ -160,7 +161,7 @@ update_sample(struct i915_pmu_sample *sample, u32 unit, u32 val) > > sample->cur += mul_u32_u32(val, unit); > > } > > > > -static void engines_sample(struct drm_i915_private *dev_priv) > > +static void engines_sample(struct drm_i915_private *dev_priv, u64 period) > > { > > struct intel_engine_cs *engine; > > enum intel_engine_id id; > > @@ -183,7 +184,7 @@ static void engines_sample(struct drm_i915_private *dev_priv) > > val = !i915_seqno_passed(current_seqno, last_seqno); > > > > update_sample(&engine->pmu.sample[I915_SAMPLE_BUSY], > > - PERIOD, val); > > + period, val); > > > > if (val && (engine->pmu.enable & > > (BIT(I915_SAMPLE_WAIT) | BIT(I915_SAMPLE_SEMA)))) { > > @@ -195,10 +196,10 @@ static void engines_sample(struct drm_i915_private *dev_priv) > > } > > > > update_sample(&engine->pmu.sample[I915_SAMPLE_WAIT], > > - PERIOD, !!(val & RING_WAIT)); > > + period, !!(val & RING_WAIT)); > > > > update_sample(&engine->pmu.sample[I915_SAMPLE_SEMA], > > - PERIOD, !!(val & RING_WAIT_SEMAPHORE)); > > + period, !!(val & RING_WAIT_SEMAPHORE)); > > } > > > > if (fw) > > @@ -207,7 +208,7 @@ static void engines_sample(struct drm_i915_private *dev_priv) > > intel_runtime_pm_put(dev_priv); > > } > > > > -static void frequency_sample(struct drm_i915_private *dev_priv) > > +static void frequency_sample(struct drm_i915_private *dev_priv, u64 period) > > Period is unused in this function. > > But more importantly that leads to a problem. When reading the counter > the frequencies accumulator is divided by FREQUENCY define, which is > inverse of PERIOD. If the error is big enough to mess up the engines > sampling, is it big enough to affect the frequencies as well? Yes, but fixing up frequencies I left for another patch, because it's going to be more involved (having to choose the divider more carefully) and would you believe it, but CI only complains about busy sampling ;) I passed in the period as a reminder. > Improving that would need average frequency between two counter reads. > Which looks tricky to shoehorn into the pmu api. Maybe primitive running > average would do. My plan was to expose cycles (Frequency x time) to the user, and then they calculate frequency by the comparing their own samples. (Since we give them (time, samples) for each pmu read). > > { > > if (dev_priv->pmu.enable & > > config_enabled_mask(I915_PMU_ACTUAL_FREQUENCY)) { > > @@ -237,12 +238,17 @@ static enum hrtimer_restart i915_sample(struct hrtimer *hrtimer) > > { > > struct drm_i915_private *i915 = > > container_of(hrtimer, struct drm_i915_private, pmu.timer); > > + ktime_t now, period; > > > > if (!READ_ONCE(i915->pmu.timer_enabled)) > > return HRTIMER_NORESTART; > > > > - engines_sample(i915); > > - frequency_sample(i915); > > + now = ktime_get(); > > + period = ktime_sub(now, i915->pmu.timestamp); > > + i915->pmu.timestamp = now; > > + > > + engines_sample(i915, period); > > + frequency_sample(i915, period); > > > > hrtimer_forward_now(hrtimer, ns_to_ktime(PERIOD)); > > return HRTIMER_RESTART; > > diff --git a/drivers/gpu/drm/i915/i915_pmu.h b/drivers/gpu/drm/i915/i915_pmu.h > > index 2ba735299f7c..0f1e4642077e 100644 > > --- a/drivers/gpu/drm/i915/i915_pmu.h > > +++ b/drivers/gpu/drm/i915/i915_pmu.h > > @@ -52,6 +52,10 @@ struct i915_pmu { > > * @timer: Timer for internal i915 PMU sampling. > > */ > > struct hrtimer timer; > > + /** > > + * @timestamp: Timestamp of last internal i915 PMU sampling. > > + */ > > + ktime_t timestamp; > > /** > > * @enable: Bitmask of all currently enabled events. > > * > > > > Patch looks okay, just loses the some of the optimisation potential so I > am guessing we won't be thinking about replacing multiplies and divides > with shift any more. :) > > But the question of frequency counters is now bothering me. > > And if this problem is limited to Kasan then how much we want to > complicate things to make that work? Not just kasan, but ivb really. That's the only to have never worked whatever the config. kasan affects more CI machines. -Chris _______________________________________________ Intel-gfx mailing list Intel-gfx@xxxxxxxxxxxxxxxxxxxxx https://lists.freedesktop.org/mailman/listinfo/intel-gfx