Quoting Tvrtko Ursulin (2017-07-19 10:12:33) > > On 18/07/2017 16:19, Chris Wilson wrote: > > Quoting Tvrtko Ursulin (2017-07-18 15:36:15) > >> From: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxx> > >> > >> Track total time requests have been executing on the hardware. > >> > >> To make this cheap it is hidden behind a static branch with the > >> intention that it is only enabled when there is a consumer > >> listening. This means that in the default off case the total > >> cost of the tracking is just a few no-op instructions on the > >> fast paths. > > > >> +static inline void intel_engine_context_in(struct intel_engine_cs *engine) > >> +{ > >> + if (static_branch_unlikely(&i915_engine_stats_key)) { > >> + unsigned long flags; > >> + > >> + spin_lock_irqsave(&engine->stats.lock, flags); > > > > What's the purpose of this lock? RMW is ordered by virtue of the tasklet > > (only one cpu can be doing the rmw at any time). Did I miss another > > user? > > Hm it should be a plain spin_lock and a _bh variant on the external API. > > But the purpose is to allow atomic sampling of accumulated busy time > plus the current engine status. You can do that with a lockless read, if you treat it as a seqlock and the total as the latch. Something like u64 total, latch, start; total = engine->stats.total; do { latch = total; start = engine->stats.start; smp_rmb(); total = engine->stats.total; } while (total != latch); total += ktime_now() - latch; Assuming x86-64. If we only have 32b atomic reads, it is less fun. > >> + if (engine->stats.ref++ == 0) > >> + engine->stats.start = ktime_get_real_ns(); > > > > Use ktime_get_raw() and leave the conversion to ns to the caller. > > You mean both store in ktime_t and don't fetch the wall time but > monotonic? Do you say this because perf pmu perhaps needs monotonic or > for some other reason? We only want monotonic (otherwise we will observe time going backwards), but whether or not we want the clock source compensation? I was under the impression we could defer that compensation until later. > > What is the cost of a ktime nowadays? > > I don't see it in the profiles, so not much I think. readtsc is there, but not enough to worry. The effect on execution latency is in the noise on initial inspection. -Chris _______________________________________________ Intel-gfx mailing list Intel-gfx@xxxxxxxxxxxxxxxxxxxxx https://lists.freedesktop.org/mailman/listinfo/intel-gfx