Re: [RFC 0/3] Engine utilization tracking

Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxxxxxxxx> · Wed, 10 May 2017 09:38:24 +0100

On 09/05/2017 19:11, Dmitry Rogozhkin wrote:
On 5/9/2017 8:51 AM, Tvrtko Ursulin wrote:
On 09/05/2017 16:29, Chris Wilson wrote:
On Tue, May 09, 2017 at 04:16:41PM +0100, Tvrtko Ursulin wrote:

On 09/05/2017 15:26, Chris Wilson wrote:
On Tue, May 09, 2017 at 03:09:33PM +0100, Tvrtko Ursulin wrote:
From: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxx>

By popular customer demand here is the prototype for cheap engine
utilization tracking.

customer and debugfs?

Well I did write in one of the following paragraphs on this topic.
Perhaps I should have put it in procfs. :) Sysfs API looks
restrictive or perhaps I missed a way to get low level (fops) access
to it.

It uses static branches so in the default off case it really
should be cheap.

Not as cheap (for the off case) as simply sampling RING_HEAD/RING_TAIL

The off case is three no-op instructions in three places in the irq
tasklet. And a little bit of object size growth, if you worry about
that aspect?

It's just how the snowball begins.

We should be able to control it. We also have to consider which one is
lighter for this particular use case.

which looks to be the same level of detail. I wrapped all this up in a
perf interface once upon a time...

How does that work? Via periodic sampling? Accuracy sounds like it
would be proportionate to the sampling frequency, no?

Right, and the sampling frequency is under user control (via perf) with
a default of around 1000, which gives a small systematic error when
dealing with percentages.
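To illustrate the accuracy point, here is a minimal sketch (not the i915 or perf code) of estimating utilisation by periodic sampling and comparing it with the exact busy fraction. The function names, the microsecond timeline and the busy intervals are all made up for the example.

```python
# Hypothetical illustration: periodic sampling vs. exact busy-time accounting.
# All times are integer microseconds to avoid floating point edge cases.

def exact_utilization(busy_intervals, duration_us):
    """Exact busy fraction from half-open [start, end) busy intervals."""
    busy = sum(end - start for start, end in busy_intervals)
    return busy / duration_us


def sampled_utilization(busy_intervals, duration_us, period_us):
    """Estimate the busy fraction by checking the engine state every period_us."""
    sample_times = range(0, duration_us, period_us)
    hits = sum(
        1
        for t in sample_times
        if any(start <= t < end for start, end in busy_intervals)
    )
    return hits / len(sample_times)


# Made-up timeline: one second with two busy periods.
BUSY = [(0, 250_300), (500_000, 600_450)]
DURATION_US = 1_000_000

exact = exact_utilization(BUSY, DURATION_US)
fine = sampled_utilization(BUSY, DURATION_US, 1_000)      # 1000 samples/s
coarse = sampled_utilization(BUSY, DURATION_US, 100_000)  # 10 samples/s

print(f"exact={exact:.5f} fine={fine:.5f} coarse={coarse:.5f}")
```

With this particular timeline the fine-grained estimate lands close to the exact value while the coarse one is far off, which is the sense in which accuracy follows the sampling frequency.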

I included power, interrupts, rc6, frequency (and the statistics but I
never used those and dropped them once oa landed), as well as
utilisation, just for the convenience of having a sane interface :)

Can you resurrect those patches? Don't have to rebase and all but I
would like to see them at least.
Mind that the idea behind the requested kind of stats is primarily usage
by the customers in the _product_ environment to track GPU occupancy and
predict based on these stats whether they can execute something else.
Which means that 1) debugfs and any kind of debug-like infrastructure is

Yeah I acknowledged in the cover letter debugfs is not ideal.

I could implement it in sysfs I suppose by doing time based transitions 
as opposed to having explicit open/release hooks. It wouldn't make a 
fundamental difference to this RFC from the overhead point of view.
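One possible reading of "time based transitions", sketched below under stated assumptions: tracking switches on when the file is read and switches itself off again once nobody has read it for a while, so no open/release hooks are needed. The class, its method names and the timeout are hypothetical and not taken from the patches.

```python
# Hypothetical sketch: enable stats on read, disable after an idle timeout.

class TimedStatsSwitch:
    def __init__(self, timeout_s):
        self.timeout_s = timeout_s  # assumed idle period before switching off
        self.enabled = False
        self.last_read = None

    def on_read(self, now):
        """Called whenever the stats file is read; (re)arms tracking."""
        self.enabled = True
        self.last_read = now

    def on_tick(self, now):
        """Called periodically; turns tracking off once reads have stopped."""
        if self.enabled and now - self.last_read > self.timeout_s:
            self.enabled = False


switch = TimedStatsSwitch(timeout_s=2.0)
switch.on_read(now=10.0)   # a reader shows up, tracking turns on
switch.on_tick(now=11.0)   # still within the timeout, stays on
state_within_timeout = switch.enabled
switch.on_tick(now=12.5)   # no reads for more than 2 s, turns off
state_after_timeout = switch.enabled
```

The cost in the enabled state would be the same as with explicit hooks; only the way the on/off transition is triggered changes.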

But most importantly we need to see in detail what Chris' perf based
idea looks like and whether it fits your requirements.

really not an option, 2) any kind of restriction is not an option (like
disabling RC6 states). Also, there is no need to expose low-level detailed
information like how many EUs and VMEs were in use - this belongs to the
debug things. As of now the i915 driver exposes only a single required
metric: gt_act_freq_mhz.

I suppose it doesn't matter if the perf based solution (or any really)
exports more than what you want/need, since you can select the events
you are interested in.

But the overhead and accuracy of both solutions, plus some other 
considerations like maintainability, need to be looked at.

Regards,

Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/intel-gfx