On Wed, May 10, 2017 at 10:38 AM, Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxxxxxxxx> wrote:
>
> On 09/05/2017 19:11, Dmitry Rogozhkin wrote:
>>
>> On 5/9/2017 8:51 AM, Tvrtko Ursulin wrote:
>>>
>>> On 09/05/2017 16:29, Chris Wilson wrote:
>>>>
>>>> On Tue, May 09, 2017 at 04:16:41PM +0100, Tvrtko Ursulin wrote:
>>>>>
>>>>>
>>>>> On 09/05/2017 15:26, Chris Wilson wrote:
>>>>>>
>>>>>> On Tue, May 09, 2017 at 03:09:33PM +0100, Tvrtko Ursulin wrote:
>>>>>>>
>>>>>>> From: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxx>
>>>>>>>
>>>>>>> By popular customer demand here is the prototype for cheap engine
>>>>>>> utilization tracking.
>>>>>>
>>>>>>
>>>>>> customer and debugfs?
>>>>>
>>>>>
>>>>> Well I did write in one of the following paragraphs on this topic.
>>>>> Perhaps I should have put it in procfs. :) Sysfs API looks
>>>>> restrictive or perhaps I missed a way to get low level (fops) access
>>>>> to it.
>>>>>
>>>>>>> It uses static branches so in the default off case it really
>>>>>>> should be cheap.
>>>>>>
>>>>>>
>>>>>> Not as cheap (for the off case) as simply sampling RING_HEAD/RING_TAIL
>>>>>
>>>>>
>>>>> Off case are three no-op instructions in three places in the irq
>>>>> tasklet. And a little bit of object size growth, if you worry about
>>>>> that aspect?
>>>>
>>>>
>>>> It's just how the snowball begins.
>>>
>>>
>>> We should be able to control it. We also have to consider which one is
>>> lighter for this particular use case.
>>>
>>>>>> which looks to be the same level of detail. I wrapped all this up in a
>>>>>> perf interface once up a time...
>>>>>
>>>>>
>>>>> How does that work? Via periodic sampling? Accuracy sounds like it
>>>>> would be proportionate to the sampling frequency, no?
>>>>
>>>> Right, and the sampling frequency is under user control (via perf) with
>>>> a default of around 1000, gives a small systematic error when dealing
>>>> with %
>>>>
>>>> I included power, interrupts, rc6, frequency (and the statistics but I
>>>> never used those and dropped them once oa landed), as well as
>>>> utilisation, just for the convenience of having sane interface :)
>>>
>>>
>>> Can you resurrect those patches? Don't have to rebase and all but I
>>> would like to see them at least.
>>
>> Mind that the idea behind the requested kind of stats is primary usage
>> by the customers in the _product_ environment to track GPU occupancy and
>> predict based on this stats whether they can execute something else.
>> Which means that 1) debugfs and any kind of debug-like infrastructure is
>
>
> Yeah I acknowledged in the cover letter debugfs is not ideal.
>
> I could implement it in sysfs I suppose by doing time based transitions as
> opposed to having explicit open/release hooks. It wouldn't make a
> fundamental different to this RFC from the overhead point of view.
>
> But most importantly we need to see in detail how does Chris' perf based
> idea looks like and does it fit your requirements.

+1 on perf pmu, that sounds much more like the userspace interface
you're looking for. If it's not that, then perhaps hand-rolled like
the i915 OA stuff we now have (but starting out with a perf pmu sounds
much better, at least for anything global which doesn't need to be
per-context or per-batch).
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
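
For illustration only, the sampling-accuracy question quoted above can be
sketched as a small user-space simulation. This is not i915 or perf code:
the function names, the synthetic workload (2.5 ms busy every 7.3 ms) and
the one-second window are all assumptions made up for the example. It
compares the exact busy fraction of an engine with the estimate obtained
by polling its busy state at a fixed rate.

```python
# Hypothetical simulation, not driver code: all names and numbers below
# are illustrative assumptions.

NSEC_PER_SEC = 1_000_000_000


def periodic_workload(busy_ns, period_ns, window_ns):
    """Build (start, end) busy intervals: busy_ns of work every period_ns."""
    return [
        (start, min(start + busy_ns, window_ns))
        for start in range(0, window_ns, period_ns)
    ]


def exact_utilization(busy_intervals, window_ns):
    """Exact busy fraction: total busy time divided by the window length."""
    busy_total = sum(end - start for start, end in busy_intervals)
    return busy_total / window_ns


def sampled_utilization(busy_intervals, window_ns, sample_hz):
    """Estimate the busy fraction by polling the busy state at sample_hz."""
    assert sample_hz > 0
    sample_period_ns = NSEC_PER_SEC // sample_hz
    samples = 0
    busy_samples = 0
    for t in range(0, window_ns, sample_period_ns):
        samples += 1
        # A sample counts as busy if it falls inside any busy interval.
        if any(start <= t < end for start, end in busy_intervals):
            busy_samples += 1
    return busy_samples / samples


if __name__ == "__main__":
    window = NSEC_PER_SEC
    work = periodic_workload(2_500_000, 7_300_000, window)
    exact = exact_utilization(work, window)
    for hz in (10, 100, 1000):
        estimate = sampled_utilization(work, window, hz)
        print(f"{hz:5d} Hz: estimate={estimate:.4f} "
              f"exact={exact:.4f} error={abs(estimate - exact):.4f}")
```

With this synthetic workload the estimate moves closer to the exact value
as the sampling rate goes from 10 Hz to 1000 Hz. A workload whose period
lines up with the sampling period can alias badly, so the sketch says
nothing about worst-case error on real hardware.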