Re: pin OABUFFER to GGTT

"Bragg, Robert" <robert.bragg@xxxxxxxxx> · Tue, 8 Jul 2014 00:59:13 +0100

On Mon, Jul 7, 2014 at 9:43 PM, Daniel Vetter <daniel@xxxxxxxx> wrote:

On Tue, Jul 01, 2014 at 08:54:27PM +0100, Chris Wilson wrote:

> On Tue, Jul 01, 2014 at 05:16:30PM +0000, Mateo Lozano, Oscar wrote:

> > > The issue is they need:

> > >

> > > A) A buffer object.

> > > B) Bound to GGTT.

> > > C) That userspace knows the GGTT offset of, so that they can program

> > > OABUFFER with it.

> > > D) That userspace can map so that they can read the reported counters.

> > >

> > > They used to create a bo, call bo_pin on it, use args->offset to program

> > > OABUFFER (via MI_LOAD_REGISTER_IMM, I imagine), map it and read the

> > > counter values. They cannot do this anymore.

> >

> > The answer might be that all of this needs to be done by the kernel

> > itself, but then we need to provide an interface to userspace...

>

> Yes. If you need to pin a buffer for a register, then it needs to be

> handled by the kernel. Especially one that provides information about

> other users.

Short-circuiting the entire discussion here. Afaik there's two OA modes:

- inline with the batch with MI_REPORT_PERF

- global with the ringbuffer setup with the OABUFFER registers

There's one other: we can read the counters via mmio for HSW+ too. I've found that quite convenient for experimenting with capturing OA counters from userspace. A notable disadvantage of reading via mmio, though, is that there's no latch-and-hold mechanism to make sure the counters are frozen while they are read back-to-back.
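To illustrate the missing latch, here is a minimal simulation rather than real register access. FakeCounters, its rates and its one-tick-per-read cost are all hypothetical stand-ins; the only point is that free-running counters keep advancing between consecutive reads.

```python
class FakeCounters:
    """Hypothetical stand-in for free-running counters read via mmio.

    Assumption: every register read costs one 'tick', and each counter
    advances by its own rate per tick, so nothing is held still between
    two consecutive reads.
    """

    def __init__(self, rates):
        self.rates = rates  # per-counter increment per tick
        self.tick = 0

    def read(self, index):
        self.tick += 1  # time passes while we perform the read
        return self.rates[index] * self.tick


def snapshot(counters):
    """Read every counter back-to-back, with no latch to freeze them."""
    return [counters.read(i) for i in range(len(counters.rates))]


hw = FakeCounters(rates=[10, 10, 10])
print(snapshot(hw))  # [10, 20, 30]
```

All three simulated counters count at the same rate, yet the snapshot reports three different values because each one was sampled a tick later than the previous one. Ratios computed from such a snapshot are skewed by however long the reads take.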

The latter should indeed be fully controlled by the kernel as Chris

suggested and exposed as an off-cpu performance monitoring unit through

the perf subsystem. Chris has rfc patches floating somewhere to do this

for other gpu perf data.

Just to let folks know: I've recently started playing around with Chris' perf patch, plus a couple of small patches on top, and am planning to experiment with exposing the OA counters this way soon.

I was hoping to pick up the discussion of this patch soon too and give some comments, but I'd like to collect some concrete data as a reference point first, to be more confident in my own understanding of how things are behaving.
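For anyone unfamiliar with how a driver-provided PMU appears to userspace: perf publishes each dynamically registered PMU under /sys/bus/event_source/devices/<name>/, and the "type" file there holds the id that goes into perf_event_attr.type. A minimal sketch of looking that id up follows. The PMU name "i915" is an assumption for illustration, not something the RFC patches are confirmed to use, and pmu_type is a hypothetical helper.

```python
import os

# Standard sysfs location where perf lists registered PMUs.
SYSFS_PMU_ROOT = "/sys/bus/event_source/devices"


def pmu_type(name, root=SYSFS_PMU_ROOT):
    """Return the dynamic perf type id for PMU `name`, or None if absent."""
    path = os.path.join(root, name, "type")
    try:
        with open(path) as f:
            return int(f.read().strip())
    except (FileNotFoundError, NotADirectoryError):
        return None


# "i915" is an assumed PMU name; this prints None where no such PMU exists.
print(pmu_type("i915"))
```

A tool would place the returned id in perf_event_attr.type before calling perf_event_open(); the event encoding that goes into the config field would be defined by whatever the kernel patches end up exposing.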

One fun thing here is the coordination between these two OA modes since

iirc they both use the same setup registers for the performance counter

configuration. No idea yet how to solve this. 

But really userspace shouldn't program ggtt offset, not even

debug/performance measuring tools.

I wanted to pop my head up here so others are aware that I'm another person looking at this area, aiming to understand how best to make this data available to both GL and tools; I'm initially considering Mesa's performance query extensions (which don't always report reliable data currently) and tools like intel-gpu-top.

--
Regards,
Robert

-Daniel

--

Daniel Vetter

Software Engineer, Intel Corporation

+41 (0) 79 365 57 48 - http://blog.ffwll.ch

_______________________________________________

Intel-gfx mailing list

Intel-gfx@xxxxxxxxxxxxxxxxxxxxx

http://lists.freedesktop.org/mailman/listinfo/intel-gfx
