Re: [RFC] drm/i915: Emit to ringbuffer directly

Chris Wilson <chris@xxxxxxxxxxxxxxxxxx> · Fri, 9 Sep 2016 14:40:51 +0100

On Fri, Sep 09, 2016 at 09:32:50AM +0100, Tvrtko Ursulin wrote:
> 
> On 08/09/16 17:40, Chris Wilson wrote:
> >On Thu, Sep 08, 2016 at 04:12:55PM +0100, Tvrtko Ursulin wrote:
> >>From: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxx>
> >>
> >>This removes the usage of intel_ring_emit in favour of
> >>directly writing to the ring buffer.
> >
> >I have the same patch! But I called it out, for historical reasons.
> 
> Yes, I know we talked about it in the past, but I did not think you
> would find time to actually write it amongst all the other things.
> 
> >Oh, except mine uses out[0]...out[N] because gcc prefers that over
> >*out++ = ...
> 
> It copes just fine with the latter here, for example:
> 
> 	*rbuf++ = cmd;
> 	*rbuf++ = I915_GEM_HWS_SCRATCH_ADDR | MI_FLUSH_DW_USE_GTT;
> 	*rbuf++ = 0; /* upper addr */
> 	*rbuf++ = 0; /* value */
> 
> Is:
> 
>      3e9:       89 10                   mov    %edx,(%rax)
>      3eb:       c7 40 04 04 01 00 00    movl   $0x104,0x4(%rax)
>      3f2:       c7 40 08 00 00 00 00    movl   $0x0,0x8(%rax)
>      3f9:       c7 40 0c 00 00 00 00    movl   $0x0,0xc(%rax)

Great. Last time we had a conversation about this, and when we looked at
constructing batchbuffers in userspace, gcc was still generating two
instructions (*ptr followed by ptr++) rather than emitting the mov to a
fixed offset for that sequence.
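
For reference, the two styles being compared can be sketched as below.
This is only an illustration: emit_postinc(), emit_indexed() and the
EX_* constants are hypothetical stand-ins, not the driver's functions
or definitions. The constants are picked so that their OR is 0x104, the
immediate seen in the disassembly quoted above. With a compiler that
folds the increments, both functions should end up as stores to fixed
offsets from the same base.

```c
#include <stdint.h>

/* Hypothetical stand-ins, chosen so that the OR below is 0x104. */
#define EX_SCRATCH_ADDR	0x100u
#define EX_USE_GTT	0x4u

/* Post-increment style: every store also advances the pointer. */
static uint32_t *emit_postinc(uint32_t *out, uint32_t cmd)
{
	*out++ = cmd;
	*out++ = EX_SCRATCH_ADDR | EX_USE_GTT;
	*out++ = 0; /* upper addr */
	*out++ = 0; /* value */
	return out;
}

/* Indexed style: fixed offsets, pointer advanced once at the end. */
static uint32_t *emit_indexed(uint32_t *out, uint32_t cmd)
{
	out[0] = cmd;
	out[1] = EX_SCRATCH_ADDR | EX_USE_GTT;
	out[2] = 0; /* upper addr */
	out[3] = 0; /* value */
	return out + 4;
}
```

Both variants write the same four dwords and return the same advanced
pointer, so the choice between them only matters if the compiler
generates different code for each.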

> >plus an earlier
> >
> >  drivers/gpu/drm/i915/i915_gem_request.c |  26 ++---
> >  drivers/gpu/drm/i915/intel_lrc.c        | 121 ++++++++---------------
> >  drivers/gpu/drm/i915/intel_ringbuffer.c | 168 +++++++++++---------------------
> >  drivers/gpu/drm/i915/intel_ringbuffer.h |  10 +-
> >  4 files changed, 112 insertions(+), 213 deletions(-)
> >
> >since I wanted parts of it for emitting timelines.
> 
> Ok what do you want to do?

I have plans to use that particular patch soon, but updating
intel_ring_begin() itself is a long way down my list. Given that you have
a patch ready, let's keep going. I'm just curious as to what I did
differently to trim off the extra lines (probably intel_ring_advance()).
The other thing I did was to relax the restriction to only emit in
qword-aligned packets (by fixing up the tail for qword alignment on
sealing the request). Also, I would rather the function be expressed as
operating on the request; i915_gem_request_emit() was my choice.
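
A rough sketch of the tail fix-up idea, under stated assumptions:
struct ex_request, ex_request_emit(), ex_request_seal() and EX_NOOP are
all hypothetical names for illustration and are not taken from either
patch. Emission accepts any dword count, and the qword alignment is
restored once, when the request is sealed, by padding with a single
no-op dword. Space reservation and ring wrap are deliberately left out.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical padding command; a stand-in for a hardware no-op. */
#define EX_NOOP	0u

/* Hypothetical request: a reserved block of dwords plus a tail index. */
struct ex_request {
	uint32_t *cs;	/* start of the space reserved for this request */
	size_t tail;	/* number of dwords written so far */
};

/* Hand out space for ndwords; odd counts are allowed here. */
static uint32_t *ex_request_emit(struct ex_request *rq, size_t ndwords)
{
	uint32_t *out = rq->cs + rq->tail;

	rq->tail += ndwords;
	return out;
}

/* On sealing, pad with one no-op if the tail is not qword aligned. */
static void ex_request_seal(struct ex_request *rq)
{
	if (rq->tail & 1)
		rq->cs[rq->tail++] = EX_NOOP;
}
```

A caller would emit, say, three dwords through ex_request_emit() and
then call ex_request_seal(), which appends one EX_NOOP so the tail ends
on an even dword count. Sealing a request that is already aligned
changes nothing.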
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre