On Tue, Oct 05, 2021 at 03:05:25PM +0200, Thomas Hellström wrote:
> Hi, Tvrtko,
>
> On 10/5/21 13:31, Tvrtko Ursulin wrote:
> > From: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxx>
> >
> > In short this makes i915 work for hybrid setups (DRI_PRIME=1 with Mesa)
> > when rendering is done on Intel dgfx and scanout/composition on Intel
> > igfx.
> >
> > Before this patch the driver was not quite ready for that setup, mainly
> > because it was able to emit a semaphore wait between the two GPUs, which
> > results in deadlocks because the semaphore target location in the HWSP
> > is neither shared between the two, nor mapped in both GGTT spaces.
> >
> > To fix it the patch adds an additional check to a couple of relevant
> > code paths in order to prevent using semaphores for inter-engine
> > synchronisation when the relevant objects are not in the same GGTT
> > space.
> >
> > v2:
> >  * Avoid adding rq->i915. (Chris)
> >
> > v3:
> >  * Use GGTT which describes the limit more precisely.
> >
> > Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxx>
> > Cc: Daniel Vetter <daniel.vetter@xxxxxxxx>
> > Cc: Matthew Auld <matthew.auld@xxxxxxxxx>
> > Cc: Thomas Hellström <thomas.hellstrom@xxxxxxxxxxxxxxx>
>
> An IMO pretty important bugfix. I read up a bit on the previous
> discussion on this, and from what I understand the other two options
> were
>
> 1) Ripping out the semaphore code,
> 2) Considering dma-fences from other instances of the same driver as
>    foreign.
>
> For imported dma-bufs we do 2), but particularly with lmem and p2p
> that's a more straightforward decision.
>
> I don't think 1) is a reasonable approach to fix this bug (but perhaps
> as a general cleanup?), and for 2) yes, I guess we might end up doing
> that, unless we find some real benefits in treating
> same-driver-separate-device dma-fences as local, but for this
> particular bug, IMO this is a reasonable fix.

The foreign dma-fences have uapi impact, which Tvrtko shrugged off as
"it's a good idea", and no, it's really just not. So we still need to
do that properly.

> Reviewed-by: Thomas Hellström <thomas.hellstrom@xxxxxxxxxxxxxxx>

But I'm also ok with just merging this as-is so the situation doesn't
become too entertaining.
-Daniel

>
>
> >
> > ---
> >   drivers/gpu/drm/i915/i915_request.c | 12 +++++++++++-
> >   1 file changed, 11 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
> > index 79da5eca60af..4f189982f67e 100644
> > --- a/drivers/gpu/drm/i915/i915_request.c
> > +++ b/drivers/gpu/drm/i915/i915_request.c
> > @@ -1145,6 +1145,12 @@ __emit_semaphore_wait(struct i915_request *to,
> >   	return 0;
> >   }
> >
> > +static bool
> > +can_use_semaphore_wait(struct i915_request *to, struct i915_request *from)
> > +{
> > +	return to->engine->gt->ggtt == from->engine->gt->ggtt;
> > +}
> > +
> >   static int
> >   emit_semaphore_wait(struct i915_request *to,
> >   		    struct i915_request *from,
> > @@ -1153,6 +1159,9 @@ emit_semaphore_wait(struct i915_request *to,
> >   	const intel_engine_mask_t mask = READ_ONCE(from->engine)->mask;
> >   	struct i915_sw_fence *wait = &to->submit;
> >
> > +	if (!can_use_semaphore_wait(to, from))
> > +		goto await_fence;
> > +
> >   	if (!intel_context_use_semaphores(to->context))
> >   		goto await_fence;
> >
> > @@ -1256,7 +1265,8 @@ __i915_request_await_execution(struct i915_request *to,
> >   	 * immediate execution, and so we must wait until it reaches the
> >   	 * active slot.
> >   	 */
> > -	if (intel_engine_has_semaphores(to->engine) &&
> > +	if (can_use_semaphore_wait(to, from) &&
> > +	    intel_engine_has_semaphores(to->engine) &&
> >   	    !i915_request_has_initial_breadcrumb(to)) {
> >   		err = __emit_semaphore_wait(to, from, from->fence.seqno - 1);
> >   		if (err < 0)

--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
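
[Editorial sketch, not part of the thread] For readers outside the i915 tree, here is a minimal, self-contained illustration of the idea behind the patch's can_use_semaphore_wait() check. This is not i915 code; the toy_* types and the main() driver are hypothetical stand-ins that only model the request->engine->gt->ggtt chain so the GGTT comparison can be exercised in isolation: a hardware semaphore wait is only allowed when both engines share a GGTT, because only then is the signaller's HWSP semaphore target mappable by the waiter; otherwise the driver falls back to a software fence wait.

/*
 * Illustrative sketch only -- NOT i915 code. The toy_* structs stand in
 * for i915_ggtt, intel_gt, intel_engine_cs and i915_request.
 */
#include <stdbool.h>
#include <stdio.h>

struct toy_ggtt { int id; };                       /* stand-in for struct i915_ggtt */
struct toy_gt { struct toy_ggtt *ggtt; };          /* stand-in for struct intel_gt */
struct toy_engine { struct toy_gt *gt; };          /* stand-in for intel_engine_cs */
struct toy_request { struct toy_engine *engine; }; /* stand-in for i915_request */

/*
 * Same GGTT => the semaphore target in the signaller's HWSP is mappable
 * by the waiter, so a hardware semaphore wait is safe. Different GGTTs
 * (e.g. igfx vs. dgfx) => must fall back to a software fence wait.
 */
static bool can_use_semaphore_wait(const struct toy_request *to,
                                   const struct toy_request *from)
{
        return to->engine->gt->ggtt == from->engine->gt->ggtt;
}

int main(void)
{
        struct toy_ggtt igfx_ggtt = { 0 }, dgfx_ggtt = { 1 };
        struct toy_gt igfx_gt = { &igfx_ggtt }, dgfx_gt = { &dgfx_ggtt };
        struct toy_engine igfx_rcs = { &igfx_gt }, dgfx_rcs = { &dgfx_gt };
        struct toy_request to = { &igfx_rcs }, from = { &dgfx_rcs };

        /* Cross-device pair: must not emit a hardware semaphore wait. */
        printf("igfx waiting on dgfx: semaphore ok? %d\n",
               can_use_semaphore_wait(&to, &from));        /* prints 0 */

        /* Same-device pair: the semaphore wait is fine. */
        from.engine = &igfx_rcs;
        printf("igfx waiting on igfx: semaphore ok? %d\n",
               can_use_semaphore_wait(&to, &from));        /* prints 1 */

        return 0;
}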