On Tue, Oct 05, 2021 at 03:05:25PM +0200, Thomas Hellström wrote:
> Hi, Tvrtko,
>
> On 10/5/21 13:31, Tvrtko Ursulin wrote:
> > From: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxx>
> >
> > In short this makes i915 work for hybrid setups (DRI_PRIME=1 with Mesa)
> > when rendering is done on Intel dgfx and scanout/composition on Intel
> > igfx.
> >
> > Before this patch the driver was not quite ready for that setup, mainly
> > because it was able to emit a semaphore wait between the two GPUs, which
> > results in deadlocks because the semaphore target location in the HWSP
> > is neither shared between the two, nor mapped in both GGTT spaces.
> >
> > To fix it the patch adds an additional check to a couple of relevant
> > code paths in order to prevent using semaphores for inter-engine
> > synchronisation when the relevant objects are not in the same GGTT
> > space.
> >
> > v2:
> >  * Avoid adding rq->i915. (Chris)
> >
> > v3:
> >  * Use GGTT which describes the limit more precisely.
> >
> > Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxx>
> > Cc: Daniel Vetter <daniel.vetter@xxxxxxxx>
> > Cc: Matthew Auld <matthew.auld@xxxxxxxxx>
> > Cc: Thomas Hellström <thomas.hellstrom@xxxxxxxxxxxxxxx>
>
> An IMO pretty important bugfix. I read up a bit on the previous
> discussion on this, and from what I understand the other two options
> were
>
> 1) Ripping out the semaphore code,
> 2) Considering dma-fences from other instances of the same driver as
>    foreign.
>
> For imported dma-bufs we do 2), but particularly with lmem and p2p
> that's a more straightforward decision.
>
> I don't think 1) is a reasonable approach to fix this bug (but perhaps
> as a general cleanup?), and for 2) yes, I guess we might end up doing
> that, unless we find some real benefits in treating
> same-driver-separate-device dma-fences as local, but for this
> particular bug, IMO this is a reasonable fix.

The foreign dma-fences have uapi impact, which Tvrtko shrugged off as
"it's a good idea", and no, it's really just not. So we still need to
do that properly.

> Reviewed-by: Thomas Hellström <thomas.hellstrom@xxxxxxxxxxxxxxx>

But I'm also ok with just merging this as-is so the situation doesn't
become too entertaining.
-Daniel

>
>
> >
> > ---
> >   drivers/gpu/drm/i915/i915_request.c | 12 +++++++++++-
> >   1 file changed, 11 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
> > index 79da5eca60af..4f189982f67e 100644
> > --- a/drivers/gpu/drm/i915/i915_request.c
> > +++ b/drivers/gpu/drm/i915/i915_request.c
> > @@ -1145,6 +1145,12 @@ __emit_semaphore_wait(struct i915_request *to,
> >   	return 0;
> >   }
> >
> > +static bool
> > +can_use_semaphore_wait(struct i915_request *to, struct i915_request *from)
> > +{
> > +	return to->engine->gt->ggtt == from->engine->gt->ggtt;
> > +}
> > +
> >   static int
> >   emit_semaphore_wait(struct i915_request *to,
> >   		    struct i915_request *from,
> > @@ -1153,6 +1159,9 @@ emit_semaphore_wait(struct i915_request *to,
> >   	const intel_engine_mask_t mask = READ_ONCE(from->engine)->mask;
> >   	struct i915_sw_fence *wait = &to->submit;
> >
> > +	if (!can_use_semaphore_wait(to, from))
> > +		goto await_fence;
> > +
> >   	if (!intel_context_use_semaphores(to->context))
> >   		goto await_fence;
> >
> > @@ -1256,7 +1265,8 @@ __i915_request_await_execution(struct i915_request *to,
> >   	 * immediate execution, and so we must wait until it reaches the
> >   	 * active slot.
> >   	 */
> > -	if (intel_engine_has_semaphores(to->engine) &&
> > +	if (can_use_semaphore_wait(to, from) &&
> > +	    intel_engine_has_semaphores(to->engine) &&
> >   	    !i915_request_has_initial_breadcrumb(to)) {
> >   		err = __emit_semaphore_wait(to, from, from->fence.seqno - 1);
> >   		if (err < 0)

--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
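
[Editorial sketch, not part of the thread] For readers outside the i915 tree, here is a minimal, self-contained illustration of the idea behind the patch's can_use_semaphore_wait() check. This is not i915 code; the toy_* types and the main() driver are hypothetical stand-ins that only model the request->engine->gt->ggtt chain so the GGTT comparison can be exercised in isolation: a hardware semaphore wait is only allowed when both engines share a GGTT, because only then is the signaller's HWSP semaphore target mappable by the waiter; otherwise the driver falls back to a software fence wait.

/*
 * Illustrative sketch only -- NOT i915 code. The toy_* structs stand in
 * for i915_ggtt, intel_gt, intel_engine_cs and i915_request.
 */
#include <stdbool.h>
#include <stdio.h>

struct toy_ggtt { int id; };                       /* stand-in for struct i915_ggtt */
struct toy_gt { struct toy_ggtt *ggtt; };          /* stand-in for struct intel_gt */
struct toy_engine { struct toy_gt *gt; };          /* stand-in for intel_engine_cs */
struct toy_request { struct toy_engine *engine; }; /* stand-in for i915_request */

/*
 * Same GGTT => the semaphore target in the signaller's HWSP is mappable
 * by the waiter, so a hardware semaphore wait is safe. Different GGTTs
 * (e.g. igfx vs. dgfx) => must fall back to a software fence wait.
 */
static bool can_use_semaphore_wait(const struct toy_request *to,
                                   const struct toy_request *from)
{
        return to->engine->gt->ggtt == from->engine->gt->ggtt;
}

int main(void)
{
        struct toy_ggtt igfx_ggtt = { 0 }, dgfx_ggtt = { 1 };
        struct toy_gt igfx_gt = { &igfx_ggtt }, dgfx_gt = { &dgfx_ggtt };
        struct toy_engine igfx_rcs = { &igfx_gt }, dgfx_rcs = { &dgfx_gt };
        struct toy_request to = { &igfx_rcs }, from = { &dgfx_rcs };

        /* Cross-device pair: must not emit a hardware semaphore wait. */
        printf("igfx waiting on dgfx: semaphore ok? %d\n",
               can_use_semaphore_wait(&to, &from));        /* prints 0 */

        /* Same-device pair: the semaphore wait is fine. */
        from.engine = &igfx_rcs;
        printf("igfx waiting on igfx: semaphore ok? %d\n",
               can_use_semaphore_wait(&to, &from));        /* prints 1 */

        return 0;
}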