On Wed, 8 Jan 2025 at 02:02, Danilo Krummrich <dakr@xxxxxxxxxx> wrote: > > On Tue, Jan 07, 2025 at 03:58:46PM +1000, Dave Airlie wrote: > > From: Dave Airlie <airlied@xxxxxxxxxx> > > > > If we have two nouveau controlled devices and one passes a dma-fence > > to the other, when we hit the sync path it can cause the second device > > to try and put a sync wait in it's pushbuf for the seqno of the context > > on the first device. > > > > Since fence contexts are vmm bound, check the if vmm's match between > > both users, this should ensure that fence seqnos don't get used wrongly > > on incorrect channels. > > The fence sequence number is global, i.e. per device, hence checking the vmm > context seems too restrictive. > > Wouldn't it be better to ensure that `prev->cli->drm == chan->cli->drm`? Can you prove that? I thought the same and I've gone around a few times yesterday/today and convinced myself what I wrote is right. dma_fence_init gets passed the seqno which comes from fctx->sequence, which is nouveau_fence_chan, which gets allocated for each channel. So we should hit this path if we have 2 userspace submits, one with say graphics, the one with copy engine contexts, otherwise we should wait on the CPU. > > drivers/gpu/drm/nouveau/nouveau_fence.c | 3 ++- > > 1 file changed, 2 insertions(+), 1 deletion(-) > > > > diff --git a/drivers/gpu/drm/nouveau/nouveau_fence.c b/drivers/gpu/drm/nouveau/nouveau_fence.c > > index ee5e9d40c166f..5743c82f4094b 100644 > > --- a/drivers/gpu/drm/nouveau/nouveau_fence.c > > +++ b/drivers/gpu/drm/nouveau/nouveau_fence.c > > @@ -370,7 +370,8 @@ nouveau_fence_sync(struct nouveau_bo *nvbo, struct nouveau_channel *chan, > > > > rcu_read_lock(); > > prev = rcu_dereference(f->channel); > > - if (prev && (prev == chan || > > + if (prev && (prev->vmm == chan->vmm) && > > + (prev == chan || > > Maybe better break it down a bit, e.g. > > bool local = prev && (prev->... == chan->...); > > if (local && ...) { > ... > } I'll update that once we resolve the above. Dave.