On Mon, Feb 05, 2024 at 11:00:04PM +0100, Danilo Krummrich wrote:
> On 2/5/24 22:08, Dave Airlie wrote:
> > On Tue, 6 Feb 2024 at 02:22, Danilo Krummrich <dakr@xxxxxxxxxx> wrote:
> > >
> > > On 1/29/24 02:50, Dave Airlie wrote:
> > > > From: Dave Airlie <airlied@xxxxxxxxxx>
> > > >
> > > > This should break the deadlock between the fctx lock and the irq lock.
> > > >
> > > > This offloads the processing of the work from the irq into a workqueue.
> > > >
> > > > Signed-off-by: Dave Airlie <airlied@xxxxxxxxxx>
> > >
> > > Nouveau's scheduler uses a dedicated wq, hence from this perspective it's
> > > safe deferring fence signalling to the kernel global wq. However, I wonder
> > > if we could create deadlocks by building dependency chains into other
> > > drivers / kernel code that, by chance, makes use of the kernel global wq as
> > > well.
> > >
> > > Admittedly, even if we did, it's gonna be extremely unlikely given that
> > > WQ_MAX_ACTIVE == 512. But maybe it'd be safer to use a dedicated wq.
> > >
> > > Also, do we need to CC stable?
> >
> > I pushed this to Linus at the end of last week, since the hangs in 6.7
> > take out the complete system and I wanted it in stable.
> >
> > It might be safer to use a dedicated wq, is the concern someone is
> > waiting on a fence in a workqueue somewhere else so we will never
> > signal it?
>
> Yes, if some other work is waiting for this fence (or something else in the
> same dependency chain) to signal, it can prevent executing the work that
> signals this fence, in case both are scheduled on the same wq. As mentioned,
> with the kernel global wq this would be rather unlikely to lead to an actual
> stall with WQ_MAX_ACTIVE == 512, but formally the race condition exists. I
> guess a malicious attacker could try to intentionally push jobs directly or
> indirectly depending on this fence to a driver which queues them up on a
> scheduler using the kernel global wq.

I think if you add dma_fence_signalling annotations (aside, there's some
patch from iirc Thomas Hellstrom to improve them and cut down on some false
positives, but I lost track) then you won't get any splats, because the wq
subsystem assumes that WQ_MAX_ACTIVE is close enough to infinity to not
matter. I'm not sure we should care differently, but I guess it'd be good to
annotate it all in case the wq subsystem's idea of how real such deadlocks
are changes.

Also Tejun is on a mission to get rid of all the global wq flushes, so there
should also be no source of deadlocks from that kind of cross-driver
dependency. Or at least there shouldn't be in the future, I'm not sure it
all landed.
-Sima
--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
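
For illustration, a minimal sketch of the dedicated-wq variant being
discussed: allocate a driver-private workqueue and have the irq handler only
queue the signalling work on it, instead of either signalling under the irq
lock or relying on the kernel global wq. The names (nouveau_fence_wq,
nouveau_fctx, nouveau_fence_work_handler, nouveau_fence_irq_kick) are made up
here for the example and are not taken from the actual patch.

	#include <linux/container_of.h>
	#include <linux/workqueue.h>

	/* driver-private wq, so fence signalling never shares a pool with
	 * unrelated work that might (transitively) wait on these fences */
	static struct workqueue_struct *nouveau_fence_wq;

	struct nouveau_fctx {
		struct work_struct uevent_work;
		/* ... fence context state ... */
	};

	static void nouveau_fence_work_handler(struct work_struct *w)
	{
		struct nouveau_fctx *fctx =
			container_of(w, struct nouveau_fctx, uevent_work);

		/* walk the pending fences and signal the completed ones,
		 * outside the irq lock */
		(void)fctx;
	}

	static void nouveau_fctx_init(struct nouveau_fctx *fctx)
	{
		INIT_WORK(&fctx->uevent_work, nouveau_fence_work_handler);
	}

	static int nouveau_fence_wq_init(void)
	{
		/* WQ_HIGHPRI keeps signalling latency low */
		nouveau_fence_wq = alloc_workqueue("nouveau_fence",
						   WQ_HIGHPRI, 0);
		if (!nouveau_fence_wq)
			return -ENOMEM;
		return 0;
	}

	/* called from the irq handler instead of signalling directly */
	static void nouveau_fence_irq_kick(struct nouveau_fctx *fctx)
	{
		queue_work(nouveau_fence_wq, &fctx->uevent_work);
	}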
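
And a minimal sketch of the dma_fence signalling annotations Sima refers to:
dma_fence_begin_signalling()/dma_fence_end_signalling() mark the critical
section so lockdep can complain if the signalling path ever waits on
something that may itself depend on this fence. The work-handler name and
its body are again hypothetical, assuming a handler along the lines above.

	#include <linux/dma-fence.h>
	#include <linux/workqueue.h>

	static void fence_signalling_work_fn(struct work_struct *w)
	{
		/* everything between begin/end must not block on anything
		 * that could depend on the fences signalled here */
		bool cookie = dma_fence_begin_signalling();

		/* ... dma_fence_signal() the completed fences ... */

		dma_fence_end_signalling(cookie);
	}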