On Thu, Jun 15, 2023 at 4:47 AM Dave Airlie <airlied@xxxxxxxxx> wrote: > > From: Dave Airlie <airlied@xxxxxxxxxx> > > This seems to have existed for ever but is now more apparant after > 9bff18d13473a9fdf81d5158248472a9d8ecf2bd (drm/ttm: use per BO cleanup workers) > > My analysis: > two threads are running, > one in the irq signalling the fence, in dma_fence_signal_timestamp_locked, > it has done the DMA_FENCE_FLAG_SIGNALLED_BIT setting, but hasn't yet reached the callbacks. > > second thread in nouveau_cli_work_ready, where it sees the fence is signalled, so then puts the > fence, cleanups the object and frees the work item, which contains the callback. > > thread one goes again and tries to call the callback and causes the use-after-free. > > Proposed fix: > lock the fence signalled check in nouveau_cli_work_ready, so either the callbacks are done > or the memory is freed. > > Signed-off-by: Dave Airlie <airlied@xxxxxxxxxx> Fixes: 11e451e74050 ("drm/nouveau: remove fence wait code from deferred client work handler") Is I think the most reasonable commit to tag to ensure backports. > --- > drivers/gpu/drm/nouveau/nouveau_drm.c | 14 ++++++++++---- > 1 file changed, 10 insertions(+), 4 deletions(-) > > diff --git a/drivers/gpu/drm/nouveau/nouveau_drm.c b/drivers/gpu/drm/nouveau/nouveau_drm.c > index cc7c5b4a05fd..1a45be769848 100644 > --- a/drivers/gpu/drm/nouveau/nouveau_drm.c > +++ b/drivers/gpu/drm/nouveau/nouveau_drm.c > @@ -137,10 +137,16 @@ nouveau_name(struct drm_device *dev) > static inline bool > nouveau_cli_work_ready(struct dma_fence *fence) > { > - if (!dma_fence_is_signaled(fence)) > - return false; > - dma_fence_put(fence); > - return true; > + unsigned long flags; > + bool ret = true; > + spin_lock_irqsave(fence->lock, flags); > + if (!dma_fence_is_signaled_locked(fence)) > + ret = false; > + spin_unlock_irqrestore(fence->lock, flags); I agree with the code being always broken as `dma_fence_is_signaled_locked` specifies: "This function requires &dma_fence.lock to be held." > + > + if (ret == true) > + dma_fence_put(fence); > + return ret; > } > > static void > -- > 2.40.1 > regardless of the patch fixing the issue users have seen: Reviewed-by: Karol Herbst <kherbst@xxxxxxxxxx>