On 21/06/2021 14:39, Boris Brezillon wrote: > panfrost_reset() does not directly signal fences, but > panfrost_scheduler_start() does, when calling drm_sched_start(). I have to admit to not fully understanding dma_fence_begin_signalling() - but I thought the idea was that it should have a relatively wide length to actually catch locking bugs. Just wrapping drm_sched_start() looks wrong: i.e. why isn't this code just contained in drm_sched_start()? The relevant section from the docs reads: > * * All code necessary to complete a &dma_fence must be annotated, from the > * point where a fence is accessible to other threads, to the point where > * dma_fence_signal() is called. Un-annotated code can contain deadlock issues, > * and due to the very strict rules and many corner cases it is infeasible to > * catch these just with review or normal stress testing So it makes sense that we annotate code from when the reset is started to after the signalling has happened. That way if there are any locks taken in the reset path which could be blocked waiting for any of the fences which might be signalled we get moaned at by lockdep. Steve > Signed-off-by: Boris Brezillon <boris.brezillon@xxxxxxxxxxxxx> > --- > drivers/gpu/drm/panfrost/panfrost_job.c | 7 +++---- > 1 file changed, 3 insertions(+), 4 deletions(-) > > diff --git a/drivers/gpu/drm/panfrost/panfrost_job.c b/drivers/gpu/drm/panfrost/panfrost_job.c > index 74b63e1ee6d9..cf6abe0fdf47 100644 > --- a/drivers/gpu/drm/panfrost/panfrost_job.c > +++ b/drivers/gpu/drm/panfrost/panfrost_job.c > @@ -414,6 +414,7 @@ static bool panfrost_scheduler_stop(struct panfrost_queue_state *queue, > static void panfrost_scheduler_start(struct panfrost_queue_state *queue) > { > enum panfrost_queue_status old_status; > + bool cookie; > > mutex_lock(&queue->lock); > old_status = atomic_xchg(&queue->status, > @@ -423,7 +424,9 @@ static void panfrost_scheduler_start(struct panfrost_queue_state *queue) > /* Restore the original timeout before starting the scheduler. */ > queue->sched.timeout = msecs_to_jiffies(JOB_TIMEOUT_MS); > drm_sched_resubmit_jobs(&queue->sched); > + cookie = dma_fence_begin_signalling(); > drm_sched_start(&queue->sched, true); > + dma_fence_end_signalling(cookie); > old_status = atomic_xchg(&queue->status, > PANFROST_QUEUE_STATUS_ACTIVE); > if (old_status == PANFROST_QUEUE_STATUS_FAULT_PENDING) > @@ -566,9 +569,7 @@ static void panfrost_reset(struct work_struct *work) > reset.work); > unsigned long flags; > unsigned int i; > - bool cookie; > > - cookie = dma_fence_begin_signalling(); > for (i = 0; i < NUM_JOB_SLOTS; i++) { > /* > * We want pending timeouts to be handled before we attempt > @@ -608,8 +609,6 @@ static void panfrost_reset(struct work_struct *work) > > for (i = 0; i < NUM_JOB_SLOTS; i++) > panfrost_scheduler_start(&pfdev->js->queue[i]); > - > - dma_fence_end_signalling(cookie); > } > > int panfrost_job_init(struct panfrost_device *pfdev) >