On Fri, 25 Jun 2021 16:55:12 +0100 Steven Price <steven.price@xxxxxxx> wrote:

> On 25/06/2021 14:33, Boris Brezillon wrote:
> > This is not yet needed because we let active jobs be killed during
> > the reset and we don't really bother making sure they can be
> > restarted. But once we start adding soft-stop support, controlling
> > when we deal with the remaining interrupts and making sure those
> > are handled before the reset is issued gets tricky if we keep job
> > interrupts active.
> > 
> > Let's prepare for that and mask+flush job IRQs before issuing a reset.
> > 
> > Signed-off-by: Boris Brezillon <boris.brezillon@xxxxxxxxxxxxx>
> > ---
> >  drivers/gpu/drm/panfrost/panfrost_job.c | 21 +++++++++++++++------
> >  1 file changed, 15 insertions(+), 6 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/panfrost/panfrost_job.c b/drivers/gpu/drm/panfrost/panfrost_job.c
> > index 88d34fd781e8..0566e2f7e84a 100644
> > --- a/drivers/gpu/drm/panfrost/panfrost_job.c
> > +++ b/drivers/gpu/drm/panfrost/panfrost_job.c
> > @@ -34,6 +34,7 @@ struct panfrost_queue_state {
> >  struct panfrost_job_slot {
> >  	struct panfrost_queue_state queue[NUM_JOB_SLOTS];
> >  	spinlock_t job_lock;
> > +	int irq;
> >  };
> >  
> >  static struct panfrost_job *
> > @@ -400,7 +401,15 @@ static void panfrost_reset(struct panfrost_device *pfdev,
> >  	if (bad)
> >  		drm_sched_increase_karma(bad);
> >  
> > -	spin_lock(&pfdev->js->job_lock);
> 
> I'm not sure it's safe to remove this lock as this protects the
> pfdev->jobs array: I can't see what would prevent panfrost_job_close()
> running at the same time without the lock. Am I missing something?

Ah, you're right, I'll add it back.

> > +	/* Mask job interrupts and synchronize to make sure we won't be
> > +	 * interrupted during our reset.
> > +	 */
> > +	job_write(pfdev, JOB_INT_MASK, 0);
> > +	synchronize_irq(pfdev->js->irq);
> > +
> > +	/* Schedulers are stopped and interrupts are masked+flushed, we
> > +	 * don't need to protect the 'evict unfinished jobs' loop with
> > +	 * the job_lock.
> > +	 */
> >  	for (i = 0; i < NUM_JOB_SLOTS; i++) {
> >  		if (pfdev->jobs[i]) {
> >  			pm_runtime_put_noidle(pfdev->dev);
> > @@ -408,7 +417,6 @@ static void panfrost_reset(struct panfrost_device *pfdev,
> >  			pfdev->jobs[i] = NULL;
> >  		}
> >  	}
> > -	spin_unlock(&pfdev->js->job_lock);
> >  
> >  	panfrost_device_reset(pfdev);
> >  
> > @@ -504,6 +512,7 @@ static void panfrost_job_handle_irq(struct panfrost_device *pfdev, u32 status)
> >  
> >  		job = pfdev->jobs[j];
> >  		/* Only NULL if job timeout occurred */
> > +		WARN_ON(!job);
> 
> Was this WARN_ON intentional?

Yes, now that we mask and synchronize the irq in the reset I don't see
any reason why we would end up with an event but no job to attach this
event to, but maybe I missed something.