On Wednesday, 03.07.2019 at 14:23 +0000, Grodzovsky, Andrey wrote:
> On 7/3/19 6:28 AM, Lucas Stach wrote:
> > drm_sched_entity_kill_jobs_cb() is called right from the last scheduled
> > job finished fence signaling. As this might happen from IRQ context we
> > now end up calling the GPU driver free_job callback in IRQ context, while
> > all other paths call it from normal process context.
> >
> > Etnaviv in particular calls core kernel functions that are only valid to
> > be called from process context when freeing the job. Other drivers might
> > have similar issues, but I did not validate this. Fix this by punting
> > the cleanup work into a work item, so the driver expectations are met.
> >
> > Signed-off-by: Lucas Stach <l.stach@xxxxxxxxxxxxxx>
> > ---
> >  drivers/gpu/drm/scheduler/sched_entity.c | 28 +++++++++++++++++-----------
> >  1 file changed, 17 insertions(+), 11 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/scheduler/sched_entity.c b/drivers/gpu/drm/scheduler/sched_entity.c
> > index 35ddbec1375a..ba4eb66784b9 100644
> > --- a/drivers/gpu/drm/scheduler/sched_entity.c
> > +++ b/drivers/gpu/drm/scheduler/sched_entity.c
> > @@ -202,23 +202,23 @@ long drm_sched_entity_flush(struct drm_sched_entity *entity, long timeout)
> >  }
> >  EXPORT_SYMBOL(drm_sched_entity_flush);
> >  
> > -/**
> > - * drm_sched_entity_kill_jobs - helper for drm_sched_entity_kill_jobs
> > - *
> > - * @f: signaled fence
> > - * @cb: our callback structure
> > - *
> > - * Signal the scheduler finished fence when the entity in question is killed.
> > - */
> > +static void drm_sched_entity_kill_work(struct work_struct *work)
> > +{
> > +	struct drm_sched_job *job = container_of(work, struct drm_sched_job,
> > +						 finish_work);
> > +
> > +	drm_sched_fence_finished(job->s_fence);
> > +	WARN_ON(job->s_fence->parent);
> > +	job->sched->ops->free_job(job);
> > +}
> > +
> >  static void drm_sched_entity_kill_jobs_cb(struct dma_fence *f,
> >  					  struct dma_fence_cb *cb)
> >  {
> >  	struct drm_sched_job *job = container_of(cb, struct drm_sched_job,
> >  						 finish_cb);
> >  
> > -	drm_sched_fence_finished(job->s_fence);
> > -	WARN_ON(job->s_fence->parent);
> > -	job->sched->ops->free_job(job);
> > +	schedule_work(&job->finish_work);
> >  }
> >  
> >  /**
> > @@ -240,6 +240,12 @@ static void drm_sched_entity_kill_jobs(struct drm_sched_entity *entity)
> >  		drm_sched_fence_scheduled(s_fence);
> >  		dma_fence_set_error(&s_fence->finished, -ESRCH);
> >  
> > +		/*
> > +		 * Replace regular finish work function with one that just
> > +		 * kills the job.
> > +		 */
> > +		job->finish_work.func = drm_sched_entity_kill_work;
>
> I rechecked the latest code and finish_work was removed in ffae3e5
> 'drm/scheduler: rework job destruction'

Ah, thanks. Seems this patch was stuck for a bit too long in my outgoing
queue. I've just checked the commit you pointed out, it should also fix
the issue that this patch was trying to fix.

Regards,
Lucas
_______________________________________________
dri-devel mailing list
dri-devel@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/dri-devel