On 7/3/19 6:28 AM, Lucas Stach wrote:
> drm_sched_entity_kill_jobs_cb() is called right from the last scheduled
> job finished fence signaling. As this might happen from IRQ context we
> now end up calling the GPU driver free_job callback in IRQ context, while
> all other paths call it from normal process context.
>
> Etnaviv in particular calls core kernel functions that are only valid to
> be called from process context when freeing the job. Other drivers might
> have similar issues, but I did not validate this. Fix this by punting
> the cleanup work into a work item, so the driver expectations are met.
>
> Signed-off-by: Lucas Stach <l.stach@xxxxxxxxxxxxxx>
> ---
>  drivers/gpu/drm/scheduler/sched_entity.c | 28 ++++++++++++++----------
>  1 file changed, 17 insertions(+), 11 deletions(-)
>
> diff --git a/drivers/gpu/drm/scheduler/sched_entity.c b/drivers/gpu/drm/scheduler/sched_entity.c
> index 35ddbec1375a..ba4eb66784b9 100644
> --- a/drivers/gpu/drm/scheduler/sched_entity.c
> +++ b/drivers/gpu/drm/scheduler/sched_entity.c
> @@ -202,23 +202,23 @@ long drm_sched_entity_flush(struct drm_sched_entity *entity, long timeout)
>  }
>  EXPORT_SYMBOL(drm_sched_entity_flush);
>
> -/**
> - * drm_sched_entity_kill_jobs - helper for drm_sched_entity_kill_jobs
> - *
> - * @f: signaled fence
> - * @cb: our callback structure
> - *
> - * Signal the scheduler finished fence when the entity in question is killed.
> - */
> +static void drm_sched_entity_kill_work(struct work_struct *work)
> +{
> +	struct drm_sched_job *job = container_of(work, struct drm_sched_job,
> +						 finish_work);
> +
> +	drm_sched_fence_finished(job->s_fence);
> +	WARN_ON(job->s_fence->parent);
> +	job->sched->ops->free_job(job);
> +}
> +
>  static void drm_sched_entity_kill_jobs_cb(struct dma_fence *f,
>  					  struct dma_fence_cb *cb)
>  {
>  	struct drm_sched_job *job = container_of(cb, struct drm_sched_job,
>  						 finish_cb);
>
> -	drm_sched_fence_finished(job->s_fence);
> -	WARN_ON(job->s_fence->parent);
> -	job->sched->ops->free_job(job);
> +	schedule_work(&job->finish_work);
>  }
>
>  /**
> @@ -240,6 +240,12 @@ static void drm_sched_entity_kill_jobs(struct drm_sched_entity *entity)
>  		drm_sched_fence_scheduled(s_fence);
>  		dma_fence_set_error(&s_fence->finished, -ESRCH);
>
> +		/*
> +		 * Replace regular finish work function with one that just
> +		 * kills the job.
> +		 */
> +		job->finish_work.func = drm_sched_entity_kill_work;

I rechecked the latest code and finish_work was removed in ffae3e5
'drm/scheduler: rework job destruction'

Andrey

> +
>  		/*
>  		 * When pipe is hanged by older entity, new entity might
>  		 * not even have chance to submit it's first job to HW
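
For readers unfamiliar with the pattern the quoted patch relies on (a
callback that may run in IRQ context only queues a work item, and the
actual cleanup runs later from process context), here is a rough
userspace sketch. It is not kernel code: struct work_item, struct job,
queue_work_item(), run_pending_work() and the in_irq_context flag are
all hypothetical stand-ins invented for illustration, and the assert()
only loosely models the "process context only" requirement of a driver's
free_job callback.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's struct work_struct. */
struct work_item {
	void (*func)(struct work_item *work);
	struct work_item *next;
};

/* Hypothetical stand-in for struct drm_sched_job. */
struct job {
	struct work_item finish_work;
	bool freed;
};

/* Simplified container_of(), specialised for struct job. */
#define job_from_work(w) \
	((struct job *)((char *)(w) - offsetof(struct job, finish_work)))

/* Models "we are in IRQ context"; set by whoever signals the fence. */
bool in_irq_context;

/* FIFO of pending work items, drained later from process context. */
static struct work_item *queue_head, *queue_tail;

/* Loose analogue of schedule_work(): only links the item into the queue. */
void queue_work_item(struct work_item *work)
{
	work->next = NULL;
	if (queue_tail)
		queue_tail->next = work;
	else
		queue_head = work;
	queue_tail = work;
}

/* Loose analogue of a free_job callback: must not run in IRQ context. */
void job_free_work(struct work_item *work)
{
	struct job *job = job_from_work(work);

	assert(!in_irq_context);
	job->freed = true;
}

/* Loose analogue of the fence callback: may run in IRQ context, so it defers. */
void fence_signaled_cb(struct job *job)
{
	queue_work_item(&job->finish_work);
}

/* Loose analogue of a workqueue worker: runs every queued item in order. */
void run_pending_work(void)
{
	while (queue_head) {
		struct work_item *work = queue_head;

		queue_head = work->next;
		if (!queue_head)
			queue_tail = NULL;
		work->func(work);
	}
}
```

In this sketch, calling fence_signaled_cb() while in_irq_context is set
leaves the job untouched; it is only freed once run_pending_work() is
called with the flag cleared. Calling job_free_work() directly from the
callback would instead trip the assert, which is roughly the situation
the patch description attributes to the unpatched code.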