On 4/16/19 10:58 AM, Grodzovsky, Andrey wrote:
> On 4/16/19 10:43 AM, Koenig, Christian wrote:
>> Am 16.04.19 um 16:36 schrieb Grodzovsky, Andrey:
>>> On 4/16/19 5:47 AM, Christian König wrote:
>>>> Am 15.04.19 um 23:17 schrieb Eric Anholt:
>>>>> Andrey Grodzovsky <andrey.grodzovsky@xxxxxxx> writes:
>>>>>
>>>>>> From: Christian König <christian.koenig@xxxxxxx>
>>>>>>
>>>>>> We now destroy finished jobs from the worker thread to make sure that
>>>>>> we never destroy a job currently in timeout processing.
>>>>>> By this we avoid holding lock around ring mirror list in drm_sched_stop
>>>>>> which should solve a deadlock reported by a user.
>>>>>>
>>>>>> v2: Remove unused variable.
>>>>>>
>>>>>> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109692
>>>>>>
>>>>>> Signed-off-by: Christian König <christian.koenig@xxxxxxx>
>>>>>> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@xxxxxxx>
>>>>>> ---
>>>>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c |  17 ++--
>>>>>>   drivers/gpu/drm/etnaviv/etnaviv_dump.c     |   4 -
>>>>>>   drivers/gpu/drm/etnaviv/etnaviv_sched.c    |   9 +-
>>>>>>   drivers/gpu/drm/scheduler/sched_main.c     | 138 +++++++++++++++++------------
>>>>>>   drivers/gpu/drm/v3d/v3d_sched.c            |   9 +-
>>>>> Missing corresponding panfrost and lima updates. You should probably
>>>>> pull in drm-misc for hacking on the scheduler.
>>>>>
>>>>>> diff --git a/drivers/gpu/drm/v3d/v3d_sched.c b/drivers/gpu/drm/v3d/v3d_sched.c
>>>>>> index ce7c737b..8efb091 100644
>>>>>> --- a/drivers/gpu/drm/v3d/v3d_sched.c
>>>>>> +++ b/drivers/gpu/drm/v3d/v3d_sched.c
>>>>>> @@ -232,11 +232,18 @@ v3d_gpu_reset_for_timeout(struct v3d_dev *v3d, struct drm_sched_job *sched_job)
>>>>>>       /* block scheduler */
>>>>>>       for (q = 0; q < V3D_MAX_QUEUES; q++)
>>>>>> -        drm_sched_stop(&v3d->queue[q].sched);
>>>>>> +        drm_sched_stop(&v3d->queue[q].sched, sched_job);
>>>>>>       if(sched_job)
>>>>>>           drm_sched_increase_karma(sched_job);
>>>>>> +    /*
>>>>>> +     * Guilty job did complete and hence needs to be manually removed
>>>>>> +     * See drm_sched_stop doc.
>>>>>> +     */
>>>>>> +    if (list_empty(&sched_job->node))
>>>>>> +        sched_job->sched->ops->free_job(sched_job);
>>>>> If the if (sched_job) is necessary up above, then this should clearly be
>>>>> under it.
>>>>>
>>>>> But, can we please have a core scheduler thing we call here instead of
>>>>> drivers all replicating it?
>>>> Yeah, that's also something I noted before.
>>>>
>>>> The essential problem is that we remove finished jobs from the mirror list
>>>> and so need to destruct them, because we otherwise leak them.
>>>>
>>>> An alternative approach here would be to keep the jobs on the ring mirror
>>>> list, but not submit them again.
>>>>
>>>> Regards,
>>>> Christian.
>>> I really prefer to avoid this; it means adding an extra flag to sched_job
>>> to check in each iteration of the ring mirror list.
>> Mhm, why actually? We just need to check if the scheduler fence is signaled.
> OK, I see it's equivalent, but this is still an extra check for all the
> iterations.
>
>>> What about changing the
>>> signature of drm_sched_backend_ops.timedout_job to return drm_sched_job*
>>> instead of void? This way we can return the guilty job back from the
>>> driver specific handler to the generic drm_sched_job_timedout and
>>> release it there.
>> Well, the timeout handler already has the job, so returning it doesn't
>> make much sense.
>>
>> The problem is rather that the timeout handler doesn't know whether it should
>> destroy the job or not.
>
> But the driver specific handler does, and actually returning back either
> the pointer to the job or NULL will give an indication of that. We could
> even return a bool.
>
> Andrey

Thinking a bit more about this - the way the check is done now,
"if (list_empty(&sched_job->node)) then free the sched_job", actually makes
it possible to just move it as is from the driver specific callbacks into
drm_sched_job_timedout without any other changes.

Andrey

>
>> Christian.
>>
>>> Andrey
>>>
>>>>>> +
>>>>>>       /* get the GPU back into the init state */
>>>>>>       v3d_reset(v3d);
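For illustration, a rough sketch of that last idea - hoisting the
list_empty() cleanup out of the driver timeout callbacks into the core
timeout worker. Only sched_job->node, ops->timedout_job() and
ops->free_job() are taken from the patch and discussion above; the overall
shape of drm_sched_job_timedout() below is assumed, not copied from the
tree, so treat it as a sketch rather than the actual change:

/*
 * Sketch only: let the core, not every driver, free a guilty job that
 * already completed. The surrounding structure of drm_sched_job_timedout()
 * is assumed here.
 */
static void drm_sched_job_timedout(struct work_struct *work)
{
	struct drm_gpu_scheduler *sched =
		container_of(work, struct drm_gpu_scheduler, work_tdr.work);
	struct drm_sched_job *job;

	/* The oldest job on the ring mirror list is the one that timed out. */
	job = list_first_entry_or_null(&sched->ring_mirror_list,
				       struct drm_sched_job, node);
	if (!job)
		return;

	/* Let the driver handle reset/recovery exactly as before. */
	job->sched->ops->timedout_job(job);

	/*
	 * If the guilty job completed during timeout processing it has
	 * already been taken off the mirror list, so the worker thread
	 * will never free it; do it here instead - the same check the
	 * v3d hunk above performs in the driver.
	 */
	if (list_empty(&job->node))
		job->sched->ops->free_job(job);
}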