On 08.10.2018 at 20:01, Nayan Deshmukh wrote:
> Thanks for all the explanations. Looks like this part of the
> scheduler has a lot of bugs.
>
> On Tue, Oct 9, 2018 at 2:55 AM Christian König
> <ckoenig.leichtzumerken@xxxxxxxxx> wrote:
>> On 08.10.2018 at 19:40, Nayan Deshmukh wrote:
>>> Hi Christian,
>>>
>>> I am a bit confused by this patch. It would be better if you could
>>> explain what these 3 functions are supposed to do. From my
>>> understanding this is what they are supposed to do:
>>>
>>> 1. drm_sched_job_timedout: This function is called when a job has
>>> been executing for more than "timeout" seconds. It needs to
>>> identify the bad job and call the timedout_job function, which
>>> notifies the driver of that job so the GPU can be recovered to its
>>> original state.
>>>
>>> 2. drm_sched_hw_job_reset: stop the scheduler if it contains the
>>> bad job. ATM this function first removes all the callbacks and
>>> goes through the ring_mirror_list afterwards to find the bad job.
>>> I think it should do it in the opposite order. Or my understanding
>>> of this function is not fully correct.
>> That function actually has a couple of bugs itself, but that is
>> irrelevant to the current patch.
>>
>>> 3. drm_sched_job_recovery: recover jobs after a reset. It
>>> resubmits the jobs to the hardware queue, avoiding the guilty job.
>> That function is completely unrelated and only called after
>> recovering from a hard reset.
>>
>>> Some other questions and suggestions:
>>>
>>> 1. We could avoid the race condition altogether if we shifted the
>>> canceling and rescheduling of the timeout work item into
>>> drm_sched_process_job().
>> That won't work. drm_sched_process_job() is called from interrupt
>> context and can't sync with a timeout worker.
>>
>>> 2. How does removing and re-adding the callbacks help with the
>>> race condition? Moreover, the hardware might execute some jobs
>>> while we are in this function, leading to more races.
>> We need to make sure that the underlying hardware fence doesn't
>> signal and trigger new processing while we are about to call the
>> driver's timeout function to reset the hardware.
>>
>> This is done by removing the callbacks in reverse order (i.e.
>> newest to oldest).
>>
>> If we find that we can't remove a callback because the hardware has
>> actually continued and the fence has already signaled, we add the
>> callbacks back again in normal order (i.e. oldest to newest),
>> starting from the job which already signaled.
> Why do we need to call drm_sched_process_job() again for the last
> signaled job?

Oh, good point, that is indeed incorrect. The label should be after
the call to drm_sched_process_job(), because for that job we couldn't
remove the callback in the first place, so its completion is already
being processed and must not be processed again.

Going to fix that,
Christian.

>
> Regards,
> Nayan
>>> 3. What is the order in which the above 3 functions should be
>>> executed? I think the answer to this question might clear up a lot
>>> of my doubts.
>> If we can quickly recover from a problem, only
>> drm_sched_job_timedout() should be executed.
>>
>> The other two are for hard resets, where we need to stop multiple
>> scheduler instances and get them running again.
>>
>> That the karma handling is mixed into that is rather unfortunate
>> and actually quite buggy as well.
>>
>> I should probably also clean that up.
>>
>> Regards,
>> Christian.
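To make the fence juggling above concrete, the primitive for a single
job looks roughly like this. This is a minimal hypothetical sketch, not
code from the patch; handle_timeout() is a made-up placeholder for the
driver-side handling. The patch quoted below applies the same idea
across the whole ring_mirror_list, newest to oldest and back:

    struct drm_sched_fence *fence = job->s_fence;

    if (dma_fence_remove_callback(fence->parent, &fence->cb)) {
            /*
             * Completion processing for this job is now blocked: the
             * hardware fence may still signal, but
             * drm_sched_process_job() cannot run for it until the
             * callback is re-added.
             */
            handle_timeout();   /* hypothetical placeholder */

            /*
             * dma_fence_add_callback() returns -ENOENT if the fence
             * signaled while it was detached; the callback was never
             * invoked in that case, so process the completion
             * directly.
             */
            if (dma_fence_add_callback(fence->parent, &fence->cb,
                                       drm_sched_process_job))
                    drm_sched_process_job(fence->parent, &fence->cb);
    } else {
            /*
             * The fence had already signaled, so
             * drm_sched_process_job() ran (or is running) via the
             * callback. Calling it again would process the job twice;
             * this is the corner case Nayan spotted above.
             */
    }

The reverse order in the removal pass matters because a ring completes
its jobs in order: the first callback that cannot be removed marks the
boundary, and every job older than it has signaled as well.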
>>
>>> Regards,
>>> Nayan
>>>
>>> On Mon, Oct 8, 2018 at 8:36 PM Christian König
>>> <ckoenig.leichtzumerken@xxxxxxxxx> wrote:
>>>> We need to make sure that we don't race between job completion and
>>>> timeout.
>>>>
>>>> Signed-off-by: Christian König <christian.koenig@xxxxxxx>
>>>> ---
>>>>  drivers/gpu/drm/scheduler/sched_main.c | 28 +++++++++++++++++++++++++++-
>>>>  1 file changed, 27 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
>>>> index bd7d11c47202..ad3c57c9fd21 100644
>>>> --- a/drivers/gpu/drm/scheduler/sched_main.c
>>>> +++ b/drivers/gpu/drm/scheduler/sched_main.c
>>>> @@ -248,14 +248,40 @@ static void drm_sched_job_begin(struct drm_sched_job *s_job)
>>>>  static void drm_sched_job_timedout(struct work_struct *work)
>>>>  {
>>>>         struct drm_gpu_scheduler *sched;
>>>> +       struct drm_sched_fence *fence;
>>>>         struct drm_sched_job *job;
>>>> +       int r;
>>>>
>>>>         sched = container_of(work, struct drm_gpu_scheduler, work_tdr.work);
>>>> +
>>>> +       spin_lock(&sched->job_list_lock);
>>>> +       list_for_each_entry_reverse(job, &sched->ring_mirror_list, node) {
>>>> +               fence = job->s_fence;
>>>> +               if (!dma_fence_remove_callback(fence->parent, &fence->cb))
>>>> +                       goto already_signaled;
>>>> +       }
>>>> +
>>>>         job = list_first_entry_or_null(&sched->ring_mirror_list,
>>>>                                        struct drm_sched_job, node);
>>>> +       spin_unlock(&sched->job_list_lock);
>>>>
>>>>         if (job)
>>>> -               job->sched->ops->timedout_job(job);
>>>> +               sched->ops->timedout_job(job);
>>>> +
>>>> +       spin_lock(&sched->job_list_lock);
>>>> +       list_for_each_entry(job, &sched->ring_mirror_list, node) {
>>>> +               fence = job->s_fence;
>>>> +               if (!fence->parent || !list_empty(&fence->cb.node))
>>>> +                       continue;
>>>> +
>>>> +               r = dma_fence_add_callback(fence->parent, &fence->cb,
>>>> +                                          drm_sched_process_job);
>>>> +               if (r)
>>>> +already_signaled:
>>>> +                       drm_sched_process_job(fence->parent, &fence->cb);
>>>> +
>>>> +       }
>>>> +       spin_unlock(&sched->job_list_lock);
>>>>  }
>>>>
>>>>  /**
>>>> --
>>>> 2.14.1
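Following up on the correction above ("the label should be after the
call"), the re-add loop at the end of drm_sched_job_timedout() would
presumably end up looking something like this. This is a sketch of the
intended fix, not a posted patch:

    spin_lock(&sched->job_list_lock);
    list_for_each_entry(job, &sched->ring_mirror_list, node) {
            fence = job->s_fence;
            if (!fence->parent || !list_empty(&fence->cb.node))
                    continue;

            r = dma_fence_add_callback(fence->parent, &fence->cb,
                                       drm_sched_process_job);
            if (r)
                    drm_sched_process_job(fence->parent, &fence->cb);
already_signaled:
            ;   /* goto target from the removal loop above */
    }
    spin_unlock(&sched->job_list_lock);

With the label below the call, jumping in from the removal loop (which
still holds job_list_lock on that path) skips the job whose callback
could not be removed (its completion is already being processed by that
callback) and simply continues re-adding callbacks for the newer jobs,
instead of processing that job a second time.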