On Tue Aug 17, 2021 at 03:43:58PM +0200, Christian König wrote:
> Am 17.08.21 um 15:37 schrieb Andrey Grodzovsky:
> > On 2021-08-17 12:28 a.m., Jingwen Chen wrote:
> > > [Why]
> > > For a bailing job, this commit will delete it from the pending list, so the
> > > bailing job will never have a chance to be resubmitted, even in advance
> > > tdr mode.
> > >
> > > [How]
> > > After embedding the hw_fence into amdgpu_job, the race condition that
> > > this commit tries to work around is completely solved. So revert this
> > > commit.
> > >
> > > This reverts commit 135517d3565b48f4def3b1b82008bc17eb5d1c90.
> >
> > Can you elaborate please how this solves the race?
> > As far as I see, with this patch reverted, in drm_sched_job_timedout
> > you get a pointer to the next job to process in the timed out handler;
> > immediately after, this job actually finishes and its fence is signaled,
> > which in turn triggers drm_sched_get_cleanup_job, which fetches this job
> > and returns to

Hi Andrey,

If drm_sched_job_timedout is triggered first, drm_sched_get_cleanup_job
will return NULL while the timeout worker is running, according to this
code:

	if ((sched->timeout != MAX_SCHEDULE_TIMEOUT &&
	     !cancel_delayed_work(&sched->work_tdr)) ||
	    kthread_should_park())
		return NULL;

But yes, a dma_fence_get(job->s_fence->parent) is needed before
processing the timedout job. When the bad job is signaled just before
processing, amdgpu_fence_free can be triggered by the dma_fence_put() of
the HW fence. And if I'm understanding this race condition correctly,
the spin_lock is still needed here, to prevent drm_sched_get_cleanup_job
from taking the spin_lock first and then entering the tdr work.

> > drm_sched_main, which in turn calls the free_job
> > callback->...->amdgpu_fence_free, which frees the job from the HW
> > dma_fence release callback. After that you proceed with a freed job
> > in the timed out handler.
> > If you could take the HW fence reference from drm_sched_job_timedout
> > before starting processing then yes, I think it would work.
>
> Yes, precisely that's what I had in mind as well, and it seems to be
> missing from this patch.
>
> Regards,
> Christian.
>
> > Andrey
> >
> > > Signed-off-by: Jingwen Chen <Jingwen.Chen2@xxxxxxx>
> > > ---
> > >  drivers/gpu/drm/scheduler/sched_main.c | 27 --------------------------
> > >  1 file changed, 27 deletions(-)
> > >
> > > diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
> > > index a2a953693b45..31d1176d939f 100644
> > > --- a/drivers/gpu/drm/scheduler/sched_main.c
> > > +++ b/drivers/gpu/drm/scheduler/sched_main.c
> > > @@ -317,21 +317,10 @@ static void drm_sched_job_timedout(struct work_struct *work)
> > >  	enum drm_gpu_sched_stat status = DRM_GPU_SCHED_STAT_NOMINAL;
> > >
> > >  	sched = container_of(work, struct drm_gpu_scheduler, work_tdr.work);
> > > -
> > > -	/* Protects against concurrent deletion in drm_sched_get_cleanup_job */
> > > -	spin_lock(&sched->job_list_lock);
> > >  	job = list_first_entry_or_null(&sched->pending_list,
> > >  				       struct drm_sched_job, list);
> > >
> > >  	if (job) {
> > > -		/*
> > > -		 * Remove the bad job so it cannot be freed by concurrent
> > > -		 * drm_sched_cleanup_jobs. It will be reinserted back after
> > > -		 * sched->thread is parked at which point it's safe.
> > > -		 */
> > > -		list_del_init(&job->list);
> > > -		spin_unlock(&sched->job_list_lock);
> > > -
> > >  		status = job->sched->ops->timedout_job(job);
> > >
> > >  		/*
> > > @@ -342,8 +331,6 @@ static void drm_sched_job_timedout(struct work_struct *work)
> > >  			job->sched->ops->free_job(job);
> > >  			sched->free_guilty = false;
> > >  		}
> > > -	} else {
> > > -		spin_unlock(&sched->job_list_lock);
> > >  	}
> > >
> > >  	if (status != DRM_GPU_SCHED_STAT_ENODEV) {
> > > @@ -392,20 +379,6 @@ void drm_sched_stop(struct drm_gpu_scheduler *sched, struct drm_sched_job *bad)
> > >
> > >  	kthread_park(sched->thread);
> > >
> > > -	/*
> > > -	 * Reinsert back the bad job here - now it's safe as
> > > -	 * drm_sched_get_cleanup_job cannot race against us and release the
> > > -	 * bad job at this point - we parked (waited for) any in progress
> > > -	 * (earlier) cleanups and drm_sched_get_cleanup_job will not be called
> > > -	 * now until the scheduler thread is unparked.
> > > -	 */
> > > -	if (bad && bad->sched == sched)
> > > -		/*
> > > -		 * Add at the head of the queue to reflect it was the earliest
> > > -		 * job extracted.
> > > -		 */
> > > -		list_add(&bad->list, &sched->pending_list);
> > > -
> > >  	/*
> > >  	 * Iterate the job list from later to earlier one and either deactive
> > >  	 * their HW callbacks or remove them from pending list if they already
>