On 03.08.2018 at 16:09, Lucas Stach wrote:
> Hi Christian,
>
> On Friday, 03.08.2018 at 15:51 +0200, Christian König wrote:
>> Hi Lucas,
>>
>> thanks a lot for taking care of that, but there is one thing you have
>> missed:
>>
>> It is perfectly possible that the job is the last one on the list, and
>> list_next_entry() doesn't test for that, i.e. it never returns NULL.
> Urgh, right. For some reason I was thinking I would get the same
> sched_job in that case again, thus the "next != s_job" in the condition
> below. But of course the next pointer then points to the list head in
> the scheduler. Must be the temperature in here causing bitflips in my
> brain. ;)
>
> Updated patch coming up in a few minutes.
>
>> BTW: There are also quite a lot of other things we could optimize here.
>> So if you need something to do, just ping me :)
> Hehe, I think we are all in the same boat here: too much stuff on the
> plate.

Yeah, of course :)

Well, what I wanted to note is that you could turn the list into a ring
buffer; that might make the whole handling a lot easier.

Cheers,
Christian.

>
> Regards,
> Lucas
>
>> Cheers,
>> Christian.
>>
>> On 03.08.2018 at 15:18, Lucas Stach wrote:
>>> drm_sched_job_finish() is a work item scheduled for each finished job
>>> on an unbound system workqueue. This means the workers can execute out
>>> of order with regard to the real hardware job completions.
>>>
>>> If this happens, queueing a timeout worker for the first job on the
>>> ring mirror list is wrong, as this may be a job which has already
>>> finished executing. Fix this by reorganizing the code to always queue
>>> the worker for the next job on the list, if that job hasn't finished
>>> yet. This is robust against a potential reordering of the finish
>>> workers.
>>>
>>> Also move out the timeout worker cancelling, so that we don't need to
>>> take the job list lock twice. As a small optimization, list_del is used
>>> to remove the job from the ring mirror list, as there is no need to
>>> reinit the list head in the job we are about to free.
>>>
>>> Signed-off-by: Lucas Stach <l.stach at pengutronix.de>
>>> ---
>>>  drivers/gpu/drm/scheduler/gpu_scheduler.c | 20 +++++++++-----------
>>>  1 file changed, 9 insertions(+), 11 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/scheduler/gpu_scheduler.c b/drivers/gpu/drm/scheduler/gpu_scheduler.c
>>> index 44d480768dfe..0be2859d7b80 100644
>>> --- a/drivers/gpu/drm/scheduler/gpu_scheduler.c
>>> +++ b/drivers/gpu/drm/scheduler/gpu_scheduler.c
>>> @@ -452,24 +452,22 @@ static void drm_sched_job_finish(struct work_struct *work)
>>>                                                 finish_work);
>>>     struct drm_gpu_scheduler *sched = s_job->sched;
>>>
>>> -   /* remove job from ring_mirror_list */
>>> +   if (sched->timeout != MAX_SCHEDULE_TIMEOUT)
>>> +           cancel_delayed_work_sync(&s_job->work_tdr);
>>> +
>>>     spin_lock(&sched->job_list_lock);
>>> -   list_del_init(&s_job->node);
>>>     if (sched->timeout != MAX_SCHEDULE_TIMEOUT) {
>>> -           struct drm_sched_job *next;
>>> -
>>> -           spin_unlock(&sched->job_list_lock);
>>> -           cancel_delayed_work_sync(&s_job->work_tdr);
>>> -           spin_lock(&sched->job_list_lock);
>>> +           struct drm_sched_job *next = list_next_entry(s_job, node);
>>>
>>>             /* queue TDR for next job */
>>> -           next = list_first_entry_or_null(&sched->ring_mirror_list,
>>> -                                           struct drm_sched_job, node);
>>> -
>>> -           if (next)
>>> +           if (next != s_job &&
>>> +               !dma_fence_is_signaled(&s_job->s_fence->finished))
>>>                     schedule_delayed_work(&next->work_tdr, sched->timeout);
>>>     }
>>> +   /* remove job from ring_mirror_list */
>>> +   list_del(&s_job->node);
>>>     spin_unlock(&sched->job_list_lock);
>>> +
>>>     dma_fence_put(&s_job->s_fence->finished);
>>>     sched->ops->free_job(s_job);
>>>  }
>>
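
For reference, a minimal sketch of how the last-entry case Christian points
out could be handled on top of the patch above. This is an illustration only,
not Lucas' updated patch: it assumes list_is_last() is used to detect the end
of the ring mirror list (list_next_entry() of the last element yields the list
head, never NULL), and it checks the next job's finished fence, following the
wording of the commit message:

    /* Illustration only: queue the TDR for the job after s_job, but only
     * if s_job is not the last entry on the ring mirror list and that
     * next job hasn't finished yet.
     */
    if (sched->timeout != MAX_SCHEDULE_TIMEOUT &&
        !list_is_last(&s_job->node, &sched->ring_mirror_list)) {
            struct drm_sched_job *next = list_next_entry(s_job, node);

            if (!dma_fence_is_signaled(&next->s_fence->finished))
                    schedule_delayed_work(&next->work_tdr, sched->timeout);
    }

With list_is_last() the "next != s_job" test becomes unnecessary, since
list_next_entry() of the last entry points at the list head embedded in the
scheduler rather than back at s_job.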