Re: [PATCH] drm/sched: fix the bug of time out calculation(v3)

Christian König <christian.koenig@xxxxxxx> · Fri, 27 Aug 2021 20:30:32 +0200

Yeah, that's what I meant with that the start of processing a job is a 
bit swampy defined.

Jobs overload, but we simply don't have another good indicator that a 
job started except that the previous one completed.

It's still better than starting the timer when pushing the job to the 
ring buffer, because that is completely off.

Christian.

Am 27.08.21 um 20:22 schrieb Andrey Grodzovsky:
As I mentioned to Monk before - what about cases such as in this test 
- 
https://gitlab.freedesktop.org/mesa/drm/-/commit/bc21168fa924d3fc4a000492e861f50a1a135b25 

Here you don't have serialized sequence where when jobs finishes 
processing and second starts, they execute together  concurrently - 
for those cases seems
to me restarting the timer for the second job from scratch will let it 
hang much longer then allowed by TO value.

Andrey

On 2021-08-27 10:29 a.m., Christian König wrote:
I don't think that makes sense.

See we don't want to start the time when the job is inserted into the 
ring buffer, but rather when it starts processing.

Starting processing is a bit swampy defined, but just starting the 
timer when the previous job completes should be fine enough.

Christian.

Am 27.08.21 um 15:57 schrieb Andrey Grodzovsky:
The TS represents the point in time when the job was inserted into 
the pending list.
I don't think it matters when it actually starts to be processed, 
what matters is when this job was inserted into pending list because 
right at that point you arm the TO timer (when no other is running 
already)
and so when the previous job completes and you cancel and rearm 
again you can use that TS from the next job in pending list to 
calculate how much time has actually left for it to run before TDR 
must be initiated
and not just give it again full TO value to run even if it has 
already been running for a while.

Also, i am not sure also about the assumption that what we measure 
is processing by HW, what we measure is from the moment it was 
scheduled to ring to the moment the job completed (EOP event). At 
least that what our TDR timer measures and so it makes sense to set 
the TS at this point.

Andrey

On 2021-08-27 3:20 a.m., Liu, Monk wrote:
[AMD Official Use Only]

what is that 'ts' representing for ? it looks to me the jiffies 
that it get scheduled to the ring,  but a job scheduled to the ring 
doesn't represent it's being processed by hw.

Thanks

------------------------------------------
Monk Liu | Cloud-GPU Core team
------------------------------------------

-----Original Message-----
From: Grodzovsky, Andrey <Andrey.Grodzovsky@xxxxxxx>
Sent: Friday, August 27, 2021 4:14 AM
To: Liu, Monk <Monk.Liu@xxxxxxx>; amd-gfx@xxxxxxxxxxxxxxxxxxxxx; 
Koenig, Christian <Christian.Koenig@xxxxxxx>
Cc: dri-devel@xxxxxxxxxxxxxxxxxxxxx
Subject: Re: [PATCH] drm/sched: fix the bug of time out 
calculation(v3)

Attached quick patch for per job TTL calculation to make more 
precises next timer expiration. It's on top of the patch in this 
thread. Let me know if this makes sense.

Andrey

On 2021-08-26 10:03 a.m., Andrey Grodzovsky wrote:
On 2021-08-26 12:55 a.m., Monk Liu wrote:
issue:
in cleanup_job the cancle_delayed_work will cancel a TO timer even
the its corresponding job is still running.

fix:
do not cancel the timer in cleanup_job, instead do the cancelling
only when the heading job is signaled, and if there is a "next" job
we start_timeout again.

v2:
further cleanup the logic, and do the TDR timer cancelling if the
signaled job is the last one in its scheduler.

v3:
change the issue description
remove the cancel_delayed_work in the begining of the cleanup_job
recover the implement of drm_sched_job_begin.

TODO:
1)introduce pause/resume scheduler in job_timeout to serial the
handling of scheduler and job_timeout.
2)drop the bad job's del and insert in scheduler due to above
serialization (no race issue anymore with the serialization)

Signed-off-by: Monk Liu <Monk.Liu@xxxxxxx>
---
   drivers/gpu/drm/scheduler/sched_main.c | 25
++++++++++---------------
   1 file changed, 10 insertions(+), 15 deletions(-)

diff --git a/drivers/gpu/drm/scheduler/sched_main.c
b/drivers/gpu/drm/scheduler/sched_main.c
index a2a9536..ecf8140 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -676,13 +676,7 @@ drm_sched_get_cleanup_job(struct
drm_gpu_scheduler *sched)
   {
       struct drm_sched_job *job, *next;
   -    /*
-     * Don't destroy jobs while the timeout worker is running OR
thread
-     * is being parked and hence assumed to not touch pending_list
-     */
-    if ((sched->timeout != MAX_SCHEDULE_TIMEOUT &&
-        !cancel_delayed_work(&sched->work_tdr)) ||
-        kthread_should_park())
+    if (kthread_should_park())
           return NULL;

I actually don't see why we need to keep the above, on the other side
(in drm_sched_stop) we won't touch the pending list anyway until 
sched
thread came to full stop (kthread_park). If you do see a reason why
this needed then a comment should be here i think.

Andrey


spin_lock(&sched->job_list_lock);
@@ -693,17 +687,21 @@ drm_sched_get_cleanup_job(struct
drm_gpu_scheduler *sched)
       if (job && dma_fence_is_signaled(&job->s_fence->finished)) {
           /* remove job from pending_list */
           list_del_init(&job->list);
+
+        /* cancel this job's TO timer */
+        cancel_delayed_work(&sched->work_tdr);
           /* make the scheduled timestamp more accurate */
           next = list_first_entry_or_null(&sched->pending_list,
                           typeof(*next), list);
-        if (next)
+
+        if (next) {
               next->s_fence->scheduled.timestamp =
job->s_fence->finished.timestamp;
-
+            /* start TO timer for next job */
+            drm_sched_start_timeout(sched);
+        }
       } else {
           job = NULL;
-        /* queue timeout for next job */
-        drm_sched_start_timeout(sched);
       }
         spin_unlock(&sched->job_list_lock);
@@ -791,11 +789,8 @@ static int drm_sched_main(void *param)
                         (entity = 
drm_sched_select_entity(sched))) ||
                        kthread_should_stop());
   -        if (cleanup_job) {
+        if (cleanup_job)
               sched->ops->free_job(cleanup_job);
-            /* queue timeout for next job */
-            drm_sched_start_timeout(sched);
-        }
             if (!entity)
               continue;