Re: [PATCH] drm/sched: Only start TDR in drm_sched_job_begin on first job

On Thu, Jul 25, 2024 at 02:50:54PM +0000, Matthew Brost wrote:
> On Thu, Jul 25, 2024 at 09:42:08AM +0200, Christian König wrote:
> > Am 25.07.24 um 01:44 schrieb Matthew Brost:
> > > Only start the TDR in drm_sched_job_begin when the first job is added to
> > > the pending list; if the pending list is non-empty, the TDR has already
> > > been started. Restarting the TDR is problematic because it extends the TDR
> > > period for an already running job, potentially delaying dma-fence
> > > signaling for a very long time under a continuous stream of jobs.
> > 
> > Mhm, that should be unnecessary. drm_sched_start_timeout() should only start
> > the timeout, but never re-start it.
> > 
> 
> That function only checks that the pending list is not empty, so it does
> indeed restart it. That is the correct behavior for some of the callers, e.g.
> drm_sched_tdr_queue_imm and drm_sched_get_finished_job.
> 
> IMO best to fix this here.
> 
> Also FWIW on Xe I wrote a test which submitted a never-ending spinner,
> then submitted a job every second on the same queue in a loop, and
> observed that the spinner did not get canceled for a long time. After this
> patch, the spinner correctly timed out after 5 seconds (our default TDR
> period).
> 
> Matt

Ping Christian. Any response to the above?

Pretty clear problem, would like to resolve.

Matt

> 
> > Could be that this isn't working properly.
> > 
> > Regards,
> > Christian.
> > 
> > > 
> > > Cc: Christian König <christian.koenig@xxxxxxx>
> > > Signed-off-by: Matthew Brost <matthew.brost@xxxxxxxxx>
> > > ---
> > >   drivers/gpu/drm/scheduler/sched_main.c | 3 ++-
> > >   1 file changed, 2 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
> > > index 7e90c9f95611..feeeb9dbeb86 100644
> > > --- a/drivers/gpu/drm/scheduler/sched_main.c
> > > +++ b/drivers/gpu/drm/scheduler/sched_main.c
> > > @@ -540,7 +540,8 @@ static void drm_sched_job_begin(struct drm_sched_job *s_job)
> > >   	spin_lock(&sched->job_list_lock);
> > >   	list_add_tail(&s_job->list, &sched->pending_list);
> > > -	drm_sched_start_timeout(sched);
> > > +	if (list_is_singular(&sched->pending_list))
> > > +		drm_sched_start_timeout(sched);
> > >   	spin_unlock(&sched->job_list_lock);
> > >   }
> > 


