On Fri, Sep 13, 2024 at 09:53:25AM -0700, Rob Clark wrote:
> From: Rob Clark <robdclark@xxxxxxxxxxxx>
>
> Fixes a race condition reported here: https://github.com/AsahiLinux/linux/issues/309#issuecomment-2238968609

Good catch! Please add a 'Closes' tag with this link.

>
> The whole premise of lockless access to a single-producer-single-
> consumer queue is that there is just a single producer and single
> consumer. That means we can't call drm_sched_can_queue() (which is
> about queueing more work to the hw, not to the spsc queue) from
> anywhere other than the consumer (wq).
>
> This call in the producer is just an optimization to avoid scheduling
> the consuming worker if it cannot yet queue more work to the hw. It
> is safe to drop this optimization to avoid the race condition.
>
> Suggested-by: Asahi Lina <lina@xxxxxxxxxxxxx>
> Fixes: a78422e9dff3 ("drm/sched: implement dynamic job-flow control")

You may want to explicitly CC stable.

> Signed-off-by: Rob Clark <robdclark@xxxxxxxxxxxx>
> ---
>  drivers/gpu/drm/scheduler/sched_main.c | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
> index ab53ab486fe6..1af1dbe757d5 100644
> --- a/drivers/gpu/drm/scheduler/sched_main.c
> +++ b/drivers/gpu/drm/scheduler/sched_main.c
> @@ -1020,8 +1020,7 @@ EXPORT_SYMBOL(drm_sched_job_cleanup);
>  void drm_sched_wakeup(struct drm_gpu_scheduler *sched,
>  		      struct drm_sched_entity *entity)

Please also remove the entity parameter. For the other refactoring, I agree it
should be in a different patch.

With that,
Reviewed-by: Danilo Krummrich <dakr@xxxxxxxxxx>

Thanks for fixing this.

- Danilo

>  {
> -	if (drm_sched_can_queue(sched, entity))
> -		drm_sched_run_job_queue(sched);
> +	drm_sched_run_job_queue(sched);
>  }
>
>  /**
> --
> 2.46.0
>