On Fri, Sep 13, 2024 at 01:23:01PM -0700, Rob Clark wrote:
> From: Rob Clark <robdclark@xxxxxxxxxxxx>
>
> Fixes a race condition reported here: https://github.com/AsahiLinux/linux/issues/309#issuecomment-2238968609
>
> The whole premise of lockless access to a single-producer-single-
> consumer queue is that there is just a single producer and single
> consumer. That means we can't call drm_sched_can_queue() (which is
> about queueing more work to the hw, not to the spsc queue) from
> anywhere other than the consumer (wq).
>
> This call in the producer is just an optimization to avoid scheduling
> the consuming worker if it cannot yet queue more work to the hw. It
> is safe to drop this optimization to avoid the race condition.
>
> Suggested-by: Asahi Lina <lina@xxxxxxxxxxxxx>
> Fixes: a78422e9dff3 ("drm/sched: implement dynamic job-flow control")
> Closes: https://github.com/AsahiLinux/linux/issues/309
> Cc: stable@xxxxxxxxxxxxxxx
> Signed-off-by: Rob Clark <robdclark@xxxxxxxxxxxx>
> ---
>  drivers/gpu/drm/scheduler/sched_entity.c | 4 ++--
>  drivers/gpu/drm/scheduler/sched_main.c   | 7 ++-----
>  include/drm/gpu_scheduler.h              | 2 +-
>  3 files changed, 5 insertions(+), 8 deletions(-)

Tested for several hours with CONFIG_PREEMPT=y and KASAN, using a
workload similar to the one in the GitHub issue, without any reports
or oopses. Feel free to add

Tested-by: Janne Grunau <j@xxxxxxxxxx>

thanks,
Janne