Hi, On Fri, 2024-09-06 at 19:06 +0100, Tvrtko Ursulin wrote: > From: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxxx> > > In a recent conversation with Christian there was a thought that > drm_sched_entity_modify_sched() should start using the entity- > >rq_lock to be > safe against job submission and simultaneous priority changes. There are also FIXMEs in gpu_scheduler.h that might be related. > > The kerneldoc accompanying that function however is a bit unclear to > me. For > instance is amdgpu simply doing it wrongly by not serializing the two > in the > driver? Or is the comment referring to some other race condition than > which is > of concern in this series? > > To cut the long story short, first three patches try to fix this race > in three > places I *think* can manifest in different ways. > > Last patch is a trivial optimisation I spotted can be easily done. I took a look and at least to me it doesn't appear to be that trivial, mostly because it takes two locks. Would you mind branching that out as a separate patch so that the series would 100% address bugs? P. > > Cc: Christian König <christian.koenig@xxxxxxx> > Cc: Alex Deucher <alexander.deucher@xxxxxxx> > Cc: Luben Tuikov <ltuikov89@xxxxxxxxx> > Cc: Matthew Brost <matthew.brost@xxxxxxxxx> > > Tvrtko Ursulin (4): > drm/sched: Add locking to drm_sched_entity_modify_sched > drm/sched: Always wake up correct scheduler in > drm_sched_entity_push_job > drm/sched: Always increment correct scheduler score > drm/sched: Optimise drm_sched_entity_push_job > > drivers/gpu/drm/scheduler/sched_entity.c | 17 ++++++++++++----- > drivers/gpu/drm/scheduler/sched_main.c | 21 ++++++++++++++------- > include/drm/gpu_scheduler.h | 1 + > 3 files changed, 27 insertions(+), 12 deletions(-) >