On Mon, 2024-12-30 at 16:52 +0000, Tvrtko Ursulin wrote: > From: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxxx> > > <tldr> > Replacing FIFO with a flavour of deadline driven scheduling and > removing round- > robin. Connecting the scheduler with dma-fence deadlines. First draft > and > testing by different drivers and feedback would be nice. I was only > able to test > it with amdgpu. Other drivers may not even compile. (on my machine I could build them all) > </tldr> > > If I remember correctly Christian mentioned recently (give or take) > that maybe > round-robin could be removed. That got me thinking how and what could > be > improved and simplified. So I played a bit in the scheduler code and > came up > with something which appears to not crash at least. Whether or not > there are > significant advantages apart from maybe code consolidation and > reduction is the > main thing to be determined. Hi, so first of all: Happy new year and thx for putting in all that effort :) I gave the series a first look; Since this is an RFC, let's abstain from doing deeper reviews of the exact code for now. > > One big question is whether round-robin can really be removed. Does > anyone use > it, rely on it, or what are even use cases where it is much better > than FIFO. So AFAICS Round Robin is not used anymore by anyone. And my understanding indeed is, too, that there is not really any use-case where one would like anything except for FIFO. Looking at 977d97f18b5b ("drm/scheduler: Set the FIFO scheduling policy as the default"), it seems to me that RR just was easy to implement and it had the disadvantage of systems under high load cause the oldest job to be starved to death, which is why FIFO was introduced. So my guess would be that RR just is a relict. If we agree on that, then we could remove RR in any case, and the subsequent question would be whether FIFO should be replaced with deadline (or: if there should be FIFO *and* deadline?), wouldn't it? > > See "drm/sched: Add deadline policy" commit message for a short > description on > what flavour of deadline scheduling it is. But in essence it should a > more fair > FIFO where higher priority can not forever starve lower priorities. See my answer on that patch. As you can imagine I'm wondering if that "better FIFO" would be worth it considering that we are running into a certain risk of regressing stuff through this rework. > > "drm/sched: Connect with dma-fence deadlines" wires up dma-fence > deadlines to > the scheduler because it is easy and makes logical sense with this. > And I > noticed userspace already uses it so why not wire it up fully. Userspace uses the dma-fence deadlines you mean? Do you have a pointer for us? > > Otherwise the series is a bit of progression from consolidating RR > into FIFO > code paths and going from there to deadline and then to a change in > how > dependencies are handled. And code simplification to 1:1 run queue to > scheduler > relationship, because deadline does not need per priority run queues. > > There is quite a bit of code to go throught here so I think it could > be even > better if other drivers could give it a spin as is and see if some > improvements > can be detected. Or at least no regressions. I hope I can dive deeper into the Nouveau side soon. > > Cc: Christian König <christian.koenig@xxxxxxx> > Cc: Danilo Krummrich <dakr@xxxxxxxxxx> > Cc: Matthew Brost <matthew.brost@xxxxxxxxx> > Cc: Philipp Stanner <pstanner@xxxxxxxxxx> > > Tvrtko Ursulin (14): > drm/sched: Delete unused update_job_credits > drm/sched: Remove idle entity from tree > drm/sched: Implement RR via FIFO > drm/sched: Consolidate entity run queue management > drm/sched: Move run queue related code into a separate file > drm/sched: Ignore own fence earlier > drm/sched: Resolve same scheduler dependencies earlier > drm/sched: Add deadline policy > drm/sched: Remove FIFO and RR and simplify to a single run queue > drm/sched: Queue all free credits in one worker invocation > drm/sched: Connect with dma-fence deadlines > drm/sched: Embed run queue singleton into the scheduler > dma-fence: Add helper for custom fence context when merging fences > drm/sched: Resolve all job dependencies in one go It seems to me that this series is a "port RR and FIFO to deadline"- series with some additional patches that could be branched out and be reviewed through a separate series? Patch 1 ("Delete unused...") for example. Other candidates are Patch 5 ("Move run queue related..."), 13 ("Add helper for..."). A few patches might be mergeable even if the main idea with deadline doesn't get approved, that's why I'm suggesting that. Philipp > > drivers/dma-buf/dma-fence-unwrap.c | 8 +- > drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 6 +- > drivers/gpu/drm/amd/amdgpu/amdgpu_job.c | 27 +- > drivers/gpu/drm/amd/amdgpu/amdgpu_job.h | 5 +- > drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h | 8 +- > drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c | 8 +- > drivers/gpu/drm/amd/amdgpu/amdgpu_xcp.c | 8 +- > drivers/gpu/drm/scheduler/Makefile | 2 +- > drivers/gpu/drm/scheduler/sched_entity.c | 316 ++++++----- > drivers/gpu/drm/scheduler/sched_fence.c | 5 +- > drivers/gpu/drm/scheduler/sched_main.c | 587 +++++------------- > -- > drivers/gpu/drm/scheduler/sched_rq.c | 199 +++++++ > include/drm/gpu_scheduler.h | 74 ++- > include/linux/dma-fence-unwrap.h | 31 +- > 14 files changed, 606 insertions(+), 678 deletions(-) > create mode 100644 drivers/gpu/drm/scheduler/sched_rq.c >