Please comment on the v1 of this patch, which I sent right after. Regards, Luben On 2022-10-21 21:09, Luben Tuikov wrote: > The currently default Round-Robin GPU scheduling can result in starvation > of entities which have a large number of jobs, over entities which have > a very small number of jobs (single digit). > > This can be illustrated in the following diagram, where jobs are > alphabetized to show their chronological order of arrival, where job A is > the oldest, B is the second oldest, and so on, to J, the most recent job to > arrive. > > ---> entities > j | H-F-----A--E--I-- > o | --G-----B-----J-- > b | --------C-------- > s\/ --------D-------- > > WLOG, asuming all jobs are "ready", then a R-R scheduling will execute them > in the following order (a slice off of the top of the entities' list), > > H, F, A, E, I, G, B, J, C, D. > > However, to mitigate job starvation, we'd rather execute C and D before E, > and so on, given, of course, that they're all ready to be executed. > > So, if all jobs are ready at this instant, the order of execution for this > and the next 9 instances of picking the next job to execute, should really > be, > > A, B, C, D, E, F, G, H, I, J, > > which is their chronological order. The only reason for this order to be > broken, is if an older job is not yet ready, but a younger job is ready, at > an instant of picking a new job to execute. For instance if job C wasn't > ready at time 2, but job D was ready, then we'd pick job D, like this: > > 0 +1 +2 ... > A, B, D, ... > > And from then on, C would be preferred before all other jobs, if it is ready > at the time when a new job for execution is picked. So, if C became ready > two steps later, the execution order would look like this: > > ......0 +1 +2 ... > A, B, D, E, C, F, G, H, I, J > > This is what the FIFO GPU scheduling algorithm achieves. It uses a > Red-Black tree to keep jobs sorted in chronological order, where picking > the oldest job is O(1) (we use the "cached" structure), and balancing the > tree is O(log n). IOW, it picks the *oldest ready* job to execute now. > > The implemntation is already in the kernel, and this commit only changes > the default GPU scheduling algorithm to use. > > This was tested and achieves about 1% faster performance over the Round > Robin algorithm. > > Cc: Christian König <christian.koenig@xxxxxxx> > Cc: Alex Deucher <Alexander.Deucher@xxxxxxx> > Cc: Direct Rendering Infrastructure - Development <dri-devel@xxxxxxxxxxxxxxxxxxxxx> > Signed-off-by: Luben Tuikov <luben.tuikov@xxxxxxx> > --- > drivers/gpu/drm/scheduler/sched_main.c | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c > index 2fab218d708279..d0ff9e11cb69fa 100644 > --- a/drivers/gpu/drm/scheduler/sched_main.c > +++ b/drivers/gpu/drm/scheduler/sched_main.c > @@ -62,13 +62,13 @@ > #define to_drm_sched_job(sched_job) \ > container_of((sched_job), struct drm_sched_job, queue_node) > > -int drm_sched_policy = DRM_SCHED_POLICY_RR; > +int drm_sched_policy = DRM_SCHED_POLICY_FIFO; > > /** > * DOC: sched_policy (int) > * Used to override default entities scheduling policy in a run queue. > */ > -MODULE_PARM_DESC(sched_policy, "Specify schedule policy for entities on a runqueue, " __stringify(DRM_SCHED_POLICY_RR) " = Round Robin (default), " __stringify(DRM_SCHED_POLICY_FIFO) " = use FIFO."); > +MODULE_PARM_DESC(sched_policy, "Specify the scheduling policy for entities on a run-queue, " __stringify(DRM_SCHED_POLICY_RR) " = Round Robin, " __stringify(DRM_SCHED_POLICY_FIFO) " = FIFO (default)."); > module_param_named(sched_policy, drm_sched_policy, int, 0444); > > static __always_inline bool drm_sched_entity_compare_before(struct rb_node *a, > > base-commit: 16d2a3f2ad1d2b95bf9122c910c63b0efe74179d