Hi Christian,
On 1/14/20 5:01 PM, Christian König wrote:
>> Before this patch:
>>
>> sched_name   number of times it got scheduled
>> ==========   ================================
>> sdma0        314
>> sdma1        32
>> comp_1.0.0   56
>> comp_1.1.0   0
>> comp_1.1.1   0
>> comp_1.2.0   0
>> comp_1.2.1   0
>> comp_1.3.0   0
>> comp_1.3.1   0
>> After this patch:
>>
>> sched_name   number of times it got scheduled
>> ==========   ================================
>> sdma1        243
>> sdma0        164
>> comp_1.0.1   14
>> comp_1.1.0   11
>> comp_1.1.1   10
>> comp_1.2.0   15
>> comp_1.2.1   14
>> comp_1.3.0   10
>> comp_1.3.1   10
> Well, that is still rather nice to have. Why does that happen?
>
> Christian.
I think I know why it happens. At init, every entity's rq gets assigned
to sched_list[0]. I added some prints to check what we compare in
drm_sched_entity_get_free_sched.

It turns out that most of the time the comparison sees only zero values
(num_jobs(0) < min_jobs(0)), so most of the time the 1st rq (sdma0,
comp_1.0.0) was picked by drm_sched_entity_get_free_sched.
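
For reference, the selection loop is essentially the following (a
trimmed-down sketch of drm_sched_entity_get_free_sched; the sched->ready
check is omitted here):

static struct drm_sched_rq *
drm_sched_entity_get_free_sched(struct drm_sched_entity *entity)
{
        struct drm_sched_rq *rq = NULL;
        unsigned int min_jobs = UINT_MAX, num_jobs;
        int i;

        for (i = 0; i < entity->num_rq_list; ++i) {
                num_jobs = atomic_read(&entity->rq_list[i]->sched->num_jobs);
                /*
                 * Right after init every sched has num_jobs == 0, so the
                 * strict '<' only succeeds for the first candidate and
                 * rq_list[0] (sdma0, comp_1.0.0) wins almost every time.
                 */
                if (num_jobs < min_jobs) {
                        min_jobs = num_jobs;
                        rq = entity->rq_list[i];
                }
        }

        return rq;
}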
This patch was also not correct: it had an extra atomic_inc(num_jobs) in
drm_sched_job_init. I think that extra increment added a bit of
randomness, which is what helped the job distribution.
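
To make that concrete, v1 effectively contained an extra increment like
this (an illustrative hunk, not the actual patch; upstream already bumps
num_jobs once per job in drm_sched_entity_push_job):

--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ int drm_sched_job_init(...)
         job->id = atomic64_inc_return(&sched->job_id_count);
+        /* extra, wrong: the job is counted again when it is pushed to
+         * the entity, so num_jobs got a timing-dependent skew that
+         * happened to spread jobs across the rings. */
+        atomic_inc(&sched->num_jobs);

         INIT_LIST_HEAD(&job->node);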
I've updated my previous RFC patch, which uses the time consumed by each
sched for load balancing, with a twist: it ignores the previously
scheduled sched/rq. Let me know what you think.
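
The idea is roughly the following (a sketch only; consumed_time as a
per-sched runtime accumulator and the exact skip condition are my
shorthand for what the RFC does, not final code):

static struct drm_sched_rq *
drm_sched_entity_get_idle_sched(struct drm_sched_entity *entity)
{
        struct drm_sched_rq *rq = NULL;
        u64 min_time = U64_MAX;
        int i;

        for (i = 0; i < entity->num_rq_list; ++i) {
                struct drm_gpu_scheduler *sched = entity->rq_list[i]->sched;
                u64 t;

                /*
                 * The twist: ignore the rq the entity is currently on, so
                 * consecutive picks don't keep piling onto the same ring.
                 */
                if (entity->rq && entity->rq->sched == sched)
                        continue;

                /* consumed_time: total job runtime seen by this sched
                 * (hypothetical accumulator, updated on job completion). */
                t = atomic64_read(&sched->consumed_time);
                if (t < min_time) {
                        min_time = t;
                        rq = entity->rq_list[i];
                }
        }

        return rq ? rq : entity->rq;
}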
Regards,
Nirmoy