Hello Hillf,

Thank you for looking into this patch.

On 9/11/2022 1:35 PM, Hillf Danton wrote:
> On 10 Sep 2022 16:23:26 +0530 K Prateek Nayak <kprateek.nayak@xxxxxxx> wrote:
>> - Load balancing considerations
>>
>>   If we have more tasks than the CPUs in the MC Domain, ignore the hint
>>   set by the user. This prevents losing the consolidation done at the
>>   wakeup time.
>
> It is waste of time to cure ten pains with a pill in five days a week.

This patch mainly tries to stop tasks with a wakeup hint from being
pulled apart by the load-balancer, only to move back to the same LLC on
a subsequent wakeup, which leads to ping-ponging and a lot of wasted
migrations.

This is not a complete solution in any form and was a stop-gap for this
experiment. I bet there are better alternatives to handle hints in the
load-balancing path. I'm all ears for any suggestions from the
community :)

>
>> @@ -7977,6 +7980,21 @@ int can_migrate_task(struct task_struct *p, struct lb_env *env)
>>  		return 0;
>>  	}
>>
>> +	/*
>> +	 * Hints are followed only if the MC Domain is still ideal
>> +	 * for the task.
>> +	 */
>> +	if (!env->ignore_hint) {
>> +		/*
>> +		 * Only consider the hints from the wakeup path to maintain
>> +		 * data locality.
>> +		 */
>> +		if (READ_ONCE(p->hint) &
>> +		    (PR_SCHED_HINT_WAKE_AFFINE | PR_SCHED_HINT_WAKE_HOLD))
>> +			return 0;
>> +	}
>
> The wake hints are not honored during lb without PR_SCHED_HINT_IGNORE_LB set
> then the scheduler works as you hint.

Are you suggesting we leave it to the user to control whether the
load-balancer can spread the tasks apart, even if hints are set, via
another userspace hint "PR_SCHED_HINT_IGNORE_LB"? I had not considered
it before, but it may benefit some workloads.

Again, this API is not the final API in any form, but we could have a
knob as you suggested that can be set for a class of workloads which may
benefit from this behavior. (A rough sketch of what that could look like
is appended after my signature below.)

>
> Hillf

--
Thanks and Regards,
Prateek
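
P.S. To make that concrete, here is a rough, untested sketch of how the
check in can_migrate_task() quoted above could be gated on the
PR_SCHED_HINT_IGNORE_LB bit you mentioned. That bit does not exist in
the current series; the name is taken from your reply, and the local
variable and its type are assumptions, so treat this as illustration
only rather than a tested change:

	if (!env->ignore_hint) {
		unsigned int hint = READ_ONCE(p->hint);

		/*
		 * Hypothetical: PR_SCHED_HINT_IGNORE_LB lets the user
		 * explicitly allow the load-balancer to override the
		 * wakeup hints and spread the hinted tasks apart.
		 */
		if (!(hint & PR_SCHED_HINT_IGNORE_LB) &&
		    (hint & (PR_SCHED_HINT_WAKE_AFFINE | PR_SCHED_HINT_WAKE_HOLD)))
			return 0;

		/* Otherwise fall through to the normal migration checks. */
	}

With something along these lines, workloads that value consolidation
keep the current hold-back behavior by default, while workloads that
prefer to be spread out can opt out of it with a single extra hint bit.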