On Thu, Feb 27, 2020 at 05:01:34PM -0800, Josh Don wrote:
> From: Paul Turner <pjt@xxxxxxxxxx>
>
> Currently, when updating the affinity of tasks via either cpuset.cpus
> or sched_setaffinity(), tasks not currently running within the newly
> specified mask will be arbitrarily assigned to the first CPU within the
> mask.
>
> This (particularly in the case that we are restricting masks) can
> result in many tasks being assigned to the first CPUs of their new
> masks.
>
> This:
> 1) Can induce scheduling delays until the load-balancer has a chance to
>    spread them between their new CPUs.
> 2) Can antagonize a poor load-balancer behavior where it has a
>    difficult time recognizing that a cross-socket imbalance has been
>    forced by an affinity mask.
>
> With this change, tasks are distributed ~evenly across the new mask. We
> may intentionally move tasks already running on a CPU within the mask to
> avoid edge cases in which a CPU is already overloaded (or would be
> assigned to more times than is desired).
>
> We specifically apply this behavior to the following cases:
> - modifying cpuset.cpus
> - when tasks join a cpuset
> - when modifying a task's affinity via sched_setaffinity(2)

Looks fine to me. Peter, what do you think?

Thanks.

--
tejun
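
For illustration only, the idea of spreading tasks ~evenly across a new mask (rather than piling them onto its first CPU) can be modeled in user space roughly as below. This is a sketch under assumptions and not the patch's code: the quoted description does not say how the distribution is implemented, the 64-bit bitmap stands in for a real cpumask, and next_allowed_cpu() and distribute_tasks() are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical user-space model, not kernel code: a CPU mask is a 64-bit
 * bitmap in which bit i set means CPU i is allowed.
 */

/*
 * Return the lowest allowed CPU strictly after 'prev', wrapping around to
 * the start of the mask. Pass prev = -1 to get the first allowed CPU.
 * 'prev' must be in the range [-1, 63].
 */
static int next_allowed_cpu(uint64_t mask, int prev)
{
    assert(mask != 0);
    for (int step = 1; step <= 64; step++) {
        int cpu = (prev + step) % 64;
        if (mask & (UINT64_C(1) << cpu))
            return cpu;
    }
    return -1; /* not reached when mask != 0 */
}

/*
 * Assign 'ntasks' tasks round-robin over the allowed CPUs, so that no CPU
 * receives more than one task beyond any other CPU in the mask.
 */
static void distribute_tasks(uint64_t mask, int ntasks, int *cpu_of_task)
{
    int cpu = -1;

    for (int i = 0; i < ntasks; i++) {
        cpu = next_allowed_cpu(mask, cpu);
        cpu_of_task[i] = cpu;
    }
}
```

With a mask allowing CPUs 2, 3 and 5 (0x2C) and seven tasks, this model places them on CPUs 2, 3, 5, 2, 3, 5, 2, whereas "first CPU in the mask" placement would put all seven on CPU 2.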