Re: [PATCH 3/6] sched_ext: idle: Introduce the concept of allowed CPUs

On Mon, Mar 10, 2025 at 06:07:21AM -1000, Tejun Heo wrote:
> Hello,
> 
> On Sun, Mar 09, 2025 at 04:39:40PM +0100, Andrea Righi wrote:
> > > Would just using a pre-allocated cpumask to do pre-and on @cpus_allowed
> > > work? This won't only be used for topology support (e.g. soft partitioning
> > > in scx_layered and scx_mitosis may want to use multi-topology-unit spanning
> > > subsets) and I'm not sure assuming and optimizing for that is a good idea
> > > for generic API.
> > 
> > We can pre-allocate two additional (per-cpu) cpumasks to do:
> >  - cpumask_and(numa_cpus, numa_span(cpu), cpus_allowed)
> >  - cpumask_and(llc_cpus, llc_span(cpu), cpus_allowed)
> > 
> > And update/use them only when it's needed. In this way the API would be
> > generic without making any implicit assumption about @cpus_allowed.
> 
> I'm not quite following why two masks would be necessary. The user is
> providing two masks and and'ing those two masks result in a single
> cpus_allowed mask which can then be passed down to the existing pick
> functions, no?

When you say the user is providing two masks, you mean p->cpus_ptr
and @cpus_allowed, right? Or am I missing something?

So, internally we have three levels of cpumasks, used in this order:
 1) p->cpus_ptr & cpus_allowed & llc_span(prev_cpu)
 2) p->cpus_ptr & cpus_allowed & numa_span(prev_cpu)
 3) p->cpus_ptr & cpus_allowed
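
For illustration only, a rough userspace sketch of the three-level fallback
listed above. It is not the kernel code: plain 64-bit words stand in for
cpumasks, the 16-CPU topology is invented, and llc_span(), numa_span(),
first_cpu() and pick_idle_cpu() are toy helpers defined here, not the
sched_ext functions of the same or similar name.

```c
#include <assert.h>
#include <stdint.h>

/* Toy topology (assumption): 16 CPUs, 4 CPUs per LLC, 8 CPUs per NUMA node. */
#define NR_CPUS 16

/* Toy stand-ins: CPUs sharing an LLC / a NUMA node with @cpu. */
static uint64_t llc_span(int cpu)  { return 0xfULL  << (cpu / 4 * 4); }
static uint64_t numa_span(int cpu) { return 0xffULL << (cpu / 8 * 8); }

/* Lowest CPU set in @mask, or -1 if the mask is empty. */
static int first_cpu(uint64_t mask)
{
	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		if (mask & (1ULL << cpu))
			return cpu;
	return -1;
}

/*
 * Try the three levels in order, narrowest first:
 *   1) cpus_ptr & cpus_allowed & llc_span(prev_cpu)
 *   2) cpus_ptr & cpus_allowed & numa_span(prev_cpu)
 *   3) cpus_ptr & cpus_allowed
 * @idle is the set of currently idle CPUs.
 */
static int pick_idle_cpu(uint64_t cpus_ptr, uint64_t cpus_allowed,
			 uint64_t idle, int prev_cpu)
{
	uint64_t base = cpus_ptr & cpus_allowed;
	int cpu;

	assert(prev_cpu >= 0 && prev_cpu < NR_CPUS);

	cpu = first_cpu(idle & base & llc_span(prev_cpu));
	if (cpu >= 0)
		return cpu;

	cpu = first_cpu(idle & base & numa_span(prev_cpu));
	if (cpu >= 0)
		return cpu;

	return first_cpu(idle & base);
}
```

With cpus_allowed covering CPUs 8-15 and prev_cpu = 9, an idle CPU in the
8-11 range is preferred; if those are all busy the search widens to the
rest of the node, and only then to whatever else is allowed.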

The current logic (without @cpus_allowed) applies the LLC and NUMA
optimizations only to tasks that can run on all CPUs (p->cpus_ptr == all),
so that it can use llc_span(prev_cpu) and numa_span(prev_cpu) directly and
avoid doing extra "and" operations internally.

With @cpus_allowed this optimization doesn't work anymore, and we can't
just re-apply the current logic to "p->cpus_ptr & cpus_allowed": that mask
is usually not the full set of CPUs, so the LLC and NUMA cpumasks would end
up being ignored.

Maybe we could use a single pre-allocated temporary cpumask and do the
"and" at each step when it's needed, instead of using two separate cpumasks
to evaluate "cpus_allowed & llc_span(prev_cpu)" and "cpus_allowed &
numa_span(prev_cpu)". Is this what you mean?
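
A minimal userspace sketch of the single-temporary-mask idea, under stated
assumptions: struct toy_cpumask and the toy_*() helpers are hypothetical
stand-ins written for this example (toy_and() only mimics the convention of
cpumask_and(), which returns false when the result is empty), and @allowed
is assumed to already hold "p->cpus_ptr & cpus_allowed". The same scratch
mask @tmp is overwritten at each step instead of keeping one mask per level.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_NR_CPUS 128
#define TOY_WORDS   (TOY_NR_CPUS / 64)

/* Hypothetical multi-word bitmap standing in for a cpumask. */
struct toy_cpumask { uint64_t bits[TOY_WORDS]; };

/* Set CPUs lo..hi (inclusive) in @m. */
static void toy_set_range(struct toy_cpumask *m, int lo, int hi)
{
	assert(lo >= 0 && lo <= hi && hi < TOY_NR_CPUS);
	for (int cpu = lo; cpu <= hi; cpu++)
		m->bits[cpu / 64] |= 1ULL << (cpu % 64);
}

/* dst = a & b; returns true if the result is non-empty. dst may alias a or b. */
static bool toy_and(struct toy_cpumask *dst, const struct toy_cpumask *a,
		    const struct toy_cpumask *b)
{
	uint64_t any = 0;

	for (int i = 0; i < TOY_WORDS; i++) {
		dst->bits[i] = a->bits[i] & b->bits[i];
		any |= dst->bits[i];
	}
	return any != 0;
}

/* Lowest CPU set in @m, or -1 if empty. */
static int toy_first(const struct toy_cpumask *m)
{
	for (int cpu = 0; cpu < TOY_NR_CPUS; cpu++)
		if (m->bits[cpu / 64] & (1ULL << (cpu % 64)))
			return cpu;
	return -1;
}

/*
 * Walk LLC span, then NUMA span, then the plain allowed set, recomputing
 * the intersection into the one scratch mask @tmp at every step.
 */
static int toy_pick(const struct toy_cpumask *allowed,
		    const struct toy_cpumask *llc,
		    const struct toy_cpumask *numa,
		    const struct toy_cpumask *idle,
		    struct toy_cpumask *tmp)
{
	const struct toy_cpumask *spans[] = { llc, numa, NULL };

	for (size_t i = 0; i < sizeof(spans) / sizeof(spans[0]); i++) {
		if (spans[i])
			toy_and(tmp, allowed, spans[i]);
		else
			*tmp = *allowed;	/* last step: no topology filter */

		if (toy_and(tmp, tmp, idle))
			return toy_first(tmp);
	}
	return -1;
}
```

The trade-off in this sketch is up to two extra "and" passes per pick in
exchange for holding a single scratch mask rather than two.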

Thanks,
-Andrea



