Re: [PATCH 1/6] sched_ext: idle: Extend topology optimizations to all tasks

Tejun Heo <tj@xxxxxxxxxx> · Tue, 18 Mar 2025 07:31:50 -1000

Hello,

On Tue, Mar 18, 2025 at 08:31:29AM +0100, Andrea Righi wrote:
> On Mon, Mar 17, 2025 at 08:22:35AM -1000, Tejun Heo wrote:
> ...
> > > +	/*
> > > +	 * If the task is allowed to run on all CPUs, simply use the
> > > +	 * architecture's cpumask directly. Otherwise, compute the
> > > +	 * intersection of the architecture's cpumask and the task's
> > > +	 * allowed cpumask.
> > > +	 */
> > > +	if (!cpus || p->nr_cpus_allowed >= num_possible_cpus() ||
> > > +	    cpumask_subset(cpus, p->cpus_ptr))
> > > +		return cpus;
> > > +
> > > +	if (!cpumask_equal(cpus, p->cpus_ptr) &&
> > 
> > Hmm... isn't this covered by the preceding cpumask_subset() test? Here, cpus
> > is not a subset of p->cpus_ptr, so how can it be the same as p->cpus_ptr?
> > 
> > > +	    cpumask_and(local_cpus, cpus, p->cpus_ptr))
> > > +		return local_cpus;
> > > +
> > > +	return NULL;
> 
> Also, I'm wondering if there's really a benefit in checking for
> cpumask_subset() and then doing cpumask_and() only when it's needed, or if
> we should just do cpumask_and(). It's true that we can save some writes,
> but they're done on a temporary local per-CPU cpumask, so they shouldn't
> introduce cache contention.

Yeah, I can imagine it going either way, so no strong preference.
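
For illustration, the two options could be modeled in userspace roughly as
below. This is only a sketch under assumptions: mask_t, mask_subset(),
mask_and(), pick_with_subset() and pick_always_and() are made-up stand-ins
(a 64-bit word in place of a real cpumask), not kernel APIs, and the
nr_cpus_allowed shortcut is left out. The redundant cpumask_equal() test is
dropped as well, since a mask that is not a subset cannot be equal.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a cpumask: one bit per CPU, at most 64 CPUs. */
typedef uint64_t mask_t;

/* True if every bit set in a is also set in b. */
static bool mask_subset(mask_t a, mask_t b)
{
	return (a & ~b) == 0;
}

/* Store a & b in *dst; return true if the result is non-empty. */
static bool mask_and(mask_t *dst, mask_t a, mask_t b)
{
	assert(dst != NULL);
	*dst = a & b;
	return *dst != 0;
}

/* Option 1: skip the write to the scratch mask when cpus is already usable. */
static const mask_t *pick_with_subset(const mask_t *cpus,
				      const mask_t *allowed, mask_t *local)
{
	if (!cpus || mask_subset(*cpus, *allowed))
		return cpus;
	if (mask_and(local, *cpus, *allowed))
		return local;
	return NULL;
}

/* Option 2: always compute the intersection into the scratch mask. */
static const mask_t *pick_always_and(const mask_t *cpus,
				     const mask_t *allowed, mask_t *local)
{
	if (!cpus)
		return NULL;
	if (mask_and(local, *cpus, *allowed))
		return local;
	return NULL;
}
```

For a non-empty cpus, both variants end up describing the same set of CPUs;
option 1 just hands back the original pointer when no intersection is
needed. One difference in this model: an empty cpus is trivially a subset,
so option 1 returns it as-is while option 2 returns NULL.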

Thanks.

-- 
tejun