Re: [PATCH 1/6] sched_ext: idle: Extend topology optimizations to all tasks

Andrea Righi <arighi@xxxxxxxxxx> · Tue, 18 Mar 2025 08:31:29 +0100

On Mon, Mar 17, 2025 at 08:22:35AM -1000, Tejun Heo wrote:
...
> > +	/*
> > +	 * If the task is allowed to run on all CPUs, simply use the
> > +	 * architecture's cpumask directly. Otherwise, compute the
> > +	 * intersection of the architecture's cpumask and the task's
> > +	 * allowed cpumask.
> > +	 */
> > +	if (!cpus || p->nr_cpus_allowed >= num_possible_cpus() ||
> > +	    cpumask_subset(cpus, p->cpus_ptr))
> > +		return cpus;
> > +
> > +	if (!cpumask_equal(cpus, p->cpus_ptr) &&
> 
> Hmm... isn't this covered by the preceding cpumask_subset() test? Here, cpus
> is not a subset of p->cpus_ptr, so how can it be the same as p->cpus_ptr?
> 
> > +	    cpumask_and(local_cpus, cpus, p->cpus_ptr))
> > +		return local_cpus;
> > +
> > +	return NULL;

Also, I'm wondering if there's really a benefit to checking
cpumask_subset() first and doing cpumask_and() only when it's needed, or
if we should just always do cpumask_and(). It's true that the check can
save some writes, but those writes go to a temporary local per-CPU
cpumask, so they shouldn't introduce cache contention.
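
To make the comparison concrete, here is a rough userspace sketch, not the
actual patch code. It assumes a 64-bit word as a stand-in for struct
cpumask; mask_and() and mask_subset() only model cpumask_and() and
cpumask_subset(), and task_mask_checked() / task_mask_simple() are
hypothetical names. The nr_cpus_allowed shortcut is left out for brevity.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-in for struct cpumask: one bit per CPU, up to 64 CPUs. */
typedef uint64_t mask_t;

/* Models cpumask_and(): *dst = *a & *b, returns true if the result is non-empty. */
static bool mask_and(mask_t *dst, const mask_t *a, const mask_t *b)
{
	*dst = *a & *b;
	return *dst != 0;
}

/* Models cpumask_subset(): true if every bit set in *a is also set in *b. */
static bool mask_subset(const mask_t *a, const mask_t *b)
{
	return (*a & ~*b) == 0;
}

/*
 * Variant with the subset shortcut: return the topology mask itself when
 * it is fully contained in the allowed mask, otherwise intersect into the
 * scratch mask.
 */
static const mask_t *task_mask_checked(const mask_t *cpus,
				       const mask_t *allowed, mask_t *local)
{
	assert(allowed && local);

	if (!cpus || mask_subset(cpus, allowed))
		return cpus;

	if (mask_and(local, cpus, allowed))
		return local;

	return NULL;
}

/*
 * Simplified variant: always intersect into the scratch mask. This costs
 * one unconditional write to 'local', but removes the extra test.
 */
static const mask_t *task_mask_simple(const mask_t *cpus,
				      const mask_t *allowed, mask_t *local)
{
	assert(allowed && local);

	if (!cpus)
		return NULL;

	if (mask_and(local, cpus, allowed))
		return local;

	return NULL;
}
```

In this model both variants yield the same set of CPUs (or NULL when the
intersection is empty). The only difference is that the simplified one
always returns the scratch mask instead of the topology mask.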

-Andrea