On 23 Dec 2024, at 4:55, Rik van Riel <riel@xxxxxxxxxxx> wrote:

> +static int mm_active_cpus(struct mm_struct *mm)
> +{
> +	int count = 0;
> +	int cpu;
> +
> +	for_each_cpu(cpu, mm_cpumask(mm)) {
> +		/* Skip the CPUs that aren't really running this process. */
> +		if (per_cpu(cpu_tlbstate.loaded_mm, cpu) != mm)
> +			continue;
> +
> +		if (per_cpu(cpu_tlbstate_shared.is_lazy, cpu))
> +			continue;
> +
> +		count++;
> +	}
> +	return count;
> +}

Since you are only interested in whether the number of "mm active CPUs"
is greater than a certain threshold, don't you want to add some checks
for early termination? That would avoid bouncing the cpu_tlbstate
cachelines back and forth between CPUs.

For instance, run cpumask_weight() first: if the weight is lower than
the threshold, there is no need to loop at all. Similarly, if the
threshold has already been crossed inside the loop, there is no need
for more iterations.
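To make the suggestion concrete, here is a minimal sketch of both early exits. It is a userspace model, not kernel code: struct mm_model, loaded_mm[], is_lazy[], mask_weight() and mm_active_cpus_exceeds() are hypothetical stand-ins for mm_cpumask(), the per-CPU cpu_tlbstate fields and cpumask_weight(). It also assumes the caller wants "strictly greater than threshold".

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NR_CPUS 64	/* size of this toy model, not a kernel value */

/* Hypothetical stand-in for struct mm_struct; only the CPU mask matters. */
struct mm_model {
	uint64_t cpumask;	/* plays the role of mm_cpumask(mm) */
};

/* Stand-ins for per-CPU cpu_tlbstate.loaded_mm and cpu_tlbstate_shared.is_lazy. */
static const struct mm_model *loaded_mm[NR_CPUS];
static bool is_lazy[NR_CPUS];

/* Population count, playing the role of cpumask_weight(). */
static int mask_weight(uint64_t mask)
{
	int n = 0;

	while (mask) {
		mask &= mask - 1;	/* clear the lowest set bit */
		n++;
	}
	return n;
}

/* Return true if more than @threshold CPUs are actively running @mm. */
static bool mm_active_cpus_exceeds(const struct mm_model *mm, int threshold)
{
	int count = 0;

	assert(threshold >= 0);

	/*
	 * Early exit 1: the mask weight is an upper bound on the active
	 * count, so no per-CPU state needs to be touched if it is too low.
	 */
	if (mask_weight(mm->cpumask) <= threshold)
		return false;

	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		if (!(mm->cpumask & (UINT64_C(1) << cpu)))
			continue;

		/* Skip the CPUs that aren't really running this process. */
		if (loaded_mm[cpu] != mm)
			continue;

		if (is_lazy[cpu])
			continue;

		/* Early exit 2: stop as soon as the threshold is crossed. */
		if (++count > threshold)
			return true;
	}
	return false;
}
```

Returning a bool instead of the count is a design choice of this sketch: it makes both exits natural, at the cost of changing the interface that mm_active_cpus() has in the patch.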