On Wed, Jan 27 2021 at 10:09, Marcelo Tosatti wrote:
> On Wed, Jan 27, 2021 at 12:36:30PM +0000, Robin Murphy wrote:
>> > > >    /**
>> > > >     * cpumask_next - get the next cpu in a cpumask
>> > > > @@ -205,22 +206,27 @@ void __init free_bootmem_cpumask_var(cpumask_var_t mask)
>> > > >     */
>> > > >    unsigned int cpumask_local_spread(unsigned int i, int node)
>> > > >    {
>> > > > -	int cpu;
>> > > > +	int cpu, hk_flags;
>> > > > +	const struct cpumask *mask;
>> > > > +	hk_flags = HK_FLAG_DOMAIN | HK_FLAG_MANAGED_IRQ;
>> > > > +	mask = housekeeping_cpumask(hk_flags);
>> > >
>> > > AFAICS, this generally resolves to something based on cpu_possible_mask
>> > > rather than cpu_online_mask as before, so could now potentially return an
>> > > offline CPU. Was that an intentional change?
>> >
>> > Robin,
>> >
>> > AFAICS online CPUs should be filtered.
>>
>> Apologies if I'm being thick, but can you explain how? In the case of
>> isolation being disabled or compiled out, housekeeping_cpumask() is
>> literally just "return cpu_possible_mask;". If we then iterate over that
>> with for_each_cpu() and just return the i'th possible CPU (e.g. in the
>> NUMA_NO_NODE case), what guarantees that CPU is actually online?
>>
>> Robin.
>
> Nothing, but that was the situation before 1abdfe706a579a702799fce465bceb9fb01d407c
> as well.
>
> cpumask_local_spread() should probably be disabling CPU hotplug.

It can't unless all callers are from preemptible code.

Aside from that, this whole frenzy to sprinkle housekeeping_cpumask() all
over the kernel is just wrong, really.

As I explained several times before, there are very valid reasons for
having queues and interrupts on isolated CPUs. Just optimizing for the
use cases some people care about is not making anything better.

Thanks,

        tglx