On Thu, Jan 28, 2021 at 05:02:41PM +0100, Thomas Gleixner wrote:
> On Wed, Jan 27 2021 at 09:19, Marcelo Tosatti wrote:
> > On Wed, Jan 27, 2021 at 11:57:16AM +0000, Robin Murphy wrote:
> >> > +	hk_flags = HK_FLAG_DOMAIN | HK_FLAG_MANAGED_IRQ;
> >> > +	mask = housekeeping_cpumask(hk_flags);
> >>
> >> AFAICS, this generally resolves to something based on cpu_possible_mask
> >> rather than cpu_online_mask as before, so could now potentially return an
> >> offline CPU. Was that an intentional change?
> >
> > Robin,
> >
> > AFAICS online CPUs should be filtered.
>
> The whole pile wants to be reverted. It's simply broken in several ways.

I was asking for your comments on interaction with CPU hotplug :-)
Anyway...

So housekeeping_cpumask has multiple meanings. In this case:

HK_FLAG_DOMAIN | HK_FLAG_MANAGED_IRQ

     domain
       Isolate from the general SMP balancing and scheduling
       algorithms. Note that performing domain isolation this way
       is irreversible: it's not possible to bring back a CPU to
       the domains once isolated through isolcpus. It's strongly
       advised to use cpusets instead to disable scheduler load
       balancing through the "cpuset.sched_load_balance" file.
       It offers a much more flexible interface where CPUs can
       move in and out of an isolated set anytime.

       You can move a process onto or off an "isolated" CPU via
       the CPU affinity syscalls or cpuset.
       <cpu number> begins at 0 and the maximum value is
       "number of CPUs in system - 1".

     managed_irq
       Isolate from being targeted by managed interrupts
       which have an interrupt mask containing isolated
       CPUs. The affinity of managed interrupts is
       handled by the kernel and cannot be changed via
       the /proc/irq/* interfaces.

       This isolation is best effort and only effective
       if the automatically assigned interrupt mask of a
       device queue contains isolated and housekeeping
       CPUs. If housekeeping CPUs are online then such
       interrupts are directed to the housekeeping CPU
       so that IO submitted on the housekeeping CPU
       cannot disturb the isolated CPU.

       If a queue's affinity mask contains only isolated
       CPUs then this parameter has no effect on the
       interrupt routing decision, though interrupts are
       only delivered when tasks running on those
       isolated CPUs submit IO. IO submitted on
       housekeeping CPUs has no influence on those
       queues.

So as long as the meaning of the flags is respected, this seems
alright.

Nitesh, is there anything preventing this from being fixed
in userspace? (as Thomas suggested previously).
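
To make the two points above concrete (filtering the housekeeping mask
against online CPUs, and the managed_irq routing rule quoted from the
documentation), here is a rough userspace model. It is only a sketch:
cpumask_model_t, first_online_housekeeping_cpu() and effective_irq_mask()
are hypothetical names invented for illustration, the masks are plain
64-bit words rather than struct cpumask, and none of this is the actual
kernel code.

```c
#include <stdint.h>

/* Hypothetical model: one bit per CPU, bit n set means CPU n is in the mask. */
typedef uint64_t cpumask_model_t;

/*
 * Return the lowest-numbered CPU that is both housekeeping and online,
 * or -1 if there is none. Intersecting with the online mask is what
 * prevents an offline housekeeping CPU from being returned.
 */
static int first_online_housekeeping_cpu(cpumask_model_t housekeeping,
					 cpumask_model_t online)
{
	cpumask_model_t candidates = housekeeping & online;

	for (int cpu = 0; cpu < 64; cpu++) {
		if (candidates & ((cpumask_model_t)1 << cpu))
			return cpu;
	}
	return -1;
}

/*
 * Model of the managed_irq rule as described in the quoted text:
 * if the queue's mask contains an online housekeeping CPU, the
 * interrupt is directed there; otherwise the parameter has no effect
 * and the queue's own online CPUs are used.
 */
static cpumask_model_t effective_irq_mask(cpumask_model_t queue_affinity,
					  cpumask_model_t housekeeping,
					  cpumask_model_t online)
{
	cpumask_model_t hk_online = queue_affinity & housekeeping & online;

	if (hk_online)
		return hk_online;
	return queue_affinity & online;
}
```

With housekeeping CPUs 0-1 and all of CPUs 0-3 online, a queue spanning
CPUs 1-2 is steered to CPU 1 in this model, while a queue spanning only
CPUs 2-3 keeps its own mask, which is the "no effect" case from the
documentation.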