The housekeeping CPU masks, set up by the "isolcpus" and "nohz_full"
boot command line options, are used at boot time to exclude selected
CPUs from running some kernel housekeeping subsystems, to minimize
disturbance to latency-sensitive userspace applications such as DPDK.
These options can only be changed with a reboot. This is a problem for
containerized workloads running on OpenShift/Kubernetes, where a mix of
low-latency and "normal" workloads can be created/destroyed dynamically
and the number of CPUs allocated to each workload is often not known at
boot time.

Theoretically, complete CPU offlining/onlining could be used for
housekeeping adjustments, but this approach is not practical. Telco
companies use Linux to run DPDK in OpenShift/Kubernetes containers.
DPDK requires isolated CPUs to run real-time processes. Kubernetes
manages the allocation of resources for containers. Unfortunately,
Kubernetes does not support dynamic CPU offlining/onlining and is not
planning to support it:
https://github.com/kubernetes/kubernetes/issues/67500
Addressing this issue at the application level appears to be even less
straightforward than addressing it at the kernel level.

This series of patches is based on the series
"isolation: Exclude dynamically isolated CPUs from housekeeping masks":
https://lore.kernel.org/lkml/20240821142312.236970-1-longman@xxxxxxxxxx/
Its purpose is to exclude dynamically isolated CPUs from some
housekeeping masks, so that subsystems that check the housekeeping
masks at run time will not use those isolated CPUs.

However, some subsystems may still use stale housekeeping CPU masks.
Therefore, to prevent the use of these isolated CPUs, it is necessary
to explicitly propagate changes of the housekeeping masks to all
subsystems that depend on them.

Signed-off-by: Costa Shulyupin <costa.shul@xxxxxxxxxx>

---

Changes in v3:
- Address the comments by Thomas Gleixner.

Changes in v2:
- Focus in this patch series on managed interrupts only.
- https://lore.kernel.org/lkml/20240916122044.3056787-1-costa.shul@xxxxxxxxxx/

Changes in v1:
- https://lore.kernel.org/lkml/20240516190437.3545310-1-costa.shul@xxxxxxxxxx/

References:
- Linux Kernel Dynamic CPU Isolation:
  https://pretalx.com/devconf-us-2024/talk/AZBQLE/

Costa Shulyupin (3):
  sched/isolation: Add infrastructure for dynamic CPU isolation
  DO NOT MERGE: test for managed irqs adjustment
  genirq/cpuhotplug: Adjust managed irqs according to change of housekeeping CPU

 block/blk-mq.c           |  19 +++++++
 include/linux/blk-mq.h   |   2 +
 include/linux/cpu.h      |   4 ++
 include/linux/irq.h      |   2 +
 kernel/cgroup/cpuset.c   |   1 +
 kernel/cpu.c             |   2 +-
 kernel/irq/cpuhotplug.c  |  99 +++++++++++++++++++++++++++++++++
 kernel/sched/isolation.c |  51 +++++++++++++++--
 tests/managed_irq.sh     | 116 +++++++++++++++++++++++++++++++++++++++
 9 files changed, 291 insertions(+), 5 deletions(-)
 create mode 100755 tests/managed_irq.sh

--
2.47.0