Hi Costa,

On Mon, Jan 13, 2025 at 09:08:54PM +0200, Costa Shulyupin wrote:
> Changing and using housekeeping and isolated CPUs requires reboot.
>
> The goal is to change CPU isolation dynamically without reboot.
>
> This patch is based on the parent patch
> cgroup/cpuset: Exclude isolated CPUs from housekeeping CPU masks
> https://lore.kernel.org/lkml/20240821142312.236970-3-longman@xxxxxxxxxx/
> Its purpose is to update isolation cpumasks.
>
> However, some subsystems may use outdated housekeeping CPU masks. To
> prevent the use of these isolated CPUs, it is essential to explicitly
> propagate updates to the housekeeping masks across all subsystems that
> depend on them.
>
> This patch is not intended to be merged and disrupt the kernel.
> It is still a proof-of-concept for research purposes.
>
> The questions are:
> - Is this the right direction, or should I explore an alternative approach?
> - What factors need to be considered?
> - Any suggestions or advice?
> - Have similar attempts been made before?

Since the kthreads preferred affinity patchset just got merged, I don't
think anything needs to be done anymore on the kthreads front to toggle
nohz_full through cpusets safely. Let's see the details below:

> Update the affinity of kthreadd and trigger the recalculation of kthread
> affinities using kthreads_online_cpu().
>
> The argument passed to kthreads_online_cpu() is irrelevant, as the
> function reassigns affinities of kthreads based on their
> preferred_affinity and the updated housekeeping_cpumask(HK_TYPE_KTHREAD).
>
> Currently only RCU uses kthread_affine_preferred().
>
> I dare to try calling kthread_affine_preferred() from kthread_run() to
> set preferred_affinity as cpu_possible_mask for kthreads without a
> specific affinity, enabling their management through
> kthreads_online_cpu().
>
> Any objections?
> For details about kthreads affinity patterns please see:
> https://lore.kernel.org/lkml/20241211154035.75565-16-frederic@xxxxxxxxxx/
>
> Signed-off-by: Costa Shulyupin <costa.shul@xxxxxxxxxx>
> ---
>  include/linux/kthread.h | 5 ++++-
>  kernel/cgroup/cpuset.c  | 1 +
>  kernel/kthread.c        | 6 ++++++
>  3 files changed, 11 insertions(+), 1 deletion(-)
>
> diff --git a/include/linux/kthread.h b/include/linux/kthread.h
> index 8d27403888ce9..b43c5aeb2cfd7 100644
> --- a/include/linux/kthread.h
> +++ b/include/linux/kthread.h
> @@ -52,8 +52,10 @@ bool kthread_is_per_cpu(struct task_struct *k);
>  ({									\
>  	struct task_struct *__k						\
>  		= kthread_create(threadfn, data, namefmt, ## __VA_ARGS__); \
> -	if (!IS_ERR(__k))						\
> +	if (!IS_ERR(__k)) {						\
> +		kthread_affine_preferred(__k, cpu_possible_mask);	\
>  		wake_up_process(__k);					\
> +	}								\
>  	__k;								\
>  })
>
> @@ -270,4 +272,5 @@ struct cgroup_subsys_state *kthread_blkcg(void);
>  #else
>  static inline void kthread_associate_blkcg(struct cgroup_subsys_state *css) { }
>  #endif
> +void kthreads_update_affinity(void);
>  #endif /* _LINUX_KTHREAD_H */
> diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
> index 65658a5c2ac81..7d71acc7f46b6 100644
> --- a/kernel/cgroup/cpuset.c
> +++ b/kernel/cgroup/cpuset.c
> @@ -1355,6 +1355,7 @@ static void update_isolation_cpumasks(bool isolcpus_updated)
>  	trl();
>  	ret = housekeeping_exlude_isolcpus(isolated_cpus, HOUSEKEEPING_FLAGS);
>  	WARN_ON_ONCE((ret < 0) && (ret != -EOPNOTSUPP));
> +	kthreads_update_affinity();

A few things to consider:

1) update_isolation_cpumasks() will be called with cpus_read_lock()
   (cf: sched_partition_write() and cpuset_write_resmask()), therefore
   kthreads_online_cpu() can't run concurrently.

2) The constraint to turn on/off a CPU as nohz_full will be that the
   target CPU is offline.
So when the target CPU will later become online, kthreads_online_cpu()
will take care of the newly modified housekeeping_mask() (which will be
visible because cpus_write_lock() is held) and apply the appropriate
effective affinity accordingly.

The only thing that might need to be done by update_isolation_cpumasks()
is to hold kthreads_hotplug_lock so that a subsequent call to
kthread_affine_preferred() sees the "freshest" update to
housekeeping_mask(). But even that may not be mandatory because the
concerned CPUs are offline and kthreads_online_cpu() will fix the race
when they boot.

Oh and the special case kthreadd might need a direct affinity update.

So the good news is that we have a lot of things sorted out to prepare
for that cpuset interface:

_ RCU NOCB is ready
_ kthreads are ready
_ timers are fine since we toggle only offline CPUs and
  timer_migration.c should work without changes.

Still some details need to be taken care of:

* scheduler (see the housekeeping_mask() references, especially the ilb
  which is my biggest worry; get_nohz_timer_target() shouldn't be an
  issue)
* posix cpu timers (make tick_dep unconditional?)
* kernel/watchdog.c (make proc_watchdog_cpumask() and
  lockup_detector_online_cpu() safe against update_isolation_cpumasks())
* workqueue unbound mask (just apply the new one?)
* some RCU tick_dep to handle (make them unconditional since they apply
  on slow path anyway?)
* other things? (grep for tick_nohz_full_cpu(), housekeeping_* and
  tick_dep_*)

But we are getting closer!

Thanks!