Re: [RFC PATCH v1] Add kthreads_update_affinity()

Hi Costa,

Le Mon, Jan 13, 2025 at 09:08:54PM +0200, Costa Shulyupin a écrit :
> Changing and using housekeeping and isolated CPUs requires reboot.
> 
> The goal is to change CPU isolation dynamically without reboot.
> 
> This patch is based on the parent patch
> cgroup/cpuset: Exclude isolated CPUs from housekeeping CPU masks
> https://lore.kernel.org/lkml/20240821142312.236970-3-longman@xxxxxxxxxx/
> Its purpose is to update isolation cpumasks.
> 
> However, some subsystems may use outdated housekeeping CPU masks. To
> prevent the use of these isolated CPUs, it is essential to explicitly
> propagate updates to the housekeeping masks across all subsystems that
> depend on them.
> 
> This patch is not intended to be merged and disrupt the kernel.
> It is still a proof-of-concept for research purposes.
> 
> The questions are:
> - Is this the right direction, or should I explore an alternative approach?
> - What factors need to be considered?
> - Any suggestions or advice?
> - Have similar attempts been made before?

Since the kthreads preferred affinity patchset just got merged,
I don't think anything more needs to be done on the kthreads front to toggle
nohz_full through cpusets safely. Let's see the details below:

> 
> Update the affinity of kthreadd and trigger the recalculation of kthread
> affinities using kthreads_online_cpu().
> 
> The argument passed to kthreads_online_cpu() is irrelevant, as the
> function reassigns affinities of kthreads based on their
> preferred_affinity and the updated housekeeping_cpumask(HK_TYPE_KTHREAD).
> 
> Currently only RCU uses kthread_affine_preferred().
> 
> I dare to try calling kthread_affine_preferred() from kthread_run() to
> set preferred_affinity as cpu_possible_mask for kthreads without a
> specific affinity, enabling their management through
> kthreads_online_cpu().
> 
> Any objections?
> 
> For details about kthreads affinity patterns please see:
> https://lore.kernel.org/lkml/20241211154035.75565-16-frederic@xxxxxxxxxx/
> 
> Signed-off-by: Costa Shulyupin <costa.shul@xxxxxxxxxx>
> ---
>  include/linux/kthread.h | 5 ++++-
>  kernel/cgroup/cpuset.c  | 1 +
>  kernel/kthread.c        | 6 ++++++
>  3 files changed, 11 insertions(+), 1 deletion(-)
> 
> diff --git a/include/linux/kthread.h b/include/linux/kthread.h
> index 8d27403888ce9..b43c5aeb2cfd7 100644
> --- a/include/linux/kthread.h
> +++ b/include/linux/kthread.h
> @@ -52,8 +52,10 @@ bool kthread_is_per_cpu(struct task_struct *k);
>  ({									   \
>  	struct task_struct *__k						   \
>  		= kthread_create(threadfn, data, namefmt, ## __VA_ARGS__); \
> -	if (!IS_ERR(__k))						   \
> +	if (!IS_ERR(__k)) {						   \
> +		kthread_affine_preferred(__k, cpu_possible_mask);	   \
>  		wake_up_process(__k);					   \
> +	}								   \
>  	__k;								   \
>  })
>  
> @@ -270,4 +272,5 @@ struct cgroup_subsys_state *kthread_blkcg(void);
>  #else
>  static inline void kthread_associate_blkcg(struct cgroup_subsys_state *css) { }
>  #endif
> +void kthreads_update_affinity(void);
>  #endif /* _LINUX_KTHREAD_H */
> diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
> index 65658a5c2ac81..7d71acc7f46b6 100644
> --- a/kernel/cgroup/cpuset.c
> +++ b/kernel/cgroup/cpuset.c
> @@ -1355,6 +1355,7 @@ static void update_isolation_cpumasks(bool isolcpus_updated)
>  	trl();
>  	ret = housekeeping_exlude_isolcpus(isolated_cpus, HOUSEKEEPING_FLAGS);
>  	WARN_ON_ONCE((ret < 0) && (ret != -EOPNOTSUPP));
> +	kthreads_update_affinity();

A few things to consider:

1) update_isolation_cpumasks() will be called with cpus_read_lock() held
   (cf: sched_partition_write() and cpuset_write_resmask()), therefore
   kthreads_online_cpu() can't run concurrently.

2) The constraint to turn on/off a CPU as nohz_full will be that the
   target CPU is offline.

So when the target CPU later comes back online, kthreads_online_cpu()
will take care of the newly modified housekeeping_cpumask() (which will be
visible because cpus_write_lock() is held) and apply the appropriate
effective affinity accordingly.

The only thing that might need to be done by update_isolation_cpumasks() is
to hold kthreads_hotplug_lock so that a subsequent call to
kthread_affine_preferred() sees the freshest update to housekeeping_cpumask().
But even that may not be mandatory, because the concerned CPUs are offline
and kthreads_online_cpu() will fix the race when they boot.

Oh, and kthreadd, as a special case, might need a direct affinity update.

So the good news is that we have a lot of things sorted out to prepare
for that cpuset interface:

_ RCU NOCB is ready
_ kthreads are ready
_ timers are fine since we toggle only offline CPUs and timer_migration.c
  should work without changes.

Still some details need to be taken care of:

* scheduler (see the housekeeping_cpumask() references, especially the ilb,
  which is my biggest worry; get_nohz_timer_target() shouldn't be an issue)
  
* posix cpu timers (make tick_dep unconditional ?)

* kernel/watchdog.c (make proc_watchdog_cpumask() and
  lockup_detector_online_cpu() safe against update_isolation_cpumasks())
  
* workqueue unbound mask (just apply the new one?)

* some RCU tick_deps to handle (make them unconditional since they
  only apply on slow paths anyway?)
  
* other things? (grep for tick_nohz_full_cpu(), housekeeping_* and tick_dep_*)

But we are getting closer!

Thanks!



