Hello,

On Wed, Jul 27, 2022 at 08:58:14PM -0400, Waiman Long wrote:
> It was found that any change to the current cpuset hierarchy may reset
> the cpus_allowed list of the tasks in the affected cpusets to the
> default cpuset value even if those tasks have cpus affinity explicitly
> set by the users before. That is especially easy to trigger under a
> cgroup v2 environment where writing "+cpuset" to the root cgroup's
> cgroup.subtree_control file will reset the cpus affinity of all the
> processes in the system.
>
> That is especially problematic in a nohz_full environment where the
> tasks running in the nohz_full CPUs usually have their cpus affinity
> explicitly set and will behave incorrectly if cpus affinity changes.
>
> Fix this problem by adding a flag in the task structure to indicate that
> a task has their cpus affinity explicitly set before and make cpuset
> code not to change their cpus_allowed list unless the user chosen cpu
> list is no longer a subset of the cpus_allowed list of the cpuset itself.
>
> With that change in place, it was verified that tasks that have its
> cpus affinity explicitly set will not be affected by changes made to
> the v2 cgroup.subtree_control files.

I think the underlying cause here is cpuset overwriting the cpumask the user
configured but that's a longer discussion.

> +/*
> + * Don't change the cpus_allowed list if cpus affinity has been explicitly
> + * set before unless the current cpu list is not a subset of the new cpu list.
> + */
> +static int cpuset_set_cpus_allowed_ptr(struct task_struct *p,
> +				       const struct cpumask *new_mask)
> +{
> +	if (p->cpus_affinity_set && cpumask_subset(p->cpus_ptr, new_mask))
> +		return 0;
> +
> +	p->cpus_affinity_set = 0;
> +	return set_cpus_allowed_ptr(p, new_mask);
> +}

I wonder whether the more predictable behavior would be always not resetting
the cpumask if it's a subset of the new_mask. Also, shouldn't this check
p->cpus_mask instead of p->cpus_ptr?

Thanks.

-- 
tejun
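
For illustration, the two update rules discussed above can be compared in a
small userspace sketch. This is not kernel code: toy_mask, struct toy_task,
toy_subset(), update_flagged() and update_subset_only() are hypothetical
names made up for the example, a 64-bit integer stands in for struct cpumask,
and the subset test is done against the task's own mask field on the
assumption that this is the mask the user configured.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy stand-in for struct cpumask: bit n set means CPU n is allowed. */
typedef uint64_t toy_mask;

/* Hypothetical, simplified task holding only the fields discussed above. */
struct toy_task {
	toy_mask cpus_mask;     /* affinity mask of the task */
	bool cpus_affinity_set; /* models the flag the patch adds */
};

/* True if every CPU in a is also in b (same idea as cpumask_subset()). */
static bool toy_subset(toy_mask a, toy_mask b)
{
	return (a & ~b) == 0;
}

/* Rule from the patch: keep the mask only if the flag is set and it fits. */
static void update_flagged(struct toy_task *p, toy_mask new_mask)
{
	if (p->cpus_affinity_set && toy_subset(p->cpus_mask, new_mask))
		return;

	p->cpus_affinity_set = false;
	p->cpus_mask = new_mask;
}

/* Suggested alternative: keep the mask whenever it fits, flag or not. */
static void update_subset_only(struct toy_task *p, toy_mask new_mask)
{
	if (toy_subset(p->cpus_mask, new_mask))
		return;

	p->cpus_mask = new_mask;
}
```

With a task mask of 0x3 and a new cpuset mask of 0xF, update_flagged()
leaves the mask alone only when cpus_affinity_set is true, while
update_subset_only() leaves it alone in both cases. Both rules fall back to
the new mask once the task's mask is no longer a subset of it.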