On 11/03/2025 15:51, Waiman Long wrote: > On 3/11/25 9:29 AM, Dietmar Eggemann wrote: >> On 11/03/2025 13:34, Waiman Long wrote: >>> On 3/11/25 7:59 AM, Juri Lelli wrote: >>>> On 10/03/25 20:16, Waiman Long wrote: >>>>> On 3/10/25 3:18 PM, Waiman Long wrote: >>>>>> On 3/10/25 2:54 PM, Dietmar Eggemann wrote: >>>>>>> On 10/03/2025 10:37, Juri Lelli wrote: [...] >> Testcase: suspend/resume >> >> on Arm64 big.LITTLE cpumask=[LITTLE][big]=[0,3-5][1-2] >> plus cmd line option 'isolcpus=3,4'. >> >> with Waiman's snippet: >> https://lkml.kernel.org/r/fd4d6143-9bd2-4a7c-80dc-1e19e4d1b2d1@xxxxxxxxxx >> >> ... >> [ 234.831675] --- > partition_sched_domains_locked() reset_domain=1 >> [ 234.835966] psci: CPU4 killed (polled 0 ms) >> [ 234.838912] Error taking CPU3 down: -16 >> [ 234.838952] Non-boot CPUs are not disabled >> [ 234.838986] Enabling non-boot CPUs ... >> ... >> >> IIRC, that's the old DL accounting issue. > > You are right. cpuhp_tasks_frozen will be set in the suspend/resume > case. In that case, we do need to add a cpuset helper to acquire the > cpuset_mutex. A test patch as follows (no testing done yet): > > diff --git a/include/linux/cpuset.h b/include/linux/cpuset.h > index c414daa7d503..ef1ffb9c52b0 100644 > --- a/include/linux/cpuset.h > +++ b/include/linux/cpuset.h > @@ -129,6 +129,7 @@ extern void dl_rebuild_rd_accounting(void); > extern void rebuild_sched_domains(void); > > extern void cpuset_print_current_mems_allowed(void); > +extern void cpuset_reset_sched_domains(void) > > /* > * read_mems_allowed_begin is required when making decisions involving > @@ -269,6 +270,11 @@ static inline void rebuild_sched_domains(void) > partition_sched_domains(1, NULL, NULL); > } > > +static inline void cpuset_reset_sched_domains(void) > +{ > + partition_sched_domains(1, NULL, NULL); > +} > + > static inline void cpuset_print_current_mems_allowed(void) > { > } > diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c > index 7995cd58a01b..a51099e5d587 100644 > --- a/kernel/cgroup/cpuset.c > +++ b/kernel/cgroup/cpuset.c > @@ -1076,6 +1076,13 @@ void rebuild_sched_domains(void) > cpus_read_unlock(); > } > > +void cpuset_reset_sched_domains(void) > +{ > + mutex_lock(&cpuset_mutex); > + partition_sched_domains(1, NULL, NULL); > + mutex_unlock(&cpuset_mutex); > +} > + > /** > * cpuset_update_tasks_cpumask - Update the cpumasks of tasks in the > cpuset. > * @cs: the cpuset in which each task's cpus_allowed mask needs to be > changed > diff --git a/kernel/sched/core.c b/kernel/sched/core.c > index 58593f4d09a1..dbf44ddbb6b4 100644 > --- a/kernel/sched/core.c > +++ b/kernel/sched/core.c > @@ -8183,7 +8183,7 @@ static void cpuset_cpu_active(void) > * operation in the resume sequence, just build a single > sched > * domain, ignoring cpusets. > */ > - partition_sched_domains(1, NULL, NULL); > + cpuset_reset_sched_domains(); > if (--num_cpus_frozen) > return; > /* > @@ -8202,7 +8202,7 @@ static void cpuset_cpu_inactive(unsigned int cpu) > cpuset_update_active_cpus(); > } else { > num_cpus_frozen++; > - partition_sched_domains(1, NULL, NULL); > + cpuset_reset_sched_domains(); > } > } This seems to work. But what about a !CONFIG_CPUSETS build. In this case we won't have this DL accounting update during suspend/resume since dl_rebuild_rd_accounting() is empty.