Peter Zijlstra <peterz@xxxxxxxxxxxxx> writes:

> On Thu, Oct 29, 2020 at 02:18:45PM -0400, Daniel Jordan wrote:
>> rebuild_sched_domains_locked() prevented the race during the cgroup2
>> cpuset series up until the Fixes commit changed its check. Make the
>> check more robust so that it can detect an offline CPU in any exclusive
>> cpuset's effective mask, not just the top one.
>
> *groan*, what a mess...

Ah, the joys of cpu hotplug!

>> I think the right thing to do long-term is make the hotplug work
>> synchronous, fixing the lockdep splats of past attempts, and then take
>> these checks out of rebuild_sched_domains_locked, but this fixes the
>> immediate issue and is small enough for stable.  Open to suggestions.
>>
>> Prateek, are you planning on picking up your patches again?
>
> Yeah, that might help, but those deadlocks were nasty iirc :/

It might end up being too invasive to be worth it, but I'm being
optimistic for now.

>> diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
>> index 57b5b5d0a5fd..ac3124010b2a 100644
>> --- a/kernel/cgroup/cpuset.c
>> +++ b/kernel/cgroup/cpuset.c
>> @@ -983,8 +983,10 @@ partition_and_rebuild_sched_domains(int ndoms_new, cpumask_var_t doms_new[],
>>   */
>>  static void rebuild_sched_domains_locked(void)
>>  {
>> +	struct cgroup_subsys_state *pos_css;
>>  	struct sched_domain_attr *attr;
>>  	cpumask_var_t *doms;
>> +	struct cpuset *cs;
>>  	int ndoms;
>>
>>  	lockdep_assert_cpus_held();
>> @@ -999,9 +1001,21 @@ static void rebuild_sched_domains_locked(void)
>>  	    !cpumask_equal(top_cpuset.effective_cpus, cpu_active_mask))
>>  		return;
>
> So you argued above that effective_cpus was stale, I suppose the above
> one works because its an equality test instead of a subset?

Yep, fortunately enough.

> Does that wants a comment?

Ok, I'll change the comments to this absent other ideas.

	/*
	 * If we have raced with CPU hotplug, return early to avoid
	 * passing doms with offlined cpu to partition_sched_domains().
	 * Anyways, cpuset_hotplug_workfn() will rebuild sched domains.
	 *
	 * With no CPUs in any subpartitions, top_cpuset's effective CPUs
	 * should be the same as the active CPUs, so checking only top_cpuset
	 * is enough to detect racing CPU offlines.
	 */
	if (!top_cpuset.nr_subparts_cpus &&
	    !cpumask_equal(top_cpuset.effective_cpus, cpu_active_mask))
		return;

	/*
	 * With subpartition CPUs, however, the effective CPUs of a partition
	 * root should be only a subset of the active CPUs.  Since a CPU in
	 * any partition root could be offlined, all must be checked.
	 */
	if (top_cpuset.nr_subparts_cpus) {
		rcu_read_lock();
		...

Thanks for looking.