On 14/06/18 15:18, Quentin Perret wrote:
> On Thursday 14 Jun 2018 at 16:11:18 (+0200), Juri Lelli wrote:
> > On 14/06/18 14:58, Quentin Perret wrote:
> >
> > [...]
> >
> > > Hmm not sure if this can help but I think that rebuild_sched_domains()
> > > does _not_ take the hotplug lock before calling partition_sched_domains()
> > > when CONFIG_CPUSETS=n. But it does take it for CONFIG_CPUSETS=y.
> >
> > Did you mean cpuset_mutex?
>
> Nope, I really meant the cpu_hotplug_lock !
>
> With CONFIG_CPUSETS=n, rebuild_sched_domains() calls
> partition_sched_domains() directly:
>
> https://elixir.bootlin.com/linux/latest/source/include/linux/cpuset.h#L255
>
> But with CONFIG_CPUSETS=y, rebuild_sched_domains() calls,
> rebuild_sched_domains_locked(), which calls get_online_cpus() which
> calls cpus_read_lock(), which does percpu_down_read(&cpu_hotplug_lock).
> And all that happens before calling partition_sched_domains().

Ah, right!

> So yeah, the point I was trying to make is that there is an inconsistency
> here, maybe for a good reason ? Maybe related to the issue you're seeing ?

The config that came with the 0day splat was indeed CONFIG_CPUSETS=n. So,
in this case IIUC we hit the !doms_new branch of partition_sched_domains,
which uses cpu_active_mask (and cpu_possible_mask indirectly). Should this
be still protected by the hotplug lock then?
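
To make the inconsistency concrete, here is a small userspace model of the
two call paths described above. It is only a sketch, not kernel code: every
model_* name is hypothetical, and a plain reader counter stands in for
cpu_hotplug_lock (a percpu rwsem in the kernel). It records whether the
"lock" is held at the moment the partition_sched_domains() stand-in runs.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for cpu_hotplug_lock: counts read-side holders. */
static int model_hotplug_readers;

/* Set by the partition stand-in: was the lock held when it was called? */
static bool model_called_with_lock;

/* Stand-in for cpus_read_lock(): take the hotplug lock for reading. */
static void model_cpus_read_lock(void)
{
    model_hotplug_readers++;
}

/* Stand-in for cpus_read_unlock(): drop the read side again. */
static void model_cpus_read_unlock(void)
{
    assert(model_hotplug_readers > 0);
    model_hotplug_readers--;
}

/* Stand-in for partition_sched_domains(): only records the lock state. */
static void model_partition_sched_domains(void)
{
    model_called_with_lock = (model_hotplug_readers > 0);
}

/* CONFIG_CPUSETS=n path: calls the partition step directly, no lock. */
static bool model_rebuild_cpusets_n(void)
{
    model_partition_sched_domains();
    return model_called_with_lock;
}

/* CONFIG_CPUSETS=y path: takes the read lock before the partition step. */
static bool model_rebuild_cpusets_y(void)
{
    model_cpus_read_lock();
    model_partition_sched_domains();
    model_cpus_read_unlock();
    return model_called_with_lock;
}
```

In this model, model_rebuild_cpusets_n() reports that the partition step ran
without the lock, while model_rebuild_cpusets_y() reports that it ran with
it. Whether the unlocked path is actually a bug in the kernel is exactly the
open question in this thread; the model only restates the asymmetry.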