The following commit has been merged into the sched/core branch of tip: Commit-ID: 459b09b5a3254008b63382bf41a9b36d0b590f57 Gitweb: https://git.kernel.org/tip/459b09b5a3254008b63382bf41a9b36d0b590f57 Author: Valentin Schneider <valentin.schneider@xxxxxxx> AuthorDate: Tue, 18 May 2021 14:07:25 +01:00 Committer: Peter Zijlstra <peterz@xxxxxxxxxxxxx> CommitterDate: Mon, 28 Jun 2021 15:42:24 +02:00 sched/debug: Don't update sched_domain debug directories before sched_debug_init() Since CPU capacity asymmetry can stem purely from maximum frequency differences (e.g. Pixel 1), a rebuild of the scheduler topology can be issued upon loading cpufreq, see: arch_topology.c::init_cpu_capacity_callback() Turns out that if this rebuild happens *before* sched_debug_init() is run (which is a late initcall), we end up messing up the sched_domain debug directory: passing a NULL parent to debugfs_create_dir() ends up creating the directory at the debugfs root, which in this case creates /sys/kernel/debug/domains (instead of /sys/kernel/debug/sched/domains). This currently doesn't happen on asymmetric systems which use cpufreq-scpi or cpufreq-dt drivers, as those are loaded via deferred_probe_initcall() (it is also a late initcall, but appears to be ordered *after* sched_debug_init()). Ionela has been working on detecting maximum frequency asymmetry via ACPI, and that actually happens via a *device* initcall, thus before sched_debug_init(), and causes the aforementionned debugfs mayhem. One option would be to punt sched_debug_init() down to fs_initcall_sync(). Preventing update_sched_domain_debugfs() from running before sched_debug_init() appears to be the safer option. Fixes: 3b87f136f8fc ("sched,debug: Convert sysctl sched_domains to debugfs") Signed-off-by: Valentin Schneider <valentin.schneider@xxxxxxx> Signed-off-by: Peter Zijlstra (Intel) <peterz@xxxxxxxxxxxxx> Link: http://lore.kernel.org/r/20210514095339.12979-1-ionela.voinescu@xxxxxxx --- kernel/sched/debug.c | 7 +++++++ 1 file changed, 7 insertions(+) diff --git a/kernel/sched/debug.c b/kernel/sched/debug.c index 0c5ec27..7e08e3d 100644 --- a/kernel/sched/debug.c +++ b/kernel/sched/debug.c @@ -388,6 +388,13 @@ void update_sched_domain_debugfs(void) { int cpu, i; + /* + * This can unfortunately be invoked before sched_debug_init() creates + * the debug directory. Don't touch sd_sysctl_cpus until then. + */ + if (!debugfs_sched) + return; + if (!cpumask_available(sd_sysctl_cpus)) { if (!alloc_cpumask_var(&sd_sysctl_cpus, GFP_KERNEL)) return;