Hello,

I have seen a race condition in the cpufreq driver.

cpu2 is being hot-plugged out. Say this starts at the 20th msec:

-000|context_switch(inline)
-000|need_resched()
-001|preempt_schedule(inline)
-001|preempt_schedule()
-002|static_key_false(inline)
-002|trace_sched_cpu_hotplug(inline)
-002|cpu_down(cpu = 2, ?)
-003|cpu_down(cpu = 2)
-004|update_offline_cores(?)
-005|do_hotplug(?)
-006|kthread(_create = 0xEE85BEBC)

cpu1 is updating the governor at, say, the 60th msec:

echo "some_governor" > /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor

At the 60th msec, cpu2 is already hot-plugged out, but CPU_POST_DEAD has
not been called yet, because __cpu_down was scheduled out in
cpu_hotplug_done() (while unlocking the mutex).

Now store_scaling_governor calls cpufreq_set_policy, and
CPUFREQ_GOV_START iterates through all CPUs in policy->cpus. That mask
is stale at this point: cpu2 is already hot-plugged out (DEAD) but has
not yet been removed from policy->cpus.

One suggestion is to use CPU_DEAD instead of CPU_POST_DEAD in cpufreq.c:

diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index 644b54e..5fdaf06 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -2325,7 +2325,7 @@ static int cpufreq_cpu_callback(struct notifier_block *nfb,
 			__cpufreq_remove_dev_prepare(dev, NULL);
 			break;

-		case CPU_POST_DEAD:
+		case CPU_DEAD:
 			__cpufreq_remove_dev_finish(dev, NULL);
 			break;

Or add a mutex to serialize the two contexts.

I would appreciate your comments.

Thanks,
Arun
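For discussion, the second option (a mutex that serializes the hotplug teardown against the governor change) can be sketched as a small userspace model. This is not kernel code: struct policy_model, model_cpu_down() and model_governor_start() are hypothetical names, pthreads stands in for the kernel mutex, and the two bool arrays only approximate policy->cpus and the online mask. It shows the intended invariant only: the governor path must never observe a CPU that is offline but still present in policy->cpus.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

#define NR_CPUS 4

/* Hypothetical userspace stand-in for the relevant policy state. */
struct policy_model {
	pthread_mutex_t lock;	/* serializes hotplug teardown and governor start */
	bool cpus[NR_CPUS];	/* models policy->cpus */
	bool online[NR_CPUS];	/* models the online CPU mask */
};

static void policy_init(struct policy_model *p)
{
	pthread_mutex_init(&p->lock, NULL);
	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		p->cpus[cpu] = true;
		p->online[cpu] = true;
	}
}

/*
 * Models the hot-unplug path. The CPU goes offline and is dropped from
 * the policy mask inside one critical section, so being scheduled out
 * between the two steps no longer exposes a stale mask.
 */
static void model_cpu_down(struct policy_model *p, int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);

	pthread_mutex_lock(&p->lock);
	p->online[cpu] = false;
	/* A preemption here is harmless: the lock is still held. */
	p->cpus[cpu] = false;
	pthread_mutex_unlock(&p->lock);
}

/*
 * Models the governor start path walking the policy mask. Returns the
 * number of CPUs that are in the mask but already offline; with the
 * lock taken on both sides this should always be 0.
 */
static int model_governor_start(struct policy_model *p)
{
	int stale = 0;

	pthread_mutex_lock(&p->lock);
	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		if (p->cpus[cpu] && !p->online[cpu])
			stale++;
	}
	pthread_mutex_unlock(&p->lock);

	return stale;
}
```

In the real driver the lock would also have to cover the window up to __cpufreq_remove_dev_finish(), and its ordering against the hotplug lock would need checking, which this model does not attempt to capture.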