[Reinstating the original Cc list] On 01/19/2012 09:50 PM, Mel Gorman wrote:> > On a different x86-64 machines with an intel-specific MCE, I have > also noted that the value of num_online_cpus() can change while > stop_machine() is running. That is expected and intentional right? Meaning, it is during the stop_machine() thing itself that a CPU is actually taken offline. And at the same time, it is removed from the cpu_online_mask. On Intel boxes, essentially, the following gets executed on the dying CPU, as set up by the stop_machine stuff. __cpu_disable() native_cpu_disable() cpu_disable_common() remove_cpu_from_maps() set_cpu_online(cpu, false) ^^^^^^ So, set_cpu_online will remove this CPU from the cpu_online_mask. And all this runs while still under the stop machine context. And this is exactly what we want right? > This is sensitive to timing and part of > the problem seems to be due to cmci_rediscover() running without the > CPU hotplug mutex held. This is not related to the IPI mess and is > unrelated to memory pressure but is just to note that CPU hotplug in > general can be fragile in parts. > For the cmci_rediscover() part, I feel a simple get/put_online_cpus() around it should work. Something like the following patch? (It is untested at the moment, but I will run it later and see if it works well). I would like the opinion of MCE/Intel maintainers as to whether this is a proper fix or something else would have been better.. ---- From: Srivatsa S. Bhat <srivatsa.bhat@xxxxxxxxxxxxxxxxxx> Subject: [PATCH] x86/intel mce: Fix race with CPU hotplug in cmci_rediscover() cmci_rediscover() is invoked upon the CPU_POST_DEAD notification, when the cpu_hotplug lock is no longer held. And cmci_rediscover() iterates over all the online cpus. So this can race with an ongoing CPU hotplug operation. Fix this by wrapping the iteration code within the pair get_online_cpus() / put_online_cpus(). Reported-by: Mel Gorman <mgorman@xxxxxxx> Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@xxxxxxxxxxxxxxxxxx> --- arch/x86/kernel/cpu/mcheck/mce_intel.c | 3 +++ 1 files changed, 3 insertions(+), 0 deletions(-) diff --git a/arch/x86/kernel/cpu/mcheck/mce_intel.c b/arch/x86/kernel/cpu/mcheck/mce_intel.c index 38e49bc..1c30397 100644 --- a/arch/x86/kernel/cpu/mcheck/mce_intel.c +++ b/arch/x86/kernel/cpu/mcheck/mce_intel.c @@ -10,6 +10,7 @@ #include <linux/interrupt.h> #include <linux/percpu.h> #include <linux/sched.h> +#include <linux/cpu.h> #include <asm/apic.h> #include <asm/processor.h> #include <asm/msr.h> @@ -179,6 +180,7 @@ void cmci_rediscover(int dying) return; cpumask_copy(old, ¤t->cpus_allowed); + get_online_cpus(); for_each_online_cpu(cpu) { if (cpu == dying) continue; @@ -188,6 +190,7 @@ void cmci_rediscover(int dying) if (cmci_supported(&banks)) cmci_discover(banks, 0); } + put_online_cpus(); set_cpus_allowed_ptr(current, old); free_cpumask_var(old); -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html