On Tue, Nov 10, 2015 at 10:23:56AM +0900, AKASHI Takahiro wrote:
> On 11/07/2015 04:14 AM, Geoff Levand wrote:
> >From: AKASHI Takahiro <takahiro.akashi at linaro.org>
> >
> >kdump calls machine_crash_shutdown() to shut down non-boot cpus and
> >save registers' status in per-cpu ELF notes before starting the crash
> >dump kernel. See kernel_kexec().
> >
> >ipi_cpu_stop() is a bit modified and used to support this behavior.
>
> I've got some concerns about using ipi_cpu_stop().
>
> >Signed-off-by: AKASHI Takahiro <takahiro.akashi at linaro.org>
> >---
> > arch/arm64/include/asm/kexec.h    | 34 +++++++++++++++++++++++++++++++++-
> > arch/arm64/kernel/machine_kexec.c | 31 +++++++++++++++++++++++++++++--
> > arch/arm64/kernel/smp.c           | 16 ++++++++++++++--
> > 3 files changed, 76 insertions(+), 5 deletions(-)

[...]

> >diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
> >index dbdaacd..88aec66 100644
> >--- a/arch/arm64/kernel/smp.c
> >+++ b/arch/arm64/kernel/smp.c
> >@@ -37,6 +37,7 @@
> > #include <linux/completion.h>
> > #include <linux/of.h>
> > #include <linux/irq_work.h>
> >+#include <linux/kexec.h>
> >
> > #include <asm/alternative.h>
> > #include <asm/atomic.h>
> >@@ -54,6 +55,8 @@
> > #include <asm/ptrace.h>
> > #include <asm/virt.h>
> >
> >+#include "cpu-reset.h"
> >+
> > #define CREATE_TRACE_POINTS
> > #include <trace/events/ipi.h>
> >
> >@@ -679,8 +682,12 @@ static DEFINE_RAW_SPINLOCK(stop_lock);
> > /*
> >  * ipi_cpu_stop - handle IPI from smp_send_stop()
> >  */
> >-static void ipi_cpu_stop(unsigned int cpu)
> >+static void ipi_cpu_stop(unsigned int cpu, struct pt_regs *regs)
> > {
> >+#ifdef CONFIG_KEXEC
> >+	/* printing messages may slow down the shutdown. */
> >+	if (!in_crash_kexec)
> >+#endif
> > 	if (system_state == SYSTEM_BOOTING ||
> > 	    system_state == SYSTEM_RUNNING) {
> > 		raw_spin_lock(&stop_lock);
> >@@ -693,6 +700,11 @@ static void ipi_cpu_stop(unsigned int cpu)
> >
> > 	local_irq_disable();
> >
> >+#ifdef CONFIG_KEXEC
> >+	if (in_crash_kexec)
> >+		crash_save_cpu(regs, cpu);
> >+#endif /* CONFIG_KEXEC */
> >+
> > 	while (1)
> > 		cpu_relax();
> > }
>
> cpu_relax() is defined as asm("yield"), and this puts all but the boot cpu into
> an infinite loop of nop (actually, whether nop or other depends on hw implementation).
> Thus all the secondary cpus are still running a busy loop even after the crash dump
> kernel has started up, and the chip can potentially get overheated.
> I ran into this situation when I tested the code on Hikey, and the system was
> forced to be shut down by the thermal driver.
>
> So I'd like to modify the code a bit like:
> 	if (in_crash_kexec) {
> 		crash_save_cpu(regs, cpu);
> 		while (1)
> 			asm("wfi"); /* irq is disabled here. */
> 	}
>
> Does this make sense?

It would be even better if we could hotplug them off.

Will