On Mon, Mar 21, 2016 at 01:29:28PM +0000, James Morse wrote:
> Hi!
>
> On 18/03/16 18:08, James Morse wrote:
> > On 14/03/16 17:48, Geoff Levand wrote:
> >> From: AKASHI Takahiro <takahiro.akashi at linaro.org>
> >>
> >> Primary kernel calls machine_crash_shutdown() to shut down non-boot cpus
> >> and save registers' status in per-cpu ELF notes before starting crash
> >> dump kernel. See kernel_kexec().
> >> Even if not all secondary cpus have shut down, we do kdump anyway.
> >>
> >> As we don't have to make non-boot(crashed) cpus offline (to preserve
> >> correct status of cpus at crash dump) before shutting down, this patch
> >> also adds a variant of smp_send_stop().
> >
> >> diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
> >> index b1adc51..76402c6cd 100644
> >> --- a/arch/arm64/kernel/smp.c
> >> +++ b/arch/arm64/kernel/smp.c
> >> @@ -701,6 +705,28 @@ static void ipi_cpu_stop(unsigned int cpu)
> >>  		cpu_relax();
> >>  }
> >>
> >> +static atomic_t waiting_for_crash_ipi;
> >> +
> >> +static void ipi_cpu_crash_stop(unsigned int cpu, struct pt_regs *regs)
> >> +{
> >> +	crash_save_cpu(regs, cpu);
> >> +
> >> +	raw_spin_lock(&stop_lock);
> >> +	pr_debug("CPU%u: stopping\n", cpu);
> >> +	raw_spin_unlock(&stop_lock);
> >> +
> >> +	atomic_dec(&waiting_for_crash_ipi);
> >> +
> >> +	local_irq_disable();
> >> +
> >> +	if (cpu_ops[cpu]->cpu_die)
> >> +		cpu_ops[cpu]->cpu_die(cpu);
> >> +
> >> +	/* just in case */
> >> +	while (1)
> >> +		wfi();
>
> Having thought about this some more: I don't think spinning like this is safe.
> We need to spin with the MMU turned off, otherwise this core will pollute the
> kdump kernel with TLB entries from the old page tables.

I think that wfi() will never wake up since local interrupts are disabled here.
So how can it pollute the kdump kernel?

> Suzuki added code to
> catch this happening with cpu hotplug (grep CPU_STUCK_IN_KERNEL in
> arm64/for-next/core), but that won't help here. If 'CPU_STUCK_IN_KERNEL' was set
> by a core, I don't think we can kexec/kdump for this reason.

I will need to look into Suzuki's code.

> Something like cpu_die() for spin-table is needed, naively I think it should
> turn the MMU off, and jump back into the secondary_holding_pen, but the core
> would still be stuck in the kernel, and the memory addresses associated with
> secondary_holding_pen can't be re-used. (which is fine for kdump, but not kexec)

Please note that the code is exercised only in kdump case through
machine_crash_shutdown().

Thanks,
-Takahiro AKASHI

>
> Thanks,
>
> James
>