Hi Qian Cai,

On Thu, Dec 13, 2018 at 10:53 AM Qian Cai <cai@xxxxxx> wrote:
>
> On this HPE Apollo 70 arm64 server with 256 CPUs, triggering a crash
> dump just hung. It has 4 threads on each core. Each 2-core share a same
> L1 and L2 caches, so that is 8 CPUs shares those. All CPUs share a same
> L3 cache.
>
> It turned out that this was due to the TLB contained stale entries (or
> uninitialized junk which just happened to look valid) from the first
> kernel before turning the MMU on in the second kernel which caused this
> instruction hung,
>
> msr     sctlr_el1, x0
>
> Signed-off-by: Qian Cai <cai@xxxxxx>
> ---
>  arch/arm64/kernel/head.S | 4 ++++
>  1 file changed, 4 insertions(+)
>
> diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S
> index 4471f570a295..5196f3d729de 100644
> --- a/arch/arm64/kernel/head.S
> +++ b/arch/arm64/kernel/head.S
> @@ -771,6 +771,10 @@ ENTRY(__enable_mmu)
>         msr     ttbr0_el1, x2                   // load TTBR0
>         msr     ttbr1_el1, x1                   // load TTBR1
>         isb
> +       dsb     nshst
> +       tlbi    vmalle1                         // invalidate TLB
> +       dsb     nsh
> +       isb

This will be executed for both the primary and the kdump kernel, right?
I don't think we really want to invalidate the TLB when booting the
primary kernel. It would be too slow, and considering that we need to
minimize boot times on embedded arm64 devices, I think it would not be
a good idea.

>         msr     sctlr_el1, x0
>         isb
>         /*
> --
> 2.17.2 (Apple Git-113)
>

Also, did you check the issue I reported on the HPE Apollo machines a
few days back with the kdump kernel boot
<https://www.spinics.net/lists/kexec/msg21750.html>?

Can you please confirm that you are not facing the same issue (as I
suspect from reading your earlier bug report) on the HPE Apollo
machine?

Also, by adding 'earlycon' to the bootargs passed to the kdump kernel,
you can check whether you are able to get at least some console output
from the kdump kernel.
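To illustrate the 'earlycon' suggestion, here is a minimal sketch of
appending it to a kdump command line only when it is not already
there. The add_earlycon helper, the example command line, and the paths
in the trailing comment are all hypothetical placeholders; adjust them
to however the crash kernel is loaded on your setup.

```shell
#!/bin/sh
# Hypothetical helper: append "earlycon" to a kernel command line
# unless a bare "earlycon" or an "earlycon=..." option is already present.
add_earlycon() {
    case " $1 " in
        *" earlycon "*|*" earlycon="*) printf '%s\n' "$1" ;;
        *) printf '%s\n' "$1 earlycon" ;;
    esac
}

# Example kdump command line (an assumption, not taken from the report).
KDUMP_CMDLINE=$(add_earlycon "root=/dev/sda2 ro irqpoll nr_cpus=1 reset_devices")
echo "$KDUMP_CMDLINE"

# The crash kernel could then be reloaded with something like
# (image and initrd paths are placeholders):
#   kexec -p /boot/vmlinuz-<version> \
#         --initrd=/boot/initramfs-<version>kdump.img \
#         --append="$KDUMP_CMDLINE"
```

If the kdump kernel prints anything on the console with this in place,
that tells us it gets at least as far as early console setup.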

Thanks,
Bhupesh