Hi Lianbo, On 10/07/19 at 03:08pm, Lianbo Jiang wrote: > Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=204793 > > Kdump kernel will reuse the first 640k region because of some reasons, > for example: the trampline and conventional PC system BIOS region may > require to allocate memory in this area. Obviously, kdump kernel will > also overwrite the first 640k region, therefore, kernel has to copy > the contents of the first 640k area to a backup area, which is done in > purgatory(), because vmcore may need the old memory. When vmcore is > dumped, kdump kernel will read the old memory from the backup area of > the first 640k area. > > Basically, the main reason should be clear, kernel does not correctly > handle the first 640k region when SME is active, which causes that > kernel does not properly copy these old memory to the backup area in > purgatory(). Therefore, kdump kernel reads out the incorrect contents > from the backup area when dumping vmcore. Finally, the phenomenon is > as follow: > > [root linux]$ crash vmlinux /var/crash/127.0.0.1-2019-09-19-08\:31\:27/vmcore > WARNING: kernel relocated [240MB]: patching 97110 gdb minimal_symbol values > > KERNEL: /var/crash/127.0.0.1-2019-09-19-08:31:27/vmlinux > DUMPFILE: /var/crash/127.0.0.1-2019-09-19-08:31:27/vmcore [PARTIAL DUMP] > CPUS: 128 > DATE: Thu Sep 19 08:31:18 2019 > UPTIME: 00:01:21 > LOAD AVERAGE: 0.16, 0.07, 0.02 > TASKS: 1343 > NODENAME: amd-ethanol > RELEASE: 5.3.0-rc7+ > VERSION: #4 SMP Thu Sep 19 08:14:00 EDT 2019 > MACHINE: x86_64 (2195 Mhz) > MEMORY: 127.9 GB > PANIC: "Kernel panic - not syncing: sysrq triggered crash" > PID: 9789 > COMMAND: "bash" > TASK: "ffff89711894ae80 [THREAD_INFO: ffff89711894ae80]" > CPU: 83 > STATE: TASK_RUNNING (PANIC) > > crash> kmem -s|grep -i invalid > kmem: dma-kmalloc-512: slab:ffffd77680001c00 invalid freepointer:a6086ac099f0c5a4 > kmem: dma-kmalloc-512: slab:ffffd77680001c00 invalid freepointer:a6086ac099f0c5a4 > crash> > > BTW: I also tried to fix the above problem in purgatory(), but there > are too many restricts in purgatory() context, for example: i can't > allocate new memory to create the identity mapping page table for SME > situation. > > Currently, there are two places where the first 640k area is needed, > the first one is in the find_trampoline_placement(), another one is > in the reserve_real_mode(), and their content doesn't matter. To avoid > the above error, lets occupy the remain memory of the first 640k region > (expect for the trampoline and real mode) so that the allocated memory > does not fall into the first 640k area when SME is active, which makes > us not to worry about whether kernel can correctly copy the contents of > the first 640k area to a backup region in the purgatory(). > > Signed-off-by: Lianbo Jiang <lijiang@xxxxxxxxxx> > --- > Changes since v1: > 1. Improve patch log > 2. Change the checking condition from sme_active() to sme_active() > && strstr(boot_command_line, "crashkernel=") > > arch/x86/kernel/setup.c | 3 +++ > 1 file changed, 3 insertions(+) > > diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c > index 77ea96b794bd..bdb1a02a84fd 100644 > --- a/arch/x86/kernel/setup.c > +++ b/arch/x86/kernel/setup.c > @@ -1148,6 +1148,9 @@ void __init setup_arch(char **cmdline_p) > > reserve_real_mode(); > > + if (sme_active() && strstr(boot_command_line, "crashkernel=")) > + memblock_reserve(0, 640*1024); > + Seems you missed the comment about "unconditionally do it", only check crashkernel param looks better. Also I noticed reserve_crashkernel is called after initmem_init, I'm not sure if memblock_reserve is good enough in early code before initmem_init. > trim_platform_memory_ranges(); > trim_low_memory_range(); > > -- > 2.17.1 > Thanks Dave _______________________________________________ kexec mailing list kexec@xxxxxxxxxxxxxxxxxxx http://lists.infradead.org/mailman/listinfo/kexec