On 10/08/2016:11:48:27 PM, Pratyush Anand wrote:
> On 10/08/2016:05:38:05 PM, James Morse wrote:
> > =========================%<=========================
> > diff --git a/arch/arm64/kernel/crash_dump.c b/arch/arm64/kernel/crash_dump.c
> > index 2dc54d129be1..784d4c30b534 100644
> > --- a/arch/arm64/kernel/crash_dump.c
> > +++ b/arch/arm64/kernel/crash_dump.c
> > @@ -37,6 +37,11 @@ ssize_t copy_oldmem_page(unsigned long pfn, char *buf,
> >         if (!csize)
> >                 return 0;
> >
> > +       if (memblock_is_memory(pfn << PAGE_SHIFT) &&
> > +           !memblock_is_map_memory(pfn << PAGE_SHIFT))
> > +               /* skip this nomap memory region, reserved by firmware */
> > +               return 0;

Should this return 0 or -EINVAL? Its caller does not properly handle a 0
return value (when csize is non-zero). So either we need to return -EINVAL, or
we need to fix its caller so that pread() would know that the required amount
of data was not read.

> > +
> >         vaddr = ioremap_cache(__pfn_to_phys(pfn), PAGE_SIZE);
> >         if (!vaddr)
> >                 return -ENOMEM;
> > =========================%<=========================
>
> In any case kernel must not panic, so I think we must have above hunk. However,
> we also need to look into kexec-tools that why it is asking kernel to copy those
> unneeded chunks.
>
> I will test tomorrow with above hunk.

After that hunk it did not crash, but vmcore-dmesg fails with the following
message:
"No program header covering vaddr 0x401ff0found kexec bug?"

This happens because vmcore-dmesg is sending the wrong offset to pread(). So it
did not crash after the above kernel hunk, but it still read a garbage (wrong)
log_buf virtual address pointer. vmcore-dmesg is sending the wrong offset
because the page_offset(vp_offset) calculation is not right for my case, as
explained here [1].

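On the return-value question above, here is a rough userspace sketch of what a
caller-side check could look like, assuming the kernel keeps returning 0 for
skipped pages. pread_exact() and read_at_path() are hypothetical names used only
for illustration; they are not kexec-tools or makedumpfile functions. The idea
is simply to treat a zero-byte pread() as an error instead of trusting the
buffer.

```c
#define _XOPEN_SOURCE 700	/* for pread() under strict -std= modes */
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: read exactly len bytes at offset, or fail. */
static int pread_exact(int fd, void *buf, size_t len, off_t offset)
{
	char *p = buf;

	while (len > 0) {
		ssize_t n = pread(fd, p, len, offset);

		if (n < 0) {
			if (errno == EINTR)
				continue;	/* interrupted, retry */
			return -1;		/* errno set by pread() */
		}
		if (n == 0) {
			/* nothing more was copied: report it, do not use buf */
			errno = EIO;
			return -1;
		}
		p += n;
		len -= (size_t)n;
		offset += n;
	}
	return 0;
}

/* Hypothetical wrapper: open path, read len bytes at offset, close. */
static int read_at_path(const char *path, void *buf, size_t len, off_t offset)
{
	int fd = open(path, O_RDONLY);
	int ret, saved;

	if (fd < 0)
		return -1;
	ret = pread_exact(fd, buf, len, offset);
	saved = errno;		/* close() may clobber errno */
	close(fd);
	errno = saved;
	return ret;
}
```

With something like this, a read that lands in a skipped nomap region would
surface as -1/EIO in the tool rather than as a garbage pointer later on.
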
So, if I correct page_offset(vp_offset) (as arm64_mem.page_offset = ehdr.e_entry
- "kernel Code Start PA" + phys_offset), then vmcore-dmesg and the vmcore copy
work fine. However, if I use makedumpfile to copy (compressed) data from
/proc/vmcore, it still generates a "synchronous external abort". I think it is
generated because makedumpfile finds garbage data in the EFI memory region.

My /proc/iomem shows the following:

8000000000-8001e7ffff : System RAM
8001e80000-83ff17ffff : System RAM
  8002080000-8002b3ffff : Kernel code
  8002c40000-800348ffff : Kernel data
  807fe00000-80ffdfffff : Crash kernel
83ff180000-83ff1cffff : System RAM
83ff1d0000-83ff21ffff : System RAM
83ff220000-83ffe4ffff : System RAM
83ffe50000-83ffffffff : System RAM

If I clip all the regions before "Kernel code" and provide that clipped input
to kexec-tools, then everything works fine.

~Pratyush

[1] http://lists.infradead.org/pipermail/kexec/2016-August/016834.html
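
To make the page_offset correction above concrete, here is a minimal sketch of
the arithmetic. The function name arm64_page_offset() and its parameter names
are made up for illustration; only the formula comes from the mail.

```c
#include <stdint.h>

/*
 * page_offset as described in the mail:
 *   page_offset = e_entry - kernel_code_start_pa + phys_offset
 * All values are unsigned 64-bit, so the intermediate subtraction is
 * well defined even when it wraps.
 */
static uint64_t arm64_page_offset(uint64_t e_entry,
				  uint64_t kernel_code_start_pa,
				  uint64_t phys_offset)
{
	return e_entry - kernel_code_start_pa + phys_offset;
}
```

As a worked example under assumed inputs: take the "Kernel code" start
0x8002080000 from the /proc/iomem listing, assume phys_offset is the start of
the first System RAM range (0x8000000000), and assume a hypothetical e_entry of
0xfffffe0002080000 (this value is not from my system, it is only for the
arithmetic). The result is 0xfffffe0000000000.
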