Hi Akashi, On 24/01/17 08:49, AKASHI Takahiro wrote: > To protect the memory reserved for crash dump kernel once after loaded, > arch_kexec_protect_crashres/unprotect_crashres() are meant to deal with > permissions of the corresponding kernel mappings. > > We also have to > - put the region in an isolated mapping, and > - move copying kexec's control_code_page to machine_kexec_prepare() > so that the region will be completely read-only after loading. > Note that the region must reside in linear mapping and have corresponding > page structures in order to be potentially freed by shrinking it through > /sys/kernel/kexec_crash_size. Nasty! Presumably you have to build the crash region out of individual page mappings so that they can be returned to the slab-allocator one page at a time, and still be able to set/clear the valid bits on the remaining chunk. (I don't see how that happens in this patch) debug_pagealloc has to do this too so it can flip the valid bits one page at a time. You could change the debug_pagealloc_enabled() value passed in at the top __create_pgd_mapping() level to be a needs_per_page_mapping(addr, size) test that happens as we build the linear map. (This would save the 3 extra calls to __create_pgd_mapping() in __map_memblock()) I'm glad to see you can't resize the region if a crash kernel is loaded! This secretly-unmapped is the sort of thing that breaks hibernate, it blindly assumes pfn_valid() means it can access the page if it wants to. Setting PG_Reserved is a quick way to trick it out of doing this, but that would leave the crash kernel region un-initialised after resume, while kexec_crash_image still has a value. I think the best fix for this is to forbid hibernate if kexec_crash_loaded() arguing these are mutually-exclusive features, and the protect crash-dump feature exists to prevent things like hibernate corrupting the crash region. > diff --git a/arch/arm64/kernel/machine_kexec.c b/arch/arm64/kernel/machine_kexec.c > index bc96c8a7fc79..f7938fecf3ff 100644 > --- a/arch/arm64/kernel/machine_kexec.c > +++ b/arch/arm64/kernel/machine_kexec.c > @@ -159,32 +171,20 @@ void machine_kexec(struct kimage *kimage) > kimage->control_code_page); > pr_debug("%s:%d: reboot_code_buffer_phys: %pa\n", __func__, __LINE__, > &reboot_code_buffer_phys); > - pr_debug("%s:%d: reboot_code_buffer: %p\n", __func__, __LINE__, > - reboot_code_buffer); > pr_debug("%s:%d: relocate_new_kernel: %p\n", __func__, __LINE__, > arm64_relocate_new_kernel); > pr_debug("%s:%d: relocate_new_kernel_size: 0x%lx(%lu) bytes\n", > __func__, __LINE__, arm64_relocate_new_kernel_size, > arm64_relocate_new_kernel_size); > > - /* > - * Copy arm64_relocate_new_kernel to the reboot_code_buffer for use > - * after the kernel is shut down. > - */ > - memcpy(reboot_code_buffer, arm64_relocate_new_kernel, > - arm64_relocate_new_kernel_size); > - > - /* Flush the reboot_code_buffer in preparation for its execution. */ > - __flush_dcache_area(reboot_code_buffer, arm64_relocate_new_kernel_size); > - flush_icache_range((uintptr_t)reboot_code_buffer, > - arm64_relocate_new_kernel_size); > - /* Flush the kimage list and its buffers. */ > - kexec_list_flush(kimage); > + if (kimage != kexec_crash_image) { > + /* Flush the kimage list and its buffers. */ > + kexec_list_flush(kimage); > > - /* Flush the new image if already in place. */ > - if (kimage->head & IND_DONE) > - kexec_segment_flush(kimage); > + /* Flush the new image if already in place. */ > + if (kimage->head & IND_DONE) > + kexec_segment_flush(kimage); > + } So for kdump we cleaned the kimage->segment[i].mem regions in arch_kexec_protect_crashkres(), so don't need to do it here. What about the kimage->head[i] array of list entries that were cleaned by kexec_list_flush()? Now we don't clean that for kdump either, but we do pass it arm64_relocate_new_kernel() at the end of this function: > cpu_soft_restart(1, reboot_code_buffer_phys, kimage->head, kimage_start, 0); Can we test the IND_DONE_BIT of kimage->head, so that we know that arm64_relocate_new_kernel() won't try to walk the unclean list? Alternatively we could call kexec_list_flush() in arch_kexec_protect_crashkres() too. Thanks, James