Hi, On Thu, Feb 02, 2017 at 10:45:58AM +0000, James Morse wrote: > On 01/02/17 18:25, Mark Rutland wrote: > > On Wed, Feb 01, 2017 at 06:00:08PM +0000, Mark Rutland wrote: > >> On Wed, Feb 01, 2017 at 09:46:24PM +0900, AKASHI Takahiro wrote: > >>> arch_kexec_protect_crashkres() and arch_kexec_unprotect_crashkres() > >>> are meant to be called around kexec_load() in order to protect > >>> the memory allocated for crash dump kernel once after it's loaded. > >>> > >>> The protection is implemented here by unmapping the region rather than > >>> making it read-only. > >>> To make the things work correctly, we also have to > >>> - put the region in an isolated, page-level mapping initially, and > >>> - move copying kexec's control_code_page to machine_kexec_prepare() > >>> > >>> Note that page-level mapping is also required to allow for shrinking > >>> the size of memory, through /sys/kernel/kexec_crash_size, by any number > >>> of multiple pages. > > >> Looking at kexec_crash_size_store(), I don't see where memory returned > >> to the OS is mapped. AFAICT, if the region is protected when the user > >> shrinks the region, the memory will not be mapped, yet handed over to > >> the kernel for general allocation. > > kernel/kexec_core.c:crash_shrink_memory() will bailout: > > if (kexec_crash_image) { > > ret = -ENOENT; > > goto unlock; > > } > > So it should only be possible to return memory to the allocators when there is > no crash image loaded, so the area is mapped. Ah. That being the case, there should be no problem. Any part given back to the linear map will be at page granularity, but that won't result in any functional problem. > Given the core code can unload a crash image, this hunk has me worried: > +void arch_kexec_unprotect_crashkres(void) > +{ > + /* > + * We don't have to make page-level mappings here because > + * the crash dump kernel memory is not allowed to be shrunk > + * once the kernel is loaded. > + */ > + create_pgd_mapping(&init_mm, crashk_res.start, > + __phys_to_virt(crashk_res.start), > + resource_size(&crashk_res), PAGE_KERNEL, > + debug_pagealloc_enabled()); > > > I don't think this is true if the order is: load -> protect, unload -> > unprotect, shrink. The shrink will happen with potentially non-page-size mappings. Let's consistently use page mappings for the crashkernel region. Thanks, Mark.