On Mon, 2024-11-25 at 10:29 +0000, David Woodhouse wrote:
> On Mon, 2024-11-25 at 09:54 +0000, David Woodhouse wrote:
> > From: David Woodhouse <dwmw@xxxxxxxxxxxx>
> >
> > The control_code_page should be explicitly mapped into the identity
> > mapped page tables for the relocate_kernel environment. This only seems
> > to have worked by luck before, because it tended to be within the same
> > 2MiB or 1GiB large page already mapped for another reason.
> >
> > A subsequent commit will reduce the control_code_page to a single 4KiB
> > page instead of a higher-order allocation, and seems to make it much
> > *less* likely that we get lucky with its placement. This leads to a
> > fault when relocate_kernel() first tries to access the page through its
> > identity-mapped virtual address.
>
> This one is confusing me. Jan points out that it shouldn't be needed,
> because the control page should come from kernel memory and thus should
> be mapped anyway because the loop immediately below my added code adds
> *all* of the pfn_mapped[] ranges.

I think we understand this one now; it's because of PTI.

So where the identmap code in e.g. ident_p4d_init() calls set_pte(),
set_pte() is actually trying to write *both* the kernel and userspace
copies of the page table, which it expects to be in adjacent pages.

But in this case it's just scribbling over the end of the single 4KiB
page that was allocated for it. This should suffice to mask the problem
(testing now) but obviously it isn't a great solution:

--- a/arch/x86/kernel/machine_kexec_64.c
+++ b/arch/x86/kernel/machine_kexec_64.c
@@ -213,7 +213,7 @@ static void *alloc_pgt_page(void *data)
 	struct page *page;
 	void *p = NULL;
 
-	page = kimage_alloc_control_pages(image, 0);
+	page = kimage_alloc_control_pages(image, 1);
 	if (page) {
 		p = page_address(page);
 		clear_page(p);
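
For anyone following along, here is a rough userspace sketch of the
paired-page write described above. It is only a model, not the kernel
code: every sketch_* name is hypothetical, and it assumes the table base
is aligned to 8KiB so that the second copy sits at the same slot in the
very next page. It shows why an order-1 allocation absorbs the second
write while an order-0 one cannot.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical constants for the model; they mirror 4KiB pages holding
 * 512 eight-byte entries, but are not taken from kernel headers. */
#define SKETCH_PAGE_SHIFT 12
#define SKETCH_PAGE_SIZE  ((uintptr_t)1 << SKETCH_PAGE_SHIFT)
#define SKETCH_ENTRIES    512

typedef uint64_t sketch_entry_t;

/* Assumption: the second copy of a slot lives in the adjacent page,
 * found by setting bit SKETCH_PAGE_SHIFT in the slot's address. */
static sketch_entry_t *sketch_user_copy(sketch_entry_t *kernel_slot)
{
	return (sketch_entry_t *)((uintptr_t)kernel_slot | SKETCH_PAGE_SIZE);
}

/* Write both copies. The caller must own two adjacent pages. */
static void sketch_set_entry(sketch_entry_t *kernel_slot, sketch_entry_t val)
{
	*kernel_slot = val;
	*sketch_user_copy(kernel_slot) = val;
}

/* Byte offset of the second write, measured from an 8KiB-aligned base. */
static size_t sketch_user_write_offset(size_t index)
{
	return SKETCH_PAGE_SIZE + index * sizeof(sketch_entry_t);
}

/* Does the second write stay inside an allocation of 2^order pages? */
static int sketch_write_fits(size_t index, unsigned int order)
{
	size_t alloc_bytes = SKETCH_PAGE_SIZE << order;

	return sketch_user_write_offset(index) + sizeof(sketch_entry_t)
		<= alloc_bytes;
}

/* Returns 1 if setting slot `index` (must be < SKETCH_ENTRIES) also
 * updates the same slot in the second page, 0 if not, -1 on allocation
 * failure. */
static int sketch_demo(size_t index)
{
	size_t bytes = 2 * SKETCH_PAGE_SIZE;
	sketch_entry_t *table = aligned_alloc(bytes, bytes);
	int ok;

	if (!table)
		return -1;
	memset(table, 0, bytes);
	sketch_set_entry(&table[index], 0x1063);
	ok = table[index] == 0x1063 &&
	     table[SKETCH_ENTRIES + index] == 0x1063;
	free(table);
	return ok;
}
```

In this model sketch_write_fits(index, 0) is false for every slot, which
is the "scribbling over the end of the page" case, and
sketch_write_fits(index, 1) is true for every slot below SKETCH_ENTRIES,
which is what the order bump in the diff buys.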