On Sat, 26 Nov 2022 at 15:16, Ard Biesheuvel <ardb@xxxxxxxxxx> wrote:
>
> On Tue, 15 Nov 2022 at 12:31, Will Deacon <will@xxxxxxxxxx> wrote:
> >
> > On Tue, Nov 15, 2022 at 12:21:55PM +0100, Ard Biesheuvel wrote:
> > > On Tue, 15 Nov 2022 at 12:17, Will Deacon <will@xxxxxxxxxx> wrote:
> > > >
> > > > On Fri, Nov 11, 2022 at 05:36:19PM +0000, Mark Rutland wrote:
> > > > > On Tue, Nov 08, 2022 at 07:21:57PM +0100, Ard Biesheuvel wrote:
> > > > > > The purpose of this series is to remove any explicit cache maintenance
> > > > > > for coherency during early boot that becomes unnecessary if we simply
> > > > > > retain the cacheable 1:1 mapping of all of system RAM provided by EFI,
> > > > > > and use it to populate the ID map page tables. After setting up this
> > > > > > preliminary ID map, we disable the MMU, drop to EL1, reprogram the MAIR,
> > > > > > TCR and SCTLR registers as before, and proceed as usual, avoiding the
> > > > > > need for any manipulations of memory while the MMU and caches are off.
> > > > > >
> > > > > > The only properties of the firmware provided 1:1 map we rely on is that
> > > > > > it does not require any explicit cache maintenance for coherency, and
> > > > > > that it covers the entire memory footprint of the image, including the
> > > > > > BSS and padding at the end - all else is under control of the kernel
> > > > > > itself, as before.
> > > > >
> > > > > As a high-level thing, I'm still very much not keen on entering the kernel with
> > > > > the MMU on. Given that we have to support booting with the MMU off for !EFI
> > > > > boot (including kexec when EFI is in use), I think this makes it harder to
> > > > > reason about the boot code overall (e.g. due to the conditional maintenance
> > > > > added to head.S), and adds more scope for error, even if it simplifies the EFI
> > > > > stub itself.
> > > >
> > > > As discussed offline, two things that would help the current series are:
> > > >
> > > > (1) Some performance numbers comparing MMU off vs MMU on boot
> > > >
>
> Finally got around to measuring this - I lost access to my TX2 machine
> for a couple of days during the past week,
>
> With the patch below applied to mainline, I measure ~6 ms spent
> cleaning the entire image to the PoC (which is the bulk of it) and
> subsequently populating the initial ID map and activating it.
>
> This drops to about 0.6 ms with my changes applied. This is unlikely
> to ever matter in practice, perhaps, but I will note that booting a VM
> in EFI mode using Tianocore/EDK2 from the point where KVM clears the
> counter to the point where we start user space can be done (on the
> same machine) in 500-700 ms so it is not entirely insignificant
> either.
>
> I could try and measure it on bare metal as well, but I suppose that
> launch times are even less relevant there so I didn't bother.
diff --git a/arch/arm64/kernel/efi-entry.S b/arch/arm64/kernel/efi-entry.S
index 61a87fa1c3055e26..27f59784a1c0be2c 100644
--- a/arch/arm64/kernel/efi-entry.S
+++ b/arch/arm64/kernel/efi-entry.S
@@ -22,6 +22,7 @@ SYM_CODE_START(efi_enter_kernel)
 	ldr	w2, =primary_entry_offset
 	add	x19, x0, x2		// relocated Image entrypoint
 	mov	x20, x1			// DTB address
+	mrs	x27, cntvct_el0

 	/*
 	 * Clean the copied Image to the PoC, and ensure it is not shadowed by
diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S
index 2196aad7b55bcef0..068a7d111836382b 100644
--- a/arch/arm64/kernel/head.S
+++ b/arch/arm64/kernel/head.S
@@ -430,6 +430,8 @@ SYM_FUNC_START_LOCAL(__primary_switched)

 	str_l	x21, __fdt_pointer, x5		// Save FDT pointer

+	str_l	x27, boot_args + 8, x5
+
 	ldr_l	x4, kimage_vaddr		// Save the offset between
 	sub	x4, x4, x0			// the kernel virtual and
 	str_l	x4, kimage_voffset, x5		// physical mappings
@@ -797,6 +799,10 @@ SYM_FUNC_START_LOCAL(__primary_switch)
 	adrp	x1, reserved_pg_dir
 	adrp	x2, init_idmap_pg_dir
 	bl	__enable_mmu
+
+	mrs	x0, cntvct_el0
+	sub	x27, x0, x27
+
 #ifdef CONFIG_RELOCATABLE
 	adrp	x23, KERNEL_START
 	and	x23, x23, MIN_KIMG_ALIGN - 1
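
As a footnote on the numbers above: the value the instrumentation leaves in
boot_args[1] is a raw delta of the generic timer (ticks between the stamp taken
in efi_enter_kernel and the read just after __enable_mmu), so turning it into
wall-clock time means dividing by the counter frequency from CNTFRQ_EL0. The
sketch below only illustrates that conversion and is not part of the patch
above: report_early_map_time() and the point where it would be called are made
up for illustration, whereas boot_args[] (arch/arm64/kernel/setup.c),
read_sysreg() and pr_info() are existing kernel symbols.

    /*
     * Hypothetical helper, for illustration only: convert the counter delta
     * stashed in boot_args[1] by the instrumentation patch above (EFI
     * handover to just after __enable_mmu) into microseconds, using the
     * architected counter frequency from CNTFRQ_EL0.
     */
    #include <linux/init.h>
    #include <linux/printk.h>
    #include <linux/time64.h>
    #include <linux/types.h>
    #include <asm/sysreg.h>

    extern u64 boot_args[4];	/* defined in arch/arm64/kernel/setup.c */

    static void __init report_early_map_time(void)
    {
    	u64 freq = read_sysreg(cntfrq_el0);	/* counter ticks per second */
    	u64 ticks = boot_args[1];
    	u64 usecs = ticks * USEC_PER_SEC / freq;

    	pr_info("EFI handover to __enable_mmu: %llu us (%llu ticks)\n",
    		usecs, ticks);
    }

Dividing by CNTFRQ_EL0 at runtime rather than hard-coding a tick rate keeps the
number comparable across machines and VMs, since the counter frequency differs
between them.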