On Tue, Nov 15, 2022 at 12:21:55PM +0100, Ard Biesheuvel wrote:
> On Tue, 15 Nov 2022 at 12:17, Will Deacon <will@xxxxxxxxxx> wrote:
> >
> > On Fri, Nov 11, 2022 at 05:36:19PM +0000, Mark Rutland wrote:
> > > On Tue, Nov 08, 2022 at 07:21:57PM +0100, Ard Biesheuvel wrote:
> > > > The purpose of this series is to remove any explicit cache maintenance
> > > > for coherency during early boot that becomes unnecessary if we simply
> > > > retain the cacheable 1:1 mapping of all of system RAM provided by EFI,
> > > > and use it to populate the ID map page tables. After setting up this
> > > > preliminary ID map, we disable the MMU, drop to EL1, reprogram the MAIR,
> > > > TCR and SCTLR registers as before, and proceed as usual, avoiding the
> > > > need for any manipulations of memory while the MMU and caches are off.
> > > >
> > > > The only properties of the firmware provided 1:1 map we rely on is that
> > > > it does not require any explicit cache maintenance for coherency, and
> > > > that it covers the entire memory footprint of the image, including the
> > > > BSS and padding at the end - all else is under control of the kernel
> > > > itself, as before.
> > >
> > > As a high-level thing, I'm still very much not keen on entering the kernel with
> > > the MMU on. Given that we have to support booting with the MMU off for !EFI
> > > boot (including kexec when EFI is in use), I think this makes it harder to
> > > reason about the boot code overall (e.g. due to the conditional maintenance
> > > added to head.S), and adds more scope for error, even if it simplifies the EFI
> > > stub itself.
> >
> > As discussed offline, two things that would help the current series are:
> >
> >   (1) Some performance numbers comparing MMU off vs MMU on boot
> >
> >   (2) Use of a separate entry point for the MMU on case, potentially failing
> >       the boot if the MMU is on and we're not using EFI
> >
>
> Ack.
>
> But thinking about (2) again, failing the boot is better done at a
> time when you can inform the user about it, no?
>
> IOW, just going into a deadloop really early if you enter the bare
> metal entry point with the MMU on is going to be hard to distinguish
> from other issues, whereas panicking after the console up is more
> likely to help getting the actual issue diagnosed.

Agreed.

> So perhaps we should panic() instead of warn+taint when this condition
> occurs, and do it from an early initcall instead of from setup_arch().

To be honest, and I appreciate that this is unhelpful, but I'm fine with
the warn+taint and prefer that to a fatal stop.

Will