Re: [PATCH v5 0/7] arm64: efi: leave MMU and caches on at boot

Will Deacon <will@xxxxxxxxxx> · Tue, 15 Nov 2022 11:31:35 +0000

On Tue, Nov 15, 2022 at 12:21:55PM +0100, Ard Biesheuvel wrote:
> On Tue, 15 Nov 2022 at 12:17, Will Deacon <will@xxxxxxxxxx> wrote:
> >
> > On Fri, Nov 11, 2022 at 05:36:19PM +0000, Mark Rutland wrote:
> > > On Tue, Nov 08, 2022 at 07:21:57PM +0100, Ard Biesheuvel wrote:
> > > > The purpose of this series is to remove any explicit cache maintenance
> > > > for coherency during early boot that becomes unnecessary if we simply
> > > > retain the cacheable 1:1 mapping of all of system RAM provided by EFI,
> > > > and use it to populate the ID map page tables. After setting up this
> > > > preliminary ID map, we disable the MMU, drop to EL1, reprogram the MAIR,
> > > > TCR and SCTLR registers as before, and proceed as usual, avoiding the
> > > > need for any manipulations of memory while the MMU and caches are off.
> > > >
> > > > The only properties of the firmware provided 1:1 map we rely on is that
> > > > it does not require any explicit cache maintenance for coherency, and
> > > > that it covers the entire memory footprint of the image, including the
> > > > BSS and padding at the end - all else is under control of the kernel
> > > > itself, as before.
> > >
> > > As a high-level thing, I'm still very much not keen on entering the kernel with
> > > the MMU on. Given that we have to support booting with the MMU off for !EFI
> > > boot (including kexec when EFI is in use), I think this makes it harder to
> > > reason about the boot code overall (e.g. due to the conditional maintenance
> > > added to head.S), and adds more scope for error, even if it simplifies the EFI
> > > stub itself.
> >
> > As discussed offline, two things that would help the current series are:
> >
> >   (1) Some performance numbers comparing MMU off vs MMU on boot
> >
> >   (2) Use of a separate entry point for the MMU on case, potentially failing
> >       the boot if the MMU is on and we're not using EFI
> >
> 
> Ack.
> 
> But thinking about (2) again, failing the boot is better done at a
> time when you can inform the user about it, no?
> 
> IOW, just going into a deadloop really early if you enter the bare
> metal entry point with the MMU on is going to be hard to distinguish
> from other issues, whereas panicking after the console up is more
> likely to help getting the actual issue diagnosed.

Agreed.

> So perhaps we should panic() instead of warn+taint when this condition
> occurs, and do it from an early initcall instead of from setup_arch().

To be honest, and I appreciate that this is unhelpful, but I'm fine with
the warn+taint and prefer that to a fatal stop.

Will