On Thu, 31 Mar 2022 at 17:37, Mark Rutland <mark.rutland@xxxxxxx> wrote:
>
> Hi Ard,
>

Hi Mark,

Thanks for taking a look.

> On Wed, Mar 30, 2022 at 05:41:47PM +0200, Ard Biesheuvel wrote:
> > This is a followup to a previous series of mine [0], and it aims to
> > streamline the boot flow with respect to cache maintenance and
> > redundant copying of data in memory.
> >
> > Combined with my proof-of-concept firmware for QEMU/arm64 [1], this
> > results in a boot where both the kernel and the initrd are loaded
> > straight to their final locations in memory, while the physical
> > placement of the kernel image is still randomized by the loader. It
> > also removes all memory accesses performed with the MMU and caches
> > off (except for instruction fetches) that are done from the moment
> > the VM comes out of reset.
> >
> > On the kernel side, this comes down to:
> > - increasing the ID map to cover the entire kernel image, so we can
> >   build the kernel page tables with the MMU and caches enabled;
> > - dealing with the MMU already being on at boot, and keeping it on
> >   while building the ID map;
> > - ensuring that all stores to memory that are now done with the MMU
> >   and caches on are not negated by the subsequent cache invalidation.
>
> This is on my queue to review in detail, but for now I have a couple
> of high-level thoughts:
>
> 1) I like the idea of deferring/staging some work until after the MMU
>    is on, and I'm in favour of doing so where we can do so in all
>    cases. If we end up with infrastructure to run some MMU-on TTBR0
>    stub environment(s), that could be useful elsewhere, e.g.
>    idmap_kpti_install_ng_mappings().
>

Yeah, good point. I was aware that there might be other code that we
would prefer to run in the same way.

> 2) I do not think that we should support entering the kernel with the
>    MMU on.
>
>    I think that consistently using the same MMU-off boot code has
>    saved us a great deal of pain thus far, and the more I think about
>    booting with the MMU on, I think it opens us up to a lot of
>    potential pain, both in the short term and longer term as the
>    architecture evolves. For example, as rhetoricals from the top of
>    my head:
>
>    * How do we safely inherit whatever VMSA state the loader has left
>      us with? e.g. what do we require w.r.t. TCRs, MAIRs? e.g. what to
>      do when the loader uses a different granule size from the kernel?
>
>    * What can we expect is mapped, and with which specific attributes
>      and permissions?
>
>    * What do we document here for loaders other than the EFI stub?
>      ... and what about kexec?
>

The only requirement is that the entire image is mapped writeback
cacheable, with the code region executable and the data region
writable. Beyond that, it doesn't really matter, not even whether we
boot at EL2 or EL1. The 1:1 mapping we inherit from the previous boot
stage is only used to create the ID map (and to set some global
variables); we never run our page tables under the old TCR/MAIR regime
or vice versa.

Whether or not we should relax the documented boot protocol as well is
a separate question. I wouldn't be opposed to doing that, if we
document the requirements, but it is not something I'm pursuing with
this series.
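Just to illustrate what I have in mind (a rough sketch typed into the
mail, not code from the series, and the label below is made up):
whether we were entered with the MMU on can simply be derived from
SCTLR_ELx at the exception level we were entered at, e.g. for EL1

	mrs	x0, sctlr_el1
	tbnz	x0, #0, 0f	// SCTLR_EL1.M set: MMU and caches are on
	/* MMU off: take the existing cold boot path */
	...
0:	/* MMU on: use only the inherited cacheable 1:1 mapping until
	 * the ID map has been created and installed */

(sctlr_el2 when entered at EL2). The MMU-on path never touches the
loader's TCR/MAIR configuration; it just uses the inherited mapping to
populate the ID map and set a few variables before switching over.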
>    ... and generally this is another complication for maintenance and
>    testing that I'd rather not open the door to.
>
>    In other words, my view is that we should *minimize* what we do
>    with the MMU off, but only where we can do that consistently, and
>    we should still consistently enter with the MMU off such that we
>    can consistently and safely initialize the VMSA state.
>

I see your point. I personally think this is manageable, but I'll let
the maintainers be the judge of that.

Thanks,
Ard.