On 05.06.20 14:39, Ard Biesheuvel wrote: > On Fri, 5 Jun 2020 at 14:20, Heinrich Schuchardt <xypron.glpk@xxxxxx> wrote: >> >> On 05.06.20 13:52, Ard Biesheuvel wrote: >>> EFI on ARM only supports short descriptors, and given that it mandates >>> that the MMU and caches are on, it is implied that booting in HYP mode >>> is not supported. >>> >>> However, implementations of EFI exist (i.e., U-Boot) that ignore this >>> requirement, which is not entirely unreasonable, given that it does >>> not make a lot of sense to begin with. >> >> Hello Ard, >> >> thanks for investigating the differences between EDK2 and U-Boot. >> >> What I still do not understand is if there is a bug in U-Boot where >> U-Boot does not conform to the UEFI specification? In this case I would >> expect a fix in U-Boot. >> > > U-Boot violates the EFI spec, yes. The spec is very clear on how the > MMU is configured, and this rules out booting with the caches off, or > booting in HYP mode. > > However, given that this is the situation today, we still need to deal > with it on the Linux side. > In parallel, fixing it in U-boot may be appropriate. However, I think > the EFI spec is too detailed here - instead of 'booting at the highest > non-secure privilege mode', it tells you which exact bits to set in > SCTLR, which seems overzealous to me. In other words, even though it > violates the letter of the spec, I don't mind dealing with this > exception in the Linux side, since the requirement is somewhat > unreasonable to begin with. > >> Or is it simply a deficiency of Linux that it does not properly support >> HYP/EL2 mode on 32-bit ARM? >> > > No, this is definitely not the fault of Linux. > >> Up to now I never experience a problem booting a 32bit board via U-Boot >> -> GRUB-EFI -> Linux. Where did you have a problem when booting via >> U-Boot's UEFI implementation and the current Linux kernel? >> > > I haven't managed to make it fail, but my only 32-bit HYP capable > platform is QEMU. I am not 100% convinced that the EFI+HYP+U-Boot case > is rock solid, and I may have gotten lucky with QEMU (which uses > emulation not virtualization when running at HYP) > > Do you have any A7/A15 based boards that don't print 'WARNING: Caches > not enabled' at boot? Hello Ard, I have no board that prints this. Where did you actually see this output? The string "Caches not enabled" does not exist in Linux next-20200505 according to "git grep -n 'ache not enabled'". Here is some sample output for an Orange Pi PC with a quad core Allwinner H3 Soc. "ARMv7 Processor rev 5 (v7l)" according to /proc/cpuinfo, compatible to "arm,cortex-a7" according to the device tree. $ uname -m Linux orangepi-pc 5.6.0-2-armmp-lpae #1 SMP Debian 5.6.14-1 (2020-05-23) armv7l GNU/Linux *Booted via GRUB EFI* load mmc 0:2 $fdt_addr_r /dtb # dtb -> dtbs/5.6.0-2-armmp-lpae/./sun8i-h3-orangepi-pc.dtb load mmc 0:1 $kernel_addr_r /EFI/debian/grubarm.efi bootefi $kernel_addr_r $fdt_addr_r EFI stub: Booting Linux Kernel... EFI stub: ERROR: Could not determine UEFI Secure Boot status. EFI stub: Using DTB from configuration table EFI stub: Exiting boot services and installing virtual address map... EFI stub: ERROR: Could not determine UEFI Secure Boot status $ sudo dmesg -H | grep ache [ +0.000000] CPU: PIPT / VIPT nonaliasing data cache, VIPT aliasing instruction cache [ +0.000000] Memory policy: Data cache writealloc [ +0.000000] Dentry cache hash table entries: 131072 (order: 7, 524288 bytes, linear) [ +0.000000] Inode-cache hash table entries: 65536 (order: 6, 262144 bytes, linear) [ +0.000000] random: get_random_u32 called from __kmem_cache_create+0x48/0x518 with crng_init=0 [ +0.000111] Mount-cache hash table entries: 2048 (order: 1, 8192 bytes, linear) [ +0.000031] Mountpoint-cache hash table entries: 2048 (order: 1, 8192 bytes, linear) [ +0.000126] VFS: Dquot-cache hash table entries: 1024 (order 0, 4096 bytes) [ +0.027188] systemd[1]: Reached target Local Encrypted Volumes. [ +0.023909] systemd[1]: Reached target Paths. [ +0.019948] systemd[1]: Reached target Remote File Systems. [ +0.019927] systemd[1]: Reached target Slices. [ +0.020008] systemd[1]: Reached target Swap. [ +0.005853] lima 1c40000.gpu: l2 cache 64K, 4-way, 64byte cache line, 64bit external bus *Booted via Linux EFI stub* setenv bootargs root=UUID=... ro initrd=initrd.img-5.6.0-2-armmp-lpae load mmc 0:2 $fdt_addr_r dtb # dtb -> dtbs/5.6.0-2-armmp-lpae/./sun8i-h3-orangepi-pc.dtb load mmc 0:2 $kernel_addr_r vmlinuz-5.6.0-2-armmp-lpae bootefi $kernel_addr_r $fdt_addr_r EFI stub: Booting Linux Kernel... EFI stub: ERROR: Could not determine UEFI Secure Boot status. EFI stub: Using DTB from configuration table EFI stub: Exiting boot services and installing virtual address map... EFI stub: ERROR: Could not determine UEFI Secure Boot status. $ sudo dmesg -H | grep ache [ +0.000000] CPU: PIPT / VIPT nonaliasing data cache, VIPT aliasing instruction cache [ +0.000000] Memory policy: Data cache writealloc [ +0.000000] Dentry cache hash table entries: 131072 (order: 7, 524288 bytes, linear) [ +0.000000] Inode-cache hash table entries: 65536 (order: 6, 262144 bytes, linear) [ +0.000000] random: get_random_u32 called from __kmem_cache_create+0x48/0x518 with crng_init=0 [ +0.000114] Mount-cache hash table entries: 2048 (order: 1, 8192 bytes, linear) [ +0.000033] Mountpoint-cache hash table entries: 2048 (order: 1, 8192 bytes, linear) [ +0.000128] VFS: Dquot-cache hash table entries: 1024 (order 0, 4096 bytes) [ +0.027122] systemd[1]: Reached target Local Encrypted Volumes. [ +0.023916] systemd[1]: Reached target Paths. [ +0.019869] systemd[1]: Reached target Remote File Systems. [ +0.019974] systemd[1]: Reached target Slices. [ +0.020017] systemd[1]: Reached target Swap. [ +0.017564] lima 1c40000.gpu: l2 cache 64K, 4-way, 64byte cache line, 64bit external bus Best regards Heinrich > >> Does this patch relate to the retirement of KVM on 32 ARM in Linux 5.7 >> 541ad0150ca4 ("arm: Remove 32bit KVM host support")? Without that patch >> I would have expected Linux to work fine in HYP mode. >> > > Not really. The code still has to deal with booting at HYP mode, > regardless of whether it is used in a useful way. > > I suppose simply dropping to SVC in the decompressor might make sense > as well, given that booting the kernel at HYP doesn't buy you anything > anymore in the first place. Marc may have some thoughts on that, but > it is really a separate discussion. > > >> >>> >>> So let's make sure that we can deal with this condition gracefully. >>> We already tolerate booting the EFI stub with the caches off (even >>> though this violates the EFI spec as well), and so we should deal >>> with HYP mode boot with MMU and caches either on or off. >>> >>> - When the MMU and caches are on, we can ignore the HYP stub altogether, >>> since we can just use the existing mappings, and just branch into >>> the decompressed kernel as usual after disabling MMU and caches. >>> >>> - When the MMU and caches are off, we have to drop to SVC mode so that >>> we can set up the page tables using short descriptors. In this case, >>> we need to install the HYP stub so that we can return to HYP mode >>> when handing over to the kernel proper. >>> >>> Signed-off-by: Ard Biesheuvel <ardb@xxxxxxxxxx> >>> --- >>> arch/arm/boot/compressed/head.S | 29 +++++++++++++++++++++++++++++ >>> 1 file changed, 29 insertions(+) >>> >>> diff --git a/arch/arm/boot/compressed/head.S b/arch/arm/boot/compressed/head.S >>> index c79db44ba128..986db86ba334 100644 >>> --- a/arch/arm/boot/compressed/head.S >>> +++ b/arch/arm/boot/compressed/head.S >>> @@ -1436,6 +1436,35 @@ ENTRY(efi_enter_kernel) >>> mrc p15, 0, r0, c1, c0, 0 @ read SCTLR >>> tst r0, #0x1 @ MMU enabled? >>> orreq r4, r4, #1 @ set LSB if not >>> +#ifdef CONFIG_ARM_VIRT_EXT >>> + @ >>> + @ The EFI spec does not support booting on ARM in HYP mode, >>> + @ since it mandates that the MMU and caches are on, with all >>> + @ 32-bit addressable DRAM mapped 1:1 using short descriptors. >>> + @ While the EDK2 reference implementation adheres to this, >>> + @ U-Boot might decide to enter the EFI stub in HYP mode anyway, >>> + @ with the MMU and caches either on or off. >>> + @ In the former case, we're better off just carrying on using >>> + @ the cached 1:1 mapping that the firmware provided, and pretend >>> + @ that we are in SVC mode as far as the decompressor is >>> + @ concerned. However, if the caches are off, we need to drop >>> + @ into SVC mode now, and let the decompressor set up its cached >>> + @ 1:1 mapping as usual. >>> + @ >>> + mov r0, #SVC_MODE >>> + msr spsr_cxsf, r0 @ record that we are in SVC mode >>> + bne 1f @ skip HYP stub install if MMU is on >>> + >>> + mov r9, r4 @ preserve image base >>> + bl __hyp_stub_install @ returns boot mode in r4 >>> + cmp r4, #HYP_MODE @ did we boot in HYP? >>> + bne 1f @ skip drop to SVC if we did not >>> + >>> + safe_svcmode_maskall r0 >>> + msr spsr_cxsf, r4 @ record boot mode >>> + mov r4, r9 @ restore image base >>> +1: >>> +#endif >>> >>> mov r0, r8 @ DT start >>> add r1, r8, r2 @ DT end >>> >>