While making changes to the EFI stub startup code, I noticed that we are
still doing set/way maintenance on the caches when booting on v7 cores.
This works today on VMs by virtue of the fact that KVM traps set/way ops
and cleans the whole address space by VA on behalf of the guest, and on
most v7 hardware, the set/way ops are in fact sufficient when only one
core is running, as there usually is no system cache. But on systems
like SynQuacer, for which 32-bit firmware is available, the current
cache maintenance only pushes the data out to the L3 system cache, where
it is not visible to the CPU once it turns the MMU and caches off.

So instead, switch to the by-VA cache maintenance that the architecture
requires for v7 and later (and ARM1176, as a side effect).

Changes since v2:
- add a patch to factor out the code sequence that obtains the inflated
  image size by doing an unaligned LE32 load from the end of the
  compressed data
- use new macro to load the inflated image size instead of doing a
  potentially unaligned load
- omit the stack for getting the base and size of the self-relocated
  zImage

Changes since v1:
- include the EFI patch that was sent out separately before (#1)
- split the preparatory work to pass the region to clean in r0/r1 into
  an EFI-specific one and one for the decompressor; this way, the first
  two patches can go on a stable branch that is shared between the ARM
  tree and the EFI tree
- document the meaning of the values in r0/r1 upon entry to
  cache_clean_flush
- take care to treat the region end address as exclusive
- switch to clean+invalidate to align with the other implementations
- drop some code that manages the stack pointer value before calling
  cache_clean_flush(), which is no longer necessary
- take care to clean the entire region that is covered by the relocated
  zImage if it needs to relocate itself before decompressing

https://git.kernel.org/pub/scm/linux/kernel/git/ardb/linux.git/log/?h=arm32-efi-cache-ops

[ Several people asked me offline why on earth I am running SynQuacer on
  32 bit: the answer is that this is simply to prove that it is
  currently broken, and this implies that for 32-bit VMs running under
  KVM, we are relying on the special, non-architectural cache management
  done by the hypervisor on behalf of the guest to be able to run this
  code. ]

Cc: Russell King <linux@xxxxxxxxxxxxxxx>
Cc: Marc Zyngier <maz@xxxxxxxxxx>
Cc: Nicolas Pitre <nico@xxxxxxxxxxx>
Cc: Catalin Marinas <catalin.marinas@xxxxxxx>
Cc: Tony Lindgren <tony@xxxxxxxxxxx>
Cc: Linus Walleij <linus.walleij@xxxxxxxxxx>

Ard Biesheuvel (5):
  efi/arm: Work around missing cache maintenance in decompressor handover
  efi/arm: Pass start and end addresses to cache_clean_flush()
  ARM: decompressor: factor out routine to obtain the inflated image size
  ARM: decompressor: prepare cache_clean_flush for doing by-VA maintenance
  ARM: decompressor: switch to by-VA cache maintenance for v7 cores

 arch/arm/boot/compressed/head.S | 166 +++++++++++---------
 1 file changed, 91 insertions(+), 75 deletions(-)

-- 
2.17.1