On Sun, 20 Sep 2020 at 10:57, Russell King - ARM Linux admin
<linux@xxxxxxxxxxxxxxx> wrote:
>
> On Sun, Sep 20, 2020 at 09:50:30AM +0200, Ard Biesheuvel wrote:
> > On Sun, 20 Sep 2020 at 01:49, Nicolas Pitre <nico@xxxxxxxxxxx> wrote:
> > >
> > > On Fri, 18 Sep 2020, Ard Biesheuvel wrote:
> > >
> > > > This series is inspired by Zhei Len's series [0], which updates the
> > > > ARM p2v patching code to optionally support p2v relative alignments
> > > > of as little as 64 KiB.
> > > >
> > > > Reducing this alignment is necessary for some specific Huawei boards,
> > > > but given that reducing this minimum alignment will make the boot
> > > > sequence more robust for all platforms, especially EFI boot, which
> > > > no longer relies on the 128 MB masking of the decompressor load address,
> > > > but uses firmware memory allocation routines to find a suitable spot
> > > > for the decompressed kernel.
> > > >
> > > > This series is not based on Zhei Len's code, but addresses the same
> > > > problem, and takes some feedback given in the review into account:
> > > > - use of a MOVW instruction to avoid two adds/adcs sequences when dealing
> > > >   with the carry on LPAE
> > > > - add support for Thumb2 kernels as well
> > > > - make the change unconditional - it will bit rot otherwise, and has value
> > > >   for other platforms as well.
> > > >
> > > > The first four patches are general cleanup and preparatory changes.
> > > > Patch #5 implements the switch to a MOVW instruction without changing
> > > > the minimum alignment.
> > > > Patch #6 reduces the minimum alignment to 2 MiB.
> > > >
> > > > Tested on QEMU in ARM/!LPAE, ARM/LPAE, Thumb2/!LPAE and Thumb2/LPAE modes.
> > >
> > > At this point I think this really ought to be split into a file of its
> > > own... and maybe even rewritten in C. Even though I wrote the original
> > > code, I no longer understand it without re-investing time into it. But
> > > in either cases the whole of head.S would need to have its registers
> > > shuffled first to move long lived values away from r0-r3,ip,lr to allow
> > > for standard function calls.
> > >
> >
> > I agree with that in principle, however, running C code with a stack
> > with the MMU off is slightly risky.
>
> It's more than "slightly". C code has literal addresses, which are raw
> virtual addresses. These are meaningless with the MMU off.
>
> I guess one could correct the various pointers the code would read, but
> you could not directly access any variable (as that involves
> dereferencing a virtual address stored in the function's literal pool.)
>

We might be able to work around that by compiling with -fPIC, and/or
by ensuring that all inputs to the routine are passed via function
parameters. But I agree that using C for this code is probably not the
right choice.

If there is no disagreement about the 2 MiB alignment, or the choice
of opcodes for the patchable sequences, I can prepare a v2 that fixes
the issues I mentioned, and has some more explanatory comments in the
patching routine.
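
For illustration only, here is a rough sketch of what "all inputs passed
via function parameters" could look like for the offset calculation and
the 2 MiB alignment check. This is not code from the series: the names
P2V_ALIGN, p2v_delta(), p2v_delta_is_aligned() and virt_to_phys_sketch()
are hypothetical, and the sketch assumes a 32-bit (!LPAE) offset. The
functions read only their arguments and return by value, so they never
dereference a global variable or a caller-supplied pointer.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stdint.h>

/* Assumed minimum p2v alignment discussed in the thread: 2 MiB. */
#define P2V_ALIGN (UINT32_C(2) << 20)

static_assert((P2V_ALIGN & (P2V_ALIGN - 1)) == 0,
              "alignment must be a power of two");

/*
 * Hypothetical helper: offset to add to a virtual address to obtain the
 * physical one. Unsigned subtraction wraps modulo 2^32, which is what we
 * want when the physical base lies below the virtual base.
 */
static uint32_t p2v_delta(uint32_t phys_base, uint32_t virt_base)
{
    return phys_base - virt_base;
}

/* Returns 1 if the offset is a multiple of P2V_ALIGN, 0 otherwise. */
static int p2v_delta_is_aligned(uint32_t delta)
{
    return (delta & (P2V_ALIGN - 1)) == 0;
}

/* Hypothetical translation using an offset computed by p2v_delta(). */
static uint32_t virt_to_phys_sketch(uint32_t virt, uint32_t delta)
{
    return virt + delta;
}
```

With a virtual base of 0xC0000000 and a physical base of 0x80200000 the
offset passes the check, whereas a physical base that is only 64 KiB
aligned, such as 0x80010000, is rejected. Whether a compiler actually
emits code for this that is safe to run with the MMU off still depends
on flags such as -fPIC and on the generated literal pools, which is the
concern raised above.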