On Sun, 20 Sep 2020, Ard Biesheuvel wrote: > On Sun, 20 Sep 2020 at 10:57, Russell King - ARM Linux admin > <linux@xxxxxxxxxxxxxxx> wrote: > > > > On Sun, Sep 20, 2020 at 09:50:30AM +0200, Ard Biesheuvel wrote: > > > On Sun, 20 Sep 2020 at 01:49, Nicolas Pitre <nico@xxxxxxxxxxx> wrote: > > > > > > > > On Fri, 18 Sep 2020, Ard Biesheuvel wrote: > > > > > > > > > This series is inspired by Zhei Len's series [0], which updates the > > > > > ARM p2v patching code to optionally support p2v relative alignments > > > > > of as little as 64 KiB. > > > > > > > > > > Reducing this alignment is necessary for some specific Huawei boards, > > > > > but given that reducing this minimum alignment will make the boot > > > > > sequence more robust for all platforms, especially EFI boot, which > > > > > no longer relies on the 128 MB masking of the decompressor load address, > > > > > but uses firmware memory allocation routines to find a suitable spot > > > > > for the decompressed kernel. > > > > > > > > > > This series is not based on Zhei Len's code, but addresses the same > > > > > problem, and takes some feedback given in the review into account: > > > > > - use of a MOVW instruction to avoid two adds/adcs sequences when dealing > > > > > with the carry on LPAE > > > > > - add support for Thumb2 kernels as well > > > > > - make the change unconditional - it will bit rot otherwise, and has value > > > > > for other platforms as well. > > > > > > > > > > The first four patches are general cleanup and preparatory changes. > > > > > Patch #5 implements the switch to a MOVW instruction without changing > > > > > the minimum alignment. > > > > > Patch #6 reduces the minimum alignment to 2 MiB. > > > > > > > > > > Tested on QEMU in ARM/!LPAE, ARM/LPAE, Thumb2/!LPAE and Thumb2/LPAE modes. > > > > > > > > At this point I think this really ought to be split into a file of its > > > > own... and maybe even rewritten in C. Even though I wrote the original > > > > code, I no longer understand it without re-investing time into it. But > > > > in either cases the whole of head.S would need to have its registers > > > > shuffled first to move long lived values away from r0-r3,ip,lr to allow > > > > for standard function calls. > > > > > > > > > > I agree with that in principle, however, running C code with a stack > > > with the MMU off is slightly risky. > > > > It's more than "slightly". C code has literal addresses, which are raw > > virtual addresses. These are meaningless with the MMU off. > > > > I guess one could correct the various pointers the code would read, but > > you could not directly access any variable (as that involves > > dereferencing a virtual address stored in the function's literal pool.) > > > > We might be able to work around that by compiling with -fPIC, and/or > by ensuring that all inputs to the routine are passed via function > parameters. But I agree that using C for this code is probably not the > right choice. Yeah... It is doable, like we do in the decompressor case, but the level of caution needed here would probably negate the gain from writing this in C. The argument for moving it out to a file of its own still stands though. > If there is no disagreement about the 2 MiB alignment, or the choice > of opcodes for the patchable sequences, I can prepare a v2 that fixes > the issues I mentioned, and has some more explanatory comments in the > patching routine. Yes please. Given you do have it all in your head now, it would be very valuable to be way more expensive with comments. Adding a comment block with the opcode bit definition before the code that transforms them would also be nice. Nicolas