On Mon, 5 Sept 2022 at 09:25, Huacai Chen <chenhuacai@xxxxxxxxxx> wrote:
>
> Hi, Ard and Youling,
>
> On Mon, Sep 5, 2022 at 3:02 PM Ard Biesheuvel <ardb@xxxxxxxxxx> wrote:
> >
> > On Mon, 5 Sept 2022 at 05:51, Huacai Chen <chenhuacai@xxxxxxxxxx> wrote:
> > >
> > > Hi, Ard,
> > >
> > > On Mon, Sep 5, 2022 at 5:59 AM Ard Biesheuvel <ardb@xxxxxxxxxx> wrote:
> > > >
> > > > On Sun, 4 Sept 2022 at 15:24, Huacai Chen <chenhuacai@xxxxxxxxxx> wrote:
> > > > >
> > > > > Hi, Ard,
> > > > >
> > > > > On Thu, Sep 1, 2022 at 6:40 PM Huacai Chen <chenhuacai@xxxxxxxxxx> wrote:
> > > > > >
> > > > > > Hi, Ard,
> > > > > >
> > > > > > On Sat, Aug 27, 2022 at 3:14 PM Ard Biesheuvel <ardb@xxxxxxxxxx> wrote:
> > > > > > >
> > > > > > > On Sat, 27 Aug 2022 at 06:41, Xi Ruoyao <xry111@xxxxxxxxxxx> wrote:
> > > > > > > >
> > > > > > > > Tested V3 with the magic number check manually removed in my GRUB build.
> > > > > > > > The system boots successfully. I've not tested Arnd's zBoot patch yet.
> > > > > > >
> > > > > > > I am Ard not Arnd :-)
> > > > > > >
> > > > > > > Please use this branch when testing the EFI decompressor:
> > > > > > > https://git.kernel.org/pub/scm/linux/kernel/git/ardb/linux.git/log/?h=efi-decompressor-v4
> > > > > > The root cause of LoongArch zboot boot failure has been found, it is a
> > > > > > binutils bug, latest toolchain with the below patch can solve the
> > > > > > problem.
> > > > > >
> > > > > > diff --git a/bfd/elfnn-loongarch.c b/bfd/elfnn-loongarch.c
> > > > > > index 5b44901b9e0..fafdc7c7458 100644
> > > > > > --- a/bfd/elfnn-loongarch.c
> > > > > > +++ b/bfd/elfnn-loongarch.c
> > > > > > @@ -2341,9 +2341,10 @@ loongarch_elf_relocate_section (bfd
> > > > > > *output_bfd, struct bfd_link_info *info,
> > > > > >         case R_LARCH_SOP_PUSH_PLT_PCREL:
> > > > > >           unresolved_reloc = false;
> > > > > >
> > > > > > -         if (resolved_to_const)
> > > > > > +         if (!is_undefweak && resolved_to_const)
> > > > > >             {
> > > > > >               relocation += rel->r_addend;
> > > > > > +             relocation -= pc;
> > > > > >               break;
> > > > > >             }
> > > > > >           else if (is_undefweak)
> > > > > >
> > > > > >
> > > > > > Huacai
> > > > > Now the patch is submitted here:
> > > > > https://sourceware.org/pipermail/binutils/2022-September/122713.html
> > > > >
> > > >
> > > > Great. Given the severity of this bug, I imagine that building the
> > > > LoongArch kernel will require a version of binutils that carries this
> > > > fix.
> > > >
> > > > Therefore, i will revert back to the original approach for accessing
> > > > uncompressed_size, using an extern declaration with an __aligned(1)
> > > > attribute.
> > > >
> > > > > And I have some other questions about kexec: kexec should jump to the
> > > > > elf entry or the pe entry? I think is the elf entry, because if we
> > > > > jump to the pe entry, then SVAM will be executed twice (but it should
> > > > > be executed only once). However, how can we jump to the elf entry if
> > > > > we use zboot? Maybe it is kexec-tool's responsibility to decompress
> > > > > the zboot kernel image?
> > > > >
> > > >
> > > > Yes, very good point. Kexec kernels cannot boot via the EFI entry
> > > > point, as the boot services will already be shutdown. So the kexec
> > > > kernel needs to boot via the same entrypoint in the core kernel that
> > > > the EFI stub calls when it hands over.
> > > >
> > > > For the EFI zboot image in particular, we will need to teach kexec how
> > > > to decompress them. The zboot image has a header that
> > > > a) describes it as a EFI linux zimg
> > > > b) describes the start and end offset of the compressed payload
> > > > c) describes which compression algorithm was used.
> > > >
> > > > This means that any non-EFI loader (including kexec) should be able to
> > > > extract the inner PE/COFF image and decompress it. For arm64 and
> > > > RISC-V, this is sufficient as the EFI and raw images are the same. For
> > > > LoongArch, I suppose it means we need a way to enter the core kernel
> > > > directly via the entrypoint that the EFI stub uses when handing over
> > > > (and pass the original DT argument so the kexec kernel has access to
> > > > the EFI and ACPI firmware tables)
> > > OK, then is this implementation [1] acceptable? I remember that you
> > > said the MS-DOS header shouldn't contain other information, so I guess
> > > this is unacceptable?
> > >
> >
> > No, this looks reasonable to me. I objected to using magic numbers in
> > the 'pure PE' view of the image, as it does not make sense for a pure
> > PE loader such as GRUB to rely on such metadata.
> >
> > In this case (like on arm64), we are dealing with something else: we
> > need to identify the image to the kernel itself, and here, using the
> > unused space in the MS-DOS header is fine.
> >
> > > [1] https://lore.kernel.org/loongarch/c4dbb14a-5580-1e47-3d15-5d2079e88404@xxxxxxxxxxx/T/#mb8c1dc44f7fa2d3ef638877f0cd3f958f0be96ad
> OK, then there is no big problem here. And I found that arm64/riscv
> don't need the kernel entry point in the header. I don't know why, but
> I think it implies that a unified layout across architectures is
> unnecessary, and I prefer to put the kernel entry point before
> effective kernel size. :)
>

It is fine to put the entry point offset in the header.
arm64 and RISC-V don't need this because their images begin with a pseudo-NOP (an instruction that has no effect, but whose binary encoding starts with the bytes 'MZ') followed by a jump to the actual entry point.
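
To illustrate the point about the pseudo-NOP, here is a rough sketch of how a non-EFI loader could recover the entry offset from such an image without any dedicated header field. It only assumes that the first two bytes are 'MZ' and that the next 32-bit word is an AArch64 unconditional branch (`b <label>`). The helper names (`read_le32`, `starts_with_mz`, `arm64_branch_target`, `find_entry_offset`) are made up for this example and are not taken from the kernel or from kexec-tools.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Read a little-endian 32-bit word without assuming alignment. */
static uint32_t read_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* True if the image starts with the MS-DOS "MZ" signature. */
static bool starts_with_mz(const uint8_t *img, size_t len)
{
    return len >= 2 && img[0] == 'M' && img[1] == 'Z';
}

/*
 * Decode an AArch64 unconditional branch ("b <label>") at byte offset
 * insn_off. Returns the byte offset of the branch target within the
 * image, or -1 if the word is not a "b" or the target is out of range.
 */
static int64_t arm64_branch_target(const uint8_t *img, size_t len,
                                   size_t insn_off)
{
    if (len < 4 || insn_off > len - 4)
        return -1;

    uint32_t insn = read_le32(img + insn_off);
    if ((insn >> 26) != 0x05)        /* bits [31:26] must be 0b000101 */
        return -1;

    int64_t imm26 = insn & 0x03ffffff;
    if (imm26 & 0x02000000)          /* sign-extend the 26-bit immediate */
        imm26 -= 0x04000000;

    int64_t target = (int64_t)insn_off + imm26 * 4;
    if (target < 0 || (uint64_t)target >= len)
        return -1;
    return target;
}

/*
 * Hypothetical helper: assumes the layout described above, i.e. an
 * "MZ" pseudo-NOP at offset 0 and a branch to the entry at offset 4.
 */
int64_t find_entry_offset(const uint8_t *img, size_t len)
{
    assert(img != NULL);
    if (!starts_with_mz(img, len))
        return -1;
    return arm64_branch_target(img, len, 4);
}
```

In practice a loader would not even need to decode the branch: it can jump to offset 0 and let the CPU execute the pseudo-NOP and the branch, which is why no entry point field is needed there. The decoding above is only meant to make that mechanism visible.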