On Mon, 19 Sept 2022 at 17:09, Huacai Chen <chenhuacai@xxxxxxxxxx> wrote: > > On Mon, Sep 19, 2022 at 10:44 PM Ard Biesheuvel <ardb@xxxxxxxxxx> wrote: > > > > On Mon, 19 Sept 2022 at 16:43, Huacai Chen <chenhuacai@xxxxxxxxxx> wrote: > > > > > > On Mon, Sep 19, 2022 at 10:32 PM Ard Biesheuvel <ardb@xxxxxxxxxx> wrote: > > > > > > > > On Mon, 19 Sept 2022 at 16:25, Huacai Chen <chenhuacai@xxxxxxxxxx> wrote: > > > > > > > > > > Hi, Ard, > > > > > > > > > > On Mon, Sep 19, 2022 at 8:27 PM Ard Biesheuvel <ardb@xxxxxxxxxx> wrote: > > > > > > > > > > > > On Mon, 19 Sept 2022 at 14:15, Huacai Chen <chenhuacai@xxxxxxxxxx> wrote: > > > > > > > > > > > > > > On Mon, Sep 19, 2022 at 8:11 PM Ard Biesheuvel <ardb@xxxxxxxxxx> wrote: > > > > > > > > > > > > > > > > On Mon, 19 Sept 2022 at 13:58, Huacai Chen <chenhuacai@xxxxxxxxxx> wrote: > > > > > > > > > > > > > > > > > > On Mon, Sep 19, 2022 at 7:21 PM Ard Biesheuvel <ardb@xxxxxxxxxx> wrote: > > > > > > > > > > > > > > > > > > > > On Mon, 19 Sept 2022 at 13:15, Huacai Chen <chenhuacai@xxxxxxxxxx> wrote: > > > > > > > > > > > > > > > > > > > > > > On Mon, Sep 19, 2022 at 6:49 PM Ard Biesheuvel <ardb@xxxxxxxxxx> wrote: > > > > > > > > > > > > > > > > > > > > > > > > On Mon, 19 Sept 2022 at 12:47, Huacai Chen <chenhuacai@xxxxxxxxxx> wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > On Mon, Sep 19, 2022 at 6:37 PM Ard Biesheuvel <ardb@xxxxxxxxxx> wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Mon, 19 Sept 2022 at 12:33, Huacai Chen <chenhuacai@xxxxxxxxxx> wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Hi, Ard, > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Mon, Sep 19, 2022 at 2:22 PM Ard Biesheuvel <ardb@xxxxxxxxxx> wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Mon, 19 Sept 2022 at 08:06, Huacai Chen <chenhuacai@xxxxxxxxxx> wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Hi, Ard, > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Mon, Sep 19, 2022 at 1:15 PM Ard Biesheuvel <ardb@xxxxxxxxxx> wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Mon, 19 Sept 2022 at 03:58, Huacai Chen <chenhuacai@xxxxxxxxxx> wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Hi, Ard, > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > I think the parameters passed to the core kernel need to be discussed. > > > > > > > > > > > > > > > > > > > The old way (so-called old world): > > > > > > > > > > > > > > > > > > > a0=argc, a1=argv, a1=bpi > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > The current way (so-called new world): > > > > > > > > > > > > > > > > > > > a0=efi boot flag, a1=fdt pointer > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > The new way (in this patchset): > > > > > > > > > > > > > > > > > > > a0=efi boot flag, a1=systemtable, a2=cmdline > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > I prefer to use the current way, there are 3 reasons: > > > > > > > > > > > > > > > > > > > 1, both acpi system and dt system can use the same parameters passing method; > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > DT systems will use this too. The distinction is between EFI boot and > > > > > > > > > > > > > > > > > > non-EFI boot. We *really* should keep these separate, given the > > > > > > > > > > > > > > > > > > experience on ARM, where other projects invent ways to pass those > > > > > > > > > > > > > > > > > > values to the kernel without going through the stub. > > > > > > > > > > > > > > > > > In the last patch I see: > > > > > > > > > > > > > > > > > + void *fdt_ptr = early_memremap_ro(fw_arg1, SZ_64K); > > > > > > > > > > > > > > > > > + > > > > > > > > > > > > > > > > > + early_init_dt_scan(fdt_ptr); > > > > > > > > > > > > > > > > > + early_init_fdt_reserve_self(); > > > > > > > > > > > > > > > > > + > > > > > > > > > > > > > > > > > clear_bit(EFI_BOOT, &efi.flags); > > > > > > > > > > > > > > > > > So I suppose for the DT system that means a0=efi boot flag, a1=fdt > > > > > > > > > > > > > > > > > pointer, a2=cmdline? Then it is not exactly the same as the ACPI > > > > > > > > > > > > > > > > > system, but similar. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > No, for non-EFI DT boot, the command line is passed via the DT, so > > > > > > > > > > > > > > > > a0=0x0 (non-efi), a1=DT, a2=0x0 > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Do you intend to support non-EFI DT boot by the way? > > > > > > > > > > > > > > > I think we needn't support non-EFI DT boot, so a0=efi boot flag, > > > > > > > > > > > > > > > a1=systemtable, a2=cmdline is just OK (or maybe we can exchange a1/a2, > > > > > > > > > > > > > > > which looks similar to the old-world). > > > > > > > > > > > > > > > > > > > > > > > > > > > > Excellent. If non-EFI boot is not supported, we can drop the code that > > > > > > > > > > > > > > deals with that. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > But I have another question: is > > > > > > > > > > > > > > > it early enough to get DT from systemtable for DT boot (in the current > > > > > > > > > > > > > > > way DT is the earliest thing)? > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > If you rely on DT only for the hardware description, then loading it > > > > > > > > > > > > > > from efi_init() should be fine. > > > > > > > > > > > > > OK, then please drop patch #12 at this time. It can be added when we > > > > > > > > > > > > > add Loongson-2K support. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > OK > > > > > > > > > > > > > > > > > > > > > > > > > > > > I don't think it makes sense to allow efi=novamap on LoongArch, given > > > > > > > > > > > > > > > > that we cannot make use of the runtime services that way. So in my > > > > > > > > > > > > > > > > code, efi_novamap is set to false unconditionally. > > > > > > > > > > > > > > > Emm, I prefer to support "efi=novamap", the core kernel has already > > > > > > > > > > > > > > > supported "noefi" to disable runtime, I don't want to hack > > > > > > > > > > > > > > > efi_novamap. So please keep the efi boot flag, thanks. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Fair enough. Do you have any uses for efi_novamap in mind? > > > > > > > > > > > > > I usually use "efi=novamap" in EFI shell to see whether it can work > > > > > > > > > > > > > well without runtime. And I think I will modify this patch these days > > > > > > > > > > > > > because in another thread you said that this series cannot boot. > > > > > > > > > > > > > > > > > > > > > > > > It works fine now. > > > > > > > > > > > > > > > > > > > > > > > > However, clearing the EFI_BOOT flag is not the correct way to disable > > > > > > > > > > > > runtime services only. And note that we also have efi=noruntime - does > > > > > > > > > > > > that currently work on LoongArch? > > > > > > > > > > > OK, but we don't need to add a new "efi=noruntime" parameter, I can > > > > > > > > > > > use "noefi" instead. > > > > > > > > > > > > > > > > > > > > OK > > > > > > > > > > > > > > > > > > > > > But I think support "efi=novamap" is not a bad > > > > > > > > > > > idea (maybe I misunderstood its usage). > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > It basically exists to deal with broken EFI firmware on ARM laptops > > > > > > > > > > that were only intended to run Windows. Windows calls > > > > > > > > > > SetVirtualAddressMap() *after* creating the new mappings, and some > > > > > > > > > > broken firmware drivers will access those regions too early. > > > > > > > > > > > > > > > > > > > > On Linux, we install the mapping first, and then much later, we > > > > > > > > > > actually create the mappings and only activate them on a single CPU > > > > > > > > > > while the EFI runtime service call is in progress. > > > > > > > > > > > > > > > > > > > > In any case, I will reinstate the efi_novamap logic for now - we can > > > > > > > > > > always revisit it later. > > > > > > > > > OK, now I think there are no big problems. Only some bikesheddings: > > > > > > > > > 1, Please use "a0=efi boot flag, a1=cmdline a2=systemtable" to pass > > > > > > > > > boot information, it looks most similar to the old-world, and we can > > > > > > > > > distinguish old-world/new-world by judging whether a0 is greater than > > > > > > > > > 1. > > > > > > > > > 2, I prefer "early return" in efi_init, i.e., if (boot_memmap == > > > > > > > > > EFI_INVALID_TABLE_ADDR) then return immediately, this makes > > > > > > > > > indentation look better. > > > > > > > > > 3, Declare "cmdline" out of the if() condition in init_environ() looks better. > > > > > > > > > 4, I prefer "kernel_addr" rather than "image_addr" in efi_boot_kernel(). > > > > > > > > > 5, It seems that one line is enough for the last statement in efi_boot_kernel(). > > > > > > > > > > > > > > > > > > > > > > > > > OK > > > > > > > > > > > > > > > > > Hope this series will be stable as soon as possible, our kexec/kdump > > > > > > > > > work needs to adjust because of this change. :) > > > > > > > > > > > > > > > > > > > > > > > > > Do you have a link to those patches? > > > > > > > There is a snapshot for patches pending for 6.1: > > > > > > > https://github.com/loongson/linux/commits/loongarch-next > > > > > > > > > > > > Thanks. I already spotted an issue there which is exactly the kind of > > > > > > thing I am trying to avoid with this series. > > > > > > > > > > > > diff --git a/arch/loongarch/kernel/mem.c b/arch/loongarch/kernel/mem.c > > > > > > index 7423361b0ebc9b..c68c88e40cc0b9 100644 > > > > > > --- a/arch/loongarch/kernel/mem.c > > > > > > +++ b/arch/loongarch/kernel/mem.c > > > > > > @@ -61,4 +62,9 @@ void __init memblock_init(void) > > > > > > > > > > > > /* Reserve the initrd */ > > > > > > reserve_initrd_mem(); > > > > > > + > > > > > > + /* Main reserved memory for the elf core head */ > > > > > > + early_init_fdt_scan_reserved_mem(); > > > > > > + /* Parse linux,usable-memory-range for crash dump kernel */ > > > > > > + early_init_dt_check_for_usable_mem_range(); > > > > > > } > > > > > > > > > > > > So here, we are adding support for DT memory reservations, which kdump > > > > > > apparently needs. > > > > > > > > > > > > However, this parsing of the DT not only happens on kexec boot: all > > > > > > ACPI and DT kernels will now honour FDT memory reservations passed in > > > > > > via a DT when booting the first kernel, and external projects may use > > > > > > this and start to depend on this. > > > > > > > > > > > > Once something like this hits the upstream kernel, it is *very* > > > > > > difficult to change or remove. > > > > > > > > > > > > If you only need to pass the usable memory range for kcrash/kdump, > > > > > > it's probably better to use a command line option for that, instead of > > > > > > relying on DT memory reservations. > > > > > Thank you for your suggestion, but we found some trouble when handle > > > > > initrd in kexec. > > > > > In current implementation, we generate a fdt in kexec-tools, then fill > > > > > "linux,usable-memory-range", "linux,elfcorehdr" and initrd information > > > > > in it. After this series, we can use "mem=xxx" "elfcorehdr=" in > > > > > cmdline, but how to handle initrd information which is stored in a > > > > > system table? > > > > > > > > > > > > > There are two options: > > > > - use initrdmem= on the command line > > > This is not a good way, even if the second kernel handle "initrdmem=", > > > it will conflict with the config table. > > > > > > > It will not conflict - initrdmem= supersedes the INITRD table because > > early param passing happens after efi_init(). > > > > > > - update the INITRD config table in memory (i.e., update the base and > > > > size to refer to the new initrd image) > > > This way seems practicable, but we don't know how to handle it in > > > kexec-tools. :( > > > > > > > Good point. Let me think a bit about this. > OK, then let's go back to this series itself. :) > > I have seen the latest code in your git repo, I don't think we need > efi_disable_rt. Instead, setting/clearing EFI_BOOT according to a0 as > before seems reasonable. > No, not really. EFI_BOOT basically means 'did I boot via EFI?' and strictly, you should not be attempting to access the EFI system table or parse EFI config tables unless EFI_BOOT is true. Deviating from this makes it more difficult for generic code to reason about what parts of EFI are active on a given system. > If efi_novamap breaks something, I can accept "set efi_novamap to > false" in the previous version, or just ignore its value in the > efistub, but a0 should clearly be the indicator of EFI_BOOT. > I'm fine with keeping efi_novamap if you think it has a value. To me, it seems rather pointless, because it prevents you from using the runtime services. If you want a0 to control the EFI boot flag, you should find another way to control whether runtime services can be used, because they are really two different things.