On Thu, Aug 08, 2013 at 10:57:29PM +0100, Roy Franz wrote: > On Wed, Aug 7, 2013 at 11:05 AM, Dave Martin <Dave.Martin@xxxxxxx> wrote: > > On Tue, Aug 06, 2013 at 08:45:12PM -0700, Roy Franz wrote: > >> This patch adds EFI stub support for the ARM Linux kernel. The EFI stub > >> operations similarly to the x86 stub: it is a shim between the EFI firmware > >> and the normal zImage entry point, and sets up the environment that the > >> zImage is expecting. This includes loading the initrd (optionaly) and > >> device tree from the system partition based on the kernel command line. > >> The stub updates the device tree as necessary, including adding reserved > >> memory regions and adding entries for EFI runtime services. The PE/COFF > >> "MZ" header at offset 0 results in the first instruction being an add > >> that corrupts r5, which is not used by the zImage interface. > > > > Some more comments below ... note that I haven't really looked at the C > > code in depth. > > Responses below, and I'm working on incorporating suggested changes > for the next version. I few responses-to-responses from me inline. Your repose supersedes most of this anyhow. Cheers ---Dave > > Thanks, > Roy > > > > > Cheers > > ---Dave > > > >> > >> Signed-off-by: Roy Franz <roy.franz@xxxxxxxxxx> > >> --- > >> arch/arm/boot/compressed/Makefile | 18 +- > >> arch/arm/boot/compressed/efi-header.S | 114 ++++++++ > >> arch/arm/boot/compressed/efi-stub.c | 514 +++++++++++++++++++++++++++++++++ > >> arch/arm/boot/compressed/head.S | 90 +++++- > >> 4 files changed, 728 insertions(+), 8 deletions(-) > >> create mode 100644 arch/arm/boot/compressed/efi-header.S > >> create mode 100644 arch/arm/boot/compressed/efi-stub.c > >> > >> diff --git a/arch/arm/boot/compressed/Makefile b/arch/arm/boot/compressed/Makefile > >> index 7ac1610..c62826a 100644 > >> --- a/arch/arm/boot/compressed/Makefile > >> +++ b/arch/arm/boot/compressed/Makefile > >> @@ -106,8 +106,22 @@ $(addprefix $(obj)/,$(libfdt) $(libfdt_hdrs)): $(obj)/%: $(srctree)/scripts/dtc/ > >> $(addprefix $(obj)/,$(libfdt_objs) atags_to_fdt.o): \ > >> $(addprefix $(obj)/,$(libfdt_hdrs)) > >> > >> +$(addprefix $(obj)/,$(libfdt_objs) efi-stub.o): \ > >> + $(addprefix $(obj)/,$(libfdt_hdrs)) > >> + > > > > Don't we make $(libfdt_objs) depend on $(libfdt_hdrs) twice, now? > > > > Would it make sense just to add efi-stub.o to the list of targets in the > > original rule? > > Yes, change made. > > > >> ifeq ($(CONFIG_ARM_ATAG_DTB_COMPAT),y) > >> -OBJS += $(libfdt_objs) atags_to_fdt.o > >> +OBJS += atags_to_fdt.o > >> +USE_LIBFDT = y > >> +endif > >> + > >> +ifeq ($(CONFIG_EFI_STUB),y) > >> +CFLAGS_efi-stub.o += -DTEXT_OFFSET=$(TEXT_OFFSET) > >> +OBJS += efi-stub.o > >> +USE_LIBFDT = y > >> +endif > >> + > >> +ifeq ($(USE_LIBFDT),y) > >> +OBJS += $(libfdt_objs) > >> endif > >> > >> targets := vmlinux vmlinux.lds \ > >> @@ -125,7 +139,7 @@ ORIG_CFLAGS := $(KBUILD_CFLAGS) > >> KBUILD_CFLAGS = $(subst -pg, , $(ORIG_CFLAGS)) > >> endif > >> > >> -ccflags-y := -fpic -mno-single-pic-base -fno-builtin -I$(obj) > >> +ccflags-y := -fpic -mno-single-pic-base -fno-builtin -I$(obj) -fno-stack-protector > > > > You don't appear to explain this change anywhere. > > Prior to my changes, even though the stack protector was not disabled, > it was not actually used. GCC uses a heuristic > based on the size of the stack whether to enable the stack protector, > and the threshold to trigger its use was not met, so no stack checking > was actually being done. In order to do stack protection, a few > __stack_chk_* functions/variable need to be provided by the > application. I worked a bit on adding these, but could not get them > working in the stub/decompressor. The x86 arch also has > "-fno-stack-protector" defined for its compressed boot stub, so I > decided to go that route as well. > > > > >> asflags-y := -DZIMAGE > >> > >> # Supply kernel BSS size to the decompressor via a linker symbol. > >> diff --git a/arch/arm/boot/compressed/efi-header.S b/arch/arm/boot/compressed/efi-header.S > >> new file mode 100644 > >> index 0000000..6ff32cc > >> --- /dev/null > >> +++ b/arch/arm/boot/compressed/efi-header.S > >> @@ -0,0 +1,114 @@ > >> +@ Copyright (C) 2013 Linaro Ltd; <roy.franz@xxxxxxxxxx> > >> +@ > >> +@ This file contains the PE/COFF header that is part of the > >> +@ EFI stub. > >> +@ > >> + > >> + .org 0x3c > >> + @ > >> + @ The PE header can be anywhere in the file, but for > >> + @ simplicity we keep it together with the MSDOS header > >> + @ The offset to the PE/COFF header needs to be at offset > >> + @ 0x3C in the MSDOS header. > >> + @ The only 2 fields of the MSDOS header that are used are this > >> + @ PE/COFF offset, and the "MZ" bytes at offset 0x0. > >> + @ > >> + .long pe_header @ Offset to the PE header. > > > > Is there any chance of merging this with the equivalent x86 code? > > > > The PE/COFF header is much the same in both cases, although there > > are some differences. Maybe it would be more trouble than it is > > worth... > > I think it would be more pain than gain. We are planning to add arm64 stub > support next, so we'd end up with 4 architectures sharing this assembly file, > which I think would be painful from a patch submission/review point of view. > > > > >> + > >> + .align 3 > >> +pe_header: > >> + > >> + > >> +pe_header: > > > > Duplicate label? > > Yup, fixed. > > > >> + .ascii "PE" > >> + .short 0 > >> + > >> +coff_header: > >> + .short 0x01c2 @ ARM or Thumb > >> + .short 2 @ nr_sections > >> + .long 0 @ TimeDateStamp > >> + .long 0 @ PointerToSymbolTable > >> + .long 1 @ NumberOfSymbols > >> + .short section_table - optional_header @ SizeOfOptionalHeader > >> + .short 0x306 @ Characteristics. > >> + @ IMAGE_FILE_32BIT_MACHINE | > >> + @ IMAGE_FILE_DEBUG_STRIPPED | > >> + @ IMAGE_FILE_EXECUTABLE_IMAGE | > >> + @ IMAGE_FILE_LINE_NUMS_STRIPPED > >> + > >> +optional_header: > >> + .short 0x10b @ PE32 format > >> + .byte 0x02 @ MajorLinkerVersion > >> + .byte 0x14 @ MinorLinkerVersion > >> + > >> + .long 0 @ SizeOfCode > > > > Do we need to fill in SizeOfCode with a real value? It looks like x86 > > does. > > > > We should probably fill this in unless there's a documented ABI for EFI > > boot on ARM which explicitly doesn't require these. > > I will investigate/fix this. > > > > >> + > >> + .long 0 @ SizeOfInitializedData > >> + .long 0 @ SizeOfUninitializedData > >> + > >> + .long efi_stub_entry @ AddressOfEntryPoint > >> + .long efi_stub_entry @ BaseOfCode > >> + .long 0 @ data > >> + > >> +extra_header_fields: > >> + .long 0 @ ImageBase > >> + .long 0x20 @ SectionAlignment > >> + .long 0x20 @ FileAlignment > >> + .short 0 @ MajorOperatingSystemVersion > >> + .short 0 @ MinorOperatingSystemVersion > >> + .short 0 @ MajorImageVersion > >> + .short 0 @ MinorImageVersion > >> + .short 0 @ MajorSubsystemVersion > >> + .short 0 @ MinorSubsystemVersion > >> + .long 0 @ Win32VersionValue > >> + > >> + .long _edata @ SizeOfImage > >> + > >> + @ Everything before the entry point is considered part of the header > >> + .long efi_stub_entry @ SizeOfHeaders > >> + .long 0 @ CheckSum > >> + .short 0xa @ Subsystem (EFI application) > >> + .short 0 @ DllCharacteristics > >> + .long 0 @ SizeOfStackReserve > >> + .long 0 @ SizeOfStackCommit > >> + .long 0 @ SizeOfHeapReserve > >> + .long 0 @ SizeOfHeapCommit > >> + .long 0 @ LoaderFlags > >> + .long 0x0 @ NumberOfRvaAndSizes > >> + > >> + # Section table > >> +section_table: > >> + > >> + # > >> + # The EFI application loader requires a relocation section > >> + # because EFI applications must be relocatable. This is a > >> + # dummy section as far as we are concerned. > >> + # > >> + .ascii ".reloc" > >> + .byte 0 > >> + .byte 0 @ end of 0 padding of section name > >> + .long 0 > >> + .long 0 > >> + .long 0 @ SizeOfRawData > >> + .long 0 @ PointerToRawData > >> + .long 0 @ PointerToRelocations > >> + .long 0 @ PointerToLineNumbers > >> + .short 0 @ NumberOfRelocations > >> + .short 0 @ NumberOfLineNumbers > >> + .long 0x42100040 @ Characteristics (section flags) > >> + > >> + > >> + .ascii ".text" > >> + .byte 0 > >> + .byte 0 > >> + .byte 0 @ end of 0 padding of section name > >> + .long _edata - efi_stub_entry @ VirtualSize > >> + .long efi_stub_entry @ VirtualAddress > >> + .long _edata - efi_stub_entry @ SizeOfRawData > >> + .long efi_stub_entry @ PointerToRawData > >> + > >> + .long 0 @ PointerToRelocations (0 for executables) > >> + .long 0 @ PointerToLineNumbers (0 for executables) > >> + .short 0 @ NumberOfRelocations (0 for executables) > >> + .short 0 @ NumberOfLineNumbers (0 for executables) > >> + .long 0xe0500020 @ Characteristics (section flags) > > > > Can you explain why x86 needs an extra section (the .setup thing)? > > I haven't dug into that in enough detail to understand it yet... > > I will look into that, I don't know off hand. I simplified the header > for ARM as much as I could > for booting with EDK2. > > > >> diff --git a/arch/arm/boot/compressed/efi-stub.c b/arch/arm/boot/compressed/efi-stub.c > >> new file mode 100644 > >> index 0000000..b817ea3 > >> --- /dev/null > >> +++ b/arch/arm/boot/compressed/efi-stub.c > >> @@ -0,0 +1,514 @@ > >> +/* > >> + * linux/arch/arm/boot/compressed/efi-stub.c > >> + * > >> + * Copyright (C) 2013 Linaro Ltd; <roy.franz@xxxxxxxxxx> > >> + * > >> + * This file implements the EFI boot stub for the ARM kernel > >> + * > >> + * This program is free software; you can redistribute it and/or modify > >> + * it under the terms of the GNU General Public License version 2 as > >> + * published by the Free Software Foundation. > >> + * > >> + */ > >> +#include <linux/efi.h> > >> +#include <libfdt.h> > >> + > >> + > >> +/* Error code returned to ASM code instead of valid FDT address. */ > >> +#define EFI_STUB_ERROR (~0) > > > > Can we put that into a suitable hedaer and use it in compressed/head.S, > > instead of the magic 0xffffffff? (Assuming that value is supposed to > > match EFI_STUB_ERROR) > > Yes, I will do this. > > > >> + > >> +/* EFI function call wrappers. These are not required for > >> + * ARM, but wrappers are required for X86 to convert between > >> + * ABIs. These wrappers are provided to allow code sharing > >> + * between X86 and ARM. Since these wrappers directly invoke the > >> + * EFI function pointer, the function pointer type must be properly > >> + * defined, which is not the case for X86 One advantage of this is > >> + * it allows for type checking of arguments, which is not > >> + * possible with the X86 wrappers. > >> + */ > >> +#define efi_call_phys0(f) f() > >> +#define efi_call_phys1(f, a1) f(a1) > >> +#define efi_call_phys2(f, a1, a2) f(a1, a2) > >> +#define efi_call_phys3(f, a1, a2, a3) f(a1, a2, a3) > >> +#define efi_call_phys4(f, a1, a2, a3, a4) f(a1, a2, a3, a4) > >> +#define efi_call_phys5(f, a1, a2, a3, a4, a5) f(a1, a2, a3, a4, a5) > >> + > >> +/* The maximum uncompressed kernel size is 32 MBytes, so we will reserve > >> + * that for the decompressed kernel. We have no easy way to tell what > >> + * the actuall size of code + data the uncompressed kernel will use. > >> + */ > >> +#define MAX_UNCOMP_KERNEL_SIZE 0x02000000 > > > > Can we fish the decompressed data size out of zImage, like the existing > > zImage code does? (see compressed/head.S:207). I don't see why this > > needs to be compile-time constant. > > I am attempting to make sure all the memory used is accounted for in > the EFI memory map, > so I care not only about the uncompressed size, but also the BSS. If > I get the uncompressed > image size, and use that for the allocation, the kernel will overwrite > memory immediately following it. > I had implemented what you suggested and ran into this problem. Hmmm, it looks like I misunderstood what gets appended to the compressed data. However, it looks like the size of the kernel's bss is also made available, via a link-time symbol _kernel_bss_size: KBSS_SZ = $(shell $(CROSS_COMPILE)size $(obj)/../../../../vmlinux | \ awk 'END{print $$3}') LDFLAGS_vmlinux = --defsym _kernel_bss_size=$(KBSS_SZ) You could get at that by extern char _kernel_bss_size; /* ... */ ... (unsigned long)&_kernel_bss_size ... > > > > > Someday, someone may try to grow the kernel image beyond 32M. It would > > be nice to keep the number of things that breaks to a minimum, to ease > > potential pain later. > > I picked 32 MBytes based on some discussions of the boot process, and > my understanding > is that 32 MBytes is a somewhat hard upper limit on kernel size. I guess we can address this one as and when. I suspect that growth beyond 32MB may happen eventually, but it's going to involve a bit of pain whatever. So long as efi_stub barfs if the decompressed kernel + BSS doesn't fit in the available space (you can refer to _kernel_bss_size to check that). > > > > >> + > >> +/* The kernel zImage should be located between 32 Mbytes > >> + * and 128 MBytes from the base of DRAM. The min > >> + * address leaves space for a maximal size uncompressed image, > >> + * and the max address is due to how the zImage decompressor > >> + * picks a destination address. > >> + */ > >> +#define MAX_ZIMAGE_OFFSET 0x08000000 > > > > The maximum zImage offset is actually 1 less than this. I think it's > > just the name of the macro that is misleading, since you use it > > correctly as an upper bound for memory allocation, so far as I can > > see. > > > > Maybe ZIMAGE_OFFSET_LIMIT or something similar would work. > > I'll rename this. > > > > >> +#define MIN_ZIMAGE_OFFSET MAX_UNCOMP_KERNEL_SIZE > >> + > >> +#define MAX_CMDLINE_LEN 500 > > > > This is a random looking number. Is this supposed to match something > > somewhere? Does it serve any purpose other than acting as a sanity > > limit? > > > > If this limit doesn't exist, then an unreasonably large command-line > > passed by EFI would just lead to a memory allocation failure somewhere, > > which feels like the right behaviour... > > > > If we can avoid building in arbitrary limits, it helps avoid surprises > > later. > > > This is just a sanity check, which should be easy to remove. I think > the failure mode will be a huge device tree being created, > rather than an allocation failure. In reality I think the limit will > set by the EFI firmware - I doubt it is possible to pass a > multi-megabyte command line. > > > > >> + > >> +struct fdt_region { > >> + u64 base; > >> + u64 size; > >> +}; > >> + > >> +/* > >> + * Additional size that could be used for FDT entries added by > >> + * the UEFI OS Loader Estimation based on: > >> + * EDID (300bytes) + bootargs + initrd region (20bytes) > >> + * + system memory region (20bytes) + mp_core entries (200 > >> + * bytes) > >> + */ > > > > What does 0x300 have to do with those numbers? > > > > When you say "estimate", are we guaranteed never to exceed that? > > What happens if we do? > > No guarantees, and we fail to boot if we run out of space in the new > device tree. This greatly simplifies the code, > but I agree that it is not that nice. > > > > >> +#define FDT_ADDITIONAL_ENTRIES_SIZE (0x300 + MAX_CMDLINE_LEN) > >> + > >> +/* Include shared EFI stub code */ > >> +#include "../../../../drivers/firmware/efi/efi-stub-helper.c" > >> + > >> + > >> +static int is_linux_reserved_region(int memory_type) > >> +{ > >> + switch (memory_type) { > >> + case EFI_RUNTIME_SERVICES_CODE: > >> + case EFI_RUNTIME_SERVICES_DATA: > >> + case EFI_UNUSABLE_MEMORY: > >> + case EFI_ACPI_RECLAIM_MEMORY: > >> + case EFI_ACPI_MEMORY_NVS: > >> + return 1; > >> + default: > >> + return 0; > >> + } > >> +} > >> + > >> + > >> +static int relocate_kernel(efi_system_table_t *sys_table, > >> + unsigned long *load_addr, unsigned long *load_size, > >> + unsigned long min_addr, unsigned long max_addr) > >> +{ > >> + /* Get current address of kernel. */ > >> + unsigned long cur_zimage_addr = *load_addr; > >> + unsigned long zimage_size = *load_size; > >> + unsigned long new_addr = 0; > >> + unsigned long nr_pages; > >> + > >> + efi_status_t status; > >> + > >> + if (!load_addr || !load_size) > >> + return EFI_INVALID_PARAMETER; > >> + > >> + *load_size = 0; > >> + if (cur_zimage_addr > min_addr > >> + && (cur_zimage_addr + zimage_size) < max_addr) { > >> + /* We don't need to do anything, as kernel at an acceptable > >> + * address already. > >> + */ > >> + return EFI_SUCCESS; > >> + } > >> + /* > >> + * The EFI firmware loader could have placed the kernel image > >> + * anywhere in memory, but the kernel has restrictions on the > >> + * min and max physical address it can run at. > >> + */ > >> + nr_pages = round_up(zimage_size, EFI_PAGE_SIZE) / EFI_PAGE_SIZE; > > > > It looks like nr_pages is never used in this function. > > Yup, removed. > > > >> + > >> + status = efi_low_alloc(sys_table, zimage_size, 0, > >> + &new_addr, min_addr); > >> + if (status != EFI_SUCCESS) { > >> + efi_printk(sys_table, "Failed to alloc memory for kernel.\n"); > > > > Does efi_printk automatically prepend a suitable prefix? If not, > > it might be useful to define a macro to add a standard prefix to all > > efi_printks here ("zImage: " or similar). > > It doesn't, but I can add one. Maybe "EFIstub"? This is really > separate from the zImage boot, so I think > it would be helpful to differentiate it. Sure, just something to disambiguate it. > > > > Minor nit: can we have "allocate" instead of "alloc"? > Sure. > > > > I think both messages should say "failed to allocate usable memory". > > EFI has already allocated memory for the kernel after all: it's > > just in the wrong place initially. > > > >> + return status; > >> + } > >> + > >> + if (new_addr > (max_addr - zimage_size)) { > >> + efi_free(sys_table, zimage_size, new_addr); > >> + efi_printk(sys_table, "Failed to alloc usable memory for kernel.\n"); > >> + return EFI_INVALID_PARAMETER; > >> + } > >> + > >> + /* We know source/dest won't overlap since both memory ranges > >> + * have been allocated by UEFI, so we can safely use memcpy. > >> + */ > >> + memcpy((void *)new_addr, (void *)(unsigned long)cur_zimage_addr, > >> + zimage_size); > > > > Is it possible for this allocation to fail -- i.e., because UEFI has > > put us in an unsuitable location which is within the first 128MB of > > RAM, such that we can't pick a suitable location without overlap? > > I think so, since (in theory at least), other EFI applications could have run > before us and allocated arbitrary amounts of memory. > > > > > For the time being though, I think this is impossible because the > > decompressed Image can't exceed ~32MB (so the zImage should not > > exceed that either, and both can fit inside 128MB. It doesn't > > matter if UEFI's initial load location overlaps the decompressed > > Image). > > The reason I am avoiding the zImage overlapping the decompressed image > even though > the zImage decompressor handles that case is that I want to ensure that > all memory used during early boot is represented in the EFI memory map. > By avoiding overlap, I only have to deal with predicting the final > destination of the > decompressed kernel. I guess that makes sense. If it becomes a constraint, it can be fixed later, but that probably won't happen for a while. > > > > > If UEFI put reserved regions with the first 128MB we're likely to > > be dead anyway, so we shouldn't assume we'll have to cope with that > > for now... > > For these cases I'd like to be able to return an error message and > refuse to boot, rather > than dying during boot. > > In principle, I like the EFI stub being a shim between the EFI > firmware and the normal zImage boot. In practice, > I don't really like having to predict/guess what memory the zImage > decompressor will use so that we can account for that > in the EFI memory map. zImage already suffers from that: you "just have to know" how to arrange the zImage, initramfs and dtb, per board and per bootlodaer. AUTO_ZRELADDR provides some extra flexibility, but there are still arbitrary, unknown constraints which prevent for bootloader from doing the right thing automatically. efi_stub should avoid being worse than that, but if we can have cleaner failures, that's definitely a bonus. > > > > >> + > >> + /* Return the load address and size */ > >> + *load_addr = new_addr; > >> + *load_size = zimage_size; > > > > Is zimage_size ever changed? It looks like it is still equal to the > > initial value of *load_size at this point. > > Nope, I can get rid of zimage_size and just use *load_size throughout. > > > > >> + > >> + > >> + return status; > >> +} > >> + > >> + > >> +/* Convert the unicode UEFI command line to ASCII to pass to kernel. > >> + * Size of memory allocated return in *cmd_line_len. > >> + * Returns NULL on error. > >> + */ > >> +static char *convert_cmdline_to_ascii(efi_system_table_t *sys_table, > >> + efi_loaded_image_t *image, > >> + unsigned long *cmd_line_len, > >> + u32 max_addr) > >> +{ > >> + u16 *s2; > >> + u8 *s1 = NULL; > >> + unsigned long cmdline_addr = 0; > >> + int load_options_size = image->load_options_size / 2; /* ASCII */ > >> + void *options = (u16 *)image->load_options; > >> + int options_size = 0; > >> + int status; > >> + int i; > >> + u16 zero = 0; > >> + > >> + if (options) { > >> + s2 = options; > >> + while (*s2 && *s2 != '\n' && options_size < load_options_size) { > >> + s2++; > >> + options_size++; > >> + } > >> + } > >> + > >> + if (options_size == 0) { > >> + /* No command line options, so return empty string*/ > >> + options_size = 1; > >> + options = &zero; > >> + } > >> + > >> + if (options_size > MAX_CMDLINE_LEN) > >> + options_size = MAX_CMDLINE_LEN; > >> + > >> + options_size++; /* NUL termination */ > > > > Do we care that options_size can now be > load_options_size? > > > > I guess image->load_options isn't realistically going to be right at > > the end of a RAM bank, so probably nothing disastrous will happen if > > we read off the end of it. > > > > It would be tidier to avoid this, though. > > I'll update this to avoid reading past the end of the EFI option string. OK, fine > > > >> + > >> + status = efi_high_alloc(sys_table, options_size, 0, > >> + &cmdline_addr, max_addr); > >> + if (status != EFI_SUCCESS) > >> + return NULL; > >> + > >> + s1 = (u8 *)(unsigned long)cmdline_addr; > >> + s2 = (u16 *)options; > >> + > >> + for (i = 0; i < options_size - 1; i++) > >> + *s1++ = *s2++; > >> + > >> + *s1 = '\0'; > >> + > >> + *cmd_line_len = options_size; > >> + return (char *)(unsigned long)cmdline_addr; > >> +} > >> + > >> +static u32 update_fdt_and_exit_boot(efi_system_table_t *sys_table, > >> + void *handle, unsigned long dram_base, > >> + void *orig_fdt, u64 *orig_fdt_size, > >> + char *cmdline_ptr, > >> + unsigned long *cmdline_size, > >> + u64 initrd_addr, u64 initrd_size) > >> +{ > >> + unsigned long new_fdt_size; > >> + unsigned long new_fdt_addr; > >> + void *fdt; > >> + int node; > >> + int status; > >> + int i; > >> + unsigned long map_size, desc_size; > >> + unsigned long mmap_key; > >> + efi_memory_desc_t *memory_map; > >> + unsigned long fdt_val; > >> + > >> + new_fdt_size = *orig_fdt_size + FDT_ADDITIONAL_ENTRIES_SIZE; > >> + status = efi_high_alloc(sys_table, new_fdt_size, 0, &new_fdt_addr, > >> + dram_base + MAX_ZIMAGE_OFFSET); > >> + if (status != EFI_SUCCESS) { > >> + efi_printk(sys_table, "ERROR: Unable to allocate memory for new device tree.\n"); > >> + goto fail; > >> + } > > > > There are too many error messages in this function (and elsewhere). > > Many of them are only useful for debugging: for real use, the only > > interesting kinds of failure for the DT which will be meaningful to the > > user are "bad device tree" and "out of memory". > > > > Also, it would be desirable to make the error messages more consistent; > > currently we have "Failed to foo", "ERROR: bar", "ERROR moo", "Error baz", > > and more. > > > > We also have "FDT", "fdt", "DTB", "Device Tree", "device tree", all of > > which mean basically the same thing. > > > > You could try wrapping fdt_setprop() with a function which tries to set > > the property and prints a suitable message if it fails, without having > > to put explicit efi_printks all over the place. > > I will review all of the messages, and add a consistent prefix as you > suggested above. OK (I confess to being a bit pedantic here) > > > >> + > >> + > >> + fdt = (void *)new_fdt_addr; > >> + status = fdt_open_into(orig_fdt, fdt, new_fdt_size); > >> + if (status != 0) { > >> + efi_printk(sys_table, "ERROR: Device Tree open_int failed.\n"); > >> + goto fail_free_new_fdt; > >> + } > >> + /* We are done with the original DTB, so free it. */ > >> + efi_free(sys_table, *orig_fdt_size, (u32)orig_fdt); > >> + *orig_fdt_size = 0; > >> + > >> + node = fdt_subnode_offset(fdt, 0, "chosen"); > >> + if (node < 0) { > >> + node = fdt_add_subnode(fdt, 0, "chosen"); > >> + if (node < 0) { > >> + efi_printk(sys_table, "Error on finding 'chosen' node\n"); > >> + goto fail_free_new_fdt; > >> + } > >> + } > >> + > >> + if ((cmdline_ptr != NULL) && (strlen(cmdline_ptr) > 0)) { > >> + status = fdt_setprop(fdt, node, "bootargs", cmdline_ptr, > >> + strlen(cmdline_ptr) + 1); > >> + if (status) { > >> + efi_printk(sys_table, "Failed to set new bootarg\n"); > >> + goto fail_free_new_fdt; > >> + } > >> + } > >> + /* We are done with original command line, so free it. */ > >> + efi_free(sys_table, *cmdline_size, (u32)cmdline_ptr); > >> + *cmdline_size = 0; > >> + > >> + /* Set intird address/end in device tree, if present */ > >> + if (initrd_size != 0) { > >> + u64 initrd_image_end; > >> + u64 initrd_image_start = cpu_to_fdt64(initrd_addr); > >> + status = fdt_setprop(fdt, node, "linux,initrd-start", > >> + &initrd_image_start, sizeof(u64)); > >> + if (status) { > >> + efi_printk(sys_table, "Failed to set new 'linux,initrd-start'\n"); > >> + goto fail_free_new_fdt; > >> + } > >> + initrd_image_end = cpu_to_fdt64(initrd_addr + initrd_size); > >> + status = fdt_setprop(fdt, node, "linux,initrd-end", > >> + &initrd_image_end, sizeof(u64)); > >> + if (status) { > >> + efi_printk(sys_table, "Failed to set new 'linux,initrd-end'\n"); > >> + goto fail_free_new_fdt; > >> + } > >> + } > >> + > >> + /* Update memory map in the device tree. The memory node must > >> + * be present in the tree.*/ > >> + node = fdt_subnode_offset(fdt, 0, "memory"); > >> + if (node < 0) { > >> + efi_printk(sys_table, "ERROR: FDT memory node does not exist in DTB.\n"); > >> + goto fail_free_new_fdt; > >> + } > >> + > >> + status = efi_get_memory_map(sys_table, &memory_map, &map_size, > >> + &desc_size, &mmap_key); > >> + if (status != EFI_SUCCESS) > >> + goto fail_free_new_fdt; > >> + > >> + for (i = 0; i < (map_size / sizeof(efi_memory_desc_t)); i++) { > >> + efi_memory_desc_t *desc; > >> + unsigned long m = (unsigned long)memory_map; > >> + desc = (efi_memory_desc_t *)(m + (i * desc_size)); > >> + > >> + if (is_linux_reserved_region(desc->type)) { > >> + status = fdt_add_mem_rsv(fdt, desc->phys_addr, > >> + desc->num_pages * EFI_PAGE_SIZE); > >> + if (status != 0) { > >> + efi_printk(sys_table, "ERROR: Failed to add 'memreserve' to fdt.\n"); > >> + goto fail_free_mmap; > >> + } > >> + } > >> + } > >> + > >> + > >> + /* Add FDT entries for EFI runtime services in chosen node. > >> + * We need to add the final memory map, so this is done at > >> + * the very end. > >> + */ > >> + node = fdt_subnode_offset(fdt, 0, "chosen"); > >> + fdt_val = cpu_to_fdt32((unsigned long)sys_table); > >> + status = fdt_setprop(fdt, node, "efi-system-table", > >> + &fdt_val, sizeof(fdt_val)); > >> + if (status) { > >> + efi_printk(sys_table, "Failed to set new 'efi-system-table'\n"); > >> + goto fail_free_new_fdt; > >> + } > >> + fdt_val = cpu_to_fdt32(desc_size); > >> + status = fdt_setprop(fdt, node, "efi-mmap-desc-size", > >> + &fdt_val, sizeof(fdt_val)); > >> + if (status) { > >> + efi_printk(sys_table, "Failed to set new 'efi-mmap-desc-size'\n"); > >> + goto fail_free_new_fdt; > >> + } > >> + fdt_val = cpu_to_fdt32(map_size); > >> + status = fdt_setprop(fdt, node, "efi-runtime-mmap-size", > >> + &fdt_val, sizeof(fdt_val)); > >> + if (status) { > >> + efi_printk(sys_table, "Failed to set new 'efi-runtime-mmap-size'\n"); > >> + goto fail_free_new_fdt; > >> + } > >> + fdt_val = cpu_to_fdt32((unsigned long)memory_map); > >> + status = fdt_setprop(fdt, node, "efi-runtime-mmap", > >> + &fdt_val, sizeof(fdt_val)); > >> + if (status) { > >> + efi_printk(sys_table, "Failed to set new 'efi-runtime-mmap'\n"); > >> + goto fail_free_new_fdt; > >> + } > > > > We have one function doing two completely different jobs here (as > > documented by the name). Can it be split? > > I had it split, but due to the address/size pairs that needed to be > passed around > to free the allocated memory on error I combined them. I'll take > another look at it. > I think pulling the allocations out of the function may make this > cleaner, and could > also make the removal of the guessed new FTD size easier to remove. > I'll need to handle > re-trying the FTD allocation in order to gracefully handle significant > growth in the DTB. > > > > > >> + > >> + /* Now we need to exit boot services. We need the key from > >> + * the most recent read of the memory map to do this. We can't > >> + * free this buffer in the normal case, but do free it when > >> + * exit_boot_services() fails or adding the memory map to the FDT > >> + * fails. > >> + */ > >> + status = efi_call_phys2(sys_table->boottime->exit_boot_services, > >> + handle, mmap_key); > >> + > >> + if (status != EFI_SUCCESS) { > >> + efi_printk(sys_table, "exit boot services failed.\n"); > >> + goto fail_free_mmap; > >> + } > >> + > >> + return new_fdt_addr; > >> + > >> +fail_free_mmap: > >> + efi_call_phys1(sys_table->boottime->free_pool, memory_map); > >> + > >> +fail_free_new_fdt: > >> + efi_free(sys_table, new_fdt_size, new_fdt_addr); > >> + > >> +fail: > >> + return 0; > >> +} > >> + > >> + > >> +int efi_entry(void *handle, efi_system_table_t *sys_table, > >> + unsigned long *zimage_addr) > >> +{ > >> + efi_loaded_image_t *image; > >> + int status; > >> + unsigned long nr_pages; > >> + const struct fdt_region *region; > >> + > >> + void *fdt; > >> + int err; > >> + int node; > >> + unsigned long zimage_size = 0; > >> + unsigned long dram_base; > >> + /* addr/point and size pairs for memory management*/ > >> + u64 initrd_addr; > >> + u64 initrd_size = 0; > >> + u64 fdt_addr; /* Original DTB */ > >> + u64 fdt_size = 0; > >> + u64 kernel_reserve_addr; > >> + u64 kernel_reserve_size = 0; > >> + char *cmdline_ptr; > >> + unsigned long cmdline_size = 0; > >> + unsigned long new_fdt_addr; > >> + > >> + efi_guid_t proto = LOADED_IMAGE_PROTOCOL_GUID; > >> + > >> + /* Check if we were booted by the EFI firmware */ > >> + if (sys_table->hdr.signature != EFI_SYSTEM_TABLE_SIGNATURE) > >> + goto fail; > >> + > >> + efi_printk(sys_table, "Booting Linux using EFI stub.\n"); > >> + > >> + > >> + /* get the command line from EFI, using the LOADED_IMAGE protocol */ > >> + status = efi_call_phys3(sys_table->boottime->handle_protocol, > >> + handle, &proto, (void *)&image); > >> + if (status != EFI_SUCCESS) { > >> + efi_printk(sys_table, "Failed to get handle for LOADED_IMAGE_PROTOCOL\n"); > >> + goto fail; > >> + } > >> + > >> + /* We are going to copy this into device tree, so we don't care where in > >> + * memory it is. > >> + */ > >> + cmdline_ptr = convert_cmdline_to_ascii(sys_table, image, > >> + &cmdline_size, 0xFFFFFFFF); > >> + if (!cmdline_ptr) { > >> + efi_printk(sys_table, "ERROR converting command line to ascii.\n"); > >> + goto fail; > >> + } > >> + > >> + /* We first load the device tree, as we need to get the base address of > >> + * DRAM from the device tree. The zImage, device tree, and initrd > >> + * have address restrictions that are relative to the base of DRAM. > >> + */ > >> + status = handle_cmdline_files(sys_table, image, cmdline_ptr, "dtb=", > >> + 0xffffffff, &fdt_addr, &fdt_size); > >> + if (status != EFI_SUCCESS) { > >> + efi_printk(sys_table, "Error loading dtb blob\n"); > >> + goto fail_free_cmdline; > >> + } > >> + > >> + err = fdt_check_header((void *)(unsigned long)fdt_addr); > >> + if (err != 0) { > >> + efi_printk(sys_table, "ERROR: Device Tree header not valid\n"); > >> + goto fail_free_dtb; > >> + } > >> + if (fdt_totalsize((void *)(unsigned long)fdt_addr) > fdt_size) { > >> + efi_printk(sys_table, "ERROR: Incomplete device tree.\n"); > >> + goto fail_free_dtb; > >> + > >> + } > >> + > >> + > >> + /* Look up the base of DRAM from the device tree.*/ > >> + fdt = (void *)(u32)fdt_addr; > >> + node = fdt_subnode_offset(fdt, 0, "memory"); > >> + region = fdt_getprop(fdt, node, "reg", NULL); > >> + if (region) { > >> + dram_base = fdt64_to_cpu(region->base); > >> + } else { > >> + efi_printk(sys_table, "Error: no 'memory' node in device tree.\n"); > >> + goto fail_free_dtb; > >> + } > >> + > >> + /* Reserve memory for the uncompressed kernel image. */ > >> + kernel_reserve_addr = dram_base; > >> + kernel_reserve_size = MAX_UNCOMP_KERNEL_SIZE; > >> + nr_pages = round_up(kernel_reserve_size, EFI_PAGE_SIZE) / EFI_PAGE_SIZE; > >> + status = efi_call_phys4(sys_table->boottime->allocate_pages, > >> + EFI_ALLOCATE_ADDRESS, EFI_LOADER_DATA, > >> + nr_pages, &kernel_reserve_addr); > >> + if (status != EFI_SUCCESS) { > >> + efi_printk(sys_table, "ERROR allocating memory for uncompressed kernel.\n"); > >> + goto fail_free_dtb; > >> + } > >> + > >> + /* Relocate the zImage, if required. */ > >> + zimage_size = image->image_size; > >> + status = relocate_kernel(sys_table, zimage_addr, &zimage_size, > >> + dram_base + MIN_ZIMAGE_OFFSET, > >> + dram_base + MAX_ZIMAGE_OFFSET); > >> + if (status != EFI_SUCCESS) { > >> + efi_printk(sys_table, "Failed to relocate kernel\n"); > >> + goto fail_free_kernel_reserve; > >> + } > >> + > >> + status = handle_cmdline_files(sys_table, image, cmdline_ptr, "initrd=", > >> + dram_base + MAX_ZIMAGE_OFFSET, > >> + &initrd_addr, &initrd_size); > >> + if (status != EFI_SUCCESS) { > >> + efi_printk(sys_table, "Error loading initrd\n"); > >> + goto fail_free_zimage; > >> + } > >> + > >> + new_fdt_addr = update_fdt_and_exit_boot(sys_table, handle, > >> + dram_base, fdt, &fdt_size, > >> + cmdline_ptr, &cmdline_size, > >> + initrd_addr, initrd_size); > >> + > >> + if (new_fdt_addr == 0) { > >> + efi_printk(sys_table, "Error updating device tree and exiting boot services.\n"); > >> + goto fail_free_initrd; > >> + } > > > > Ideally, we shouldn't have one error message for two completely > > different causes. > > > > The printk could move into update_fdt_and_exit_boot() and split > > into more specific cases. > > > >> + > >> + > >> + /* Now we need to return the FDT address to the calling > >> + * assembly to this can be used as part of normal boot. > >> + */ > >> + return new_fdt_addr; > >> + > >> +fail_free_initrd: > >> + efi_free(sys_table, initrd_size, initrd_addr); > >> + > >> +fail_free_zimage: > >> + efi_free(sys_table, zimage_size, *zimage_addr); > >> + > >> +fail_free_kernel_reserve: > >> + efi_free(sys_table, kernel_reserve_addr, kernel_reserve_size); > >> + > >> +fail_free_dtb: > >> + efi_free(sys_table, fdt_size, fdt_addr); > >> + > >> +fail_free_cmdline: > >> + efi_free(sys_table, cmdline_size, (u32)cmdline_ptr); > >> + > >> +fail: > >> + return EFI_STUB_ERROR; > >> +} > >> diff --git a/arch/arm/boot/compressed/head.S b/arch/arm/boot/compressed/head.S > >> index 75189f1..491e752 100644 > >> --- a/arch/arm/boot/compressed/head.S > >> +++ b/arch/arm/boot/compressed/head.S > >> @@ -120,21 +120,100 @@ > >> */ > >> .align > >> .arm @ Always enter in ARM state > >> + .text > >> start: > >> .type start,#function > >> - .rept 7 > >> +#ifdef CONFIG_EFI_STUB > >> + @ Magic MSDOS signature for PE/COFF + ADD opcode > >> + .word 0x62805a4d > > > > Did you get a chance to respond to the endianness issue I raised? > For now the EFI stub only supports LE, and I need to update > the Kconfig to reflect this. Adding BE should be possible, but I don't > plan to work on that at this time. OK, so long as that is made explicit in Kconfig, that sounds reasonable. > > > > >> +#else > >> + mov r0, r0 > >> +#endif > >> + .rept 5 > >> mov r0, r0 > >> .endr > >> - ARM( mov r0, r0 ) > >> - ARM( b 1f ) > >> - THUMB( adr r12, BSYM(1f) ) > >> - THUMB( bx r12 ) > >> + > >> + @ zimage_continue will be in ARM or thumb mode as configured > >> + THUMB( adrl r12, BSYM(zimage_continue)) > >> + ARM( adrl r12, zimage_continue) > >> + bx r12 > > > > Note that BSYM() can be used both in ARM and Thumb kernels. > > > > In any case, ARM kernels cannot contain BX instructions because we still > > support ARMv4 (which doesn't have it). > > > > I'm presuming you found zimage_continue is too far away for adr here, > > which is why you changed it. Assuming that't the case, this might make > > sense: > > > > adrl r12, BSYM(zimage_continue) > > ARM( mov pc, r12 ) > > THUMB( bx r12 ) > > Yes, I changed this due to lack of range. > > > > >> + THUMB( .thumb ) > > > > For tidiness, it's better to avoid this dangling .thumb ... move it > > to just before zimage_continue instead, since efi_stub_entry has to be > > ARM anyway. > > OK > > > >> > >> .word 0x016f2818 @ Magic numbers to help the loader > >> .word start @ absolute load/run zImage address > >> .word _edata @ zImage end address > >> + > >> +#ifdef CONFIG_EFI_STUB > >> + @ Portions of the MSDOS file header must be at offset > >> + @ 0x3c from the start of the file. All PE/COFF headers > >> + @ are kept contiguous for simplicity. > >> +#include "efi-header.S" > >> + > >> +efi_stub_entry: > >> + @ The EFI stub entry point is not at a fixed address, however > >> + @ this address must be set in the PE/COFF header. > >> + @ EFI entry point is in A32 mode, switch to T32 if configured. > >> + THUMB( .arm ) > > > > ^So, you can lose .arm here too (but keep the comment -- that's valuable > > info) > > > >> + THUMB( adr r12, BSYM(1f) ) > >> + THUMB( bx r12 ) > >> THUMB( .thumb ) > >> 1: > >> + @ Save lr on stack for possible return to EFI firmware. > >> + @ Don't care about fp, but need 64 bit alignment.... > >> + stmfd sp!, {fp, lr} > >> + > >> + @ Save args to EFI app across got fixup call > >> + stmfd sp!, {r0, r1} > > > > Mostly minor coding nits follow... > > I'll go through these and update the code. I appreciate your review, > as I am new to ARM assembly. No problem -- it's already not far off. I think my comments were all tidiness rather than correctness issues. Cheers ---Dave > > > > > > > stmfd sp!, {r0, r1, fp, lr} ? > > > >> + ldmfd sp!, {r0, r1} > >> + > >> + @ allocate space on stack for return of new entry point of > >> + @ zImage, as EFI stub may copy the kernel. Pass address > >> + @ of space in r2 - EFI stub will fill in the pointer. > >> + > >> + sub sp, #8 @ we only need 4 bytes, > > > > I presume EFI guarantees a valid stack with 8-byte-aligned sp on entry? > > > > kernel asm is written in the traditional syntax, which means explicit > > source and destination registers for instructions like this: > > > > sub sp, sp, #8 > > > > Since the EFI stub code will only be built with new toolchains it > > probably doesn't matter, but it's best to be consistent for readability > > purposes. > > > >> + @ but keep stack 8 byte aligned. > >> + mov r2, sp > >> + @ Pass our actual runtime start address in pointer data > >> + adr r11, LC0 @ address of LC0 at run time > >> + ldr r12, [r11, #0] @ address of LC0 at link time > >> + > >> + sub r3, r11, r12 @ calculate the delta offset > >> + str r3, [r2, #0] > >> + bl efi_entry > >> + > >> + @ get new zImage entry address from stack, put into r3 > >> + ldr r3, [sp, #0] > >> + add sp, #8 @ restore stack > > > > add sp, sp, #8 > > > >> + > >> + @ Check for error return from EFI stub (0xFFFFFFFF) > >> + ldr r1, =0xffffffff > > > > Minor nit, but ldr= is wasteful for this. > > > > You could use mvn r1, #0 (or mov r1, #0xffffffff -- the assembler is > > smart enough to translate this)... > > > >> + cmp r0, r1 > > > > ...alternatively, don't use r1 at all and do: > > > > cmn r0, #1 > > > >> + beq efi_load_fail > >> + > >> + > >> + @ Save return values of efi_entry > >> + stmfd sp!, {r0, r3} > >> + bl cache_clean_flush > >> + bl cache_off > > > > Why turn the cache off? Does that mean that EFI may launch images with > > the cache enabled? > > > > If so, are we guaranteed that VA=PA? Otherwise simply turning the MMU > > off is not safe. > > > > (Hmm, the UEFI spec seems to suggest "yes" for these questions) > > > >> + ldmfd sp!, {r0, r3} > >> + > >> + @ put DTB address in r2, it was returned by EFI entry > >> + mov r2, r0 > >> + ldr r1, =0xffffffff @ DTB machine type > > > > mov/mvn: see above > > > >> + mov r0, #0 @ r0 is 0 > > > > Useless comment: maybe say why you're doing this ("r0 is 0, as required > > by the kernel boot protocol", or something like that). > > > >> + > >> + @ Branch to (possibly) relocated zImage entry that is in r3 > >> + bx r3 > >> + > >> +efi_load_fail: > >> + @ Return EFI_LOAD_ERROR to EFI firmware on error. > >> + @ Switch back to ARM mode for EFI is done based on > >> + @ return address on stack > >> + ldr r0, =0x80000001 > >> + ldmfd sp!, {fp, pc} > >> +#endif > >> + > >> +zimage_continue: > >> mrs r9, cpsr > >> #ifdef CONFIG_ARM_VIRT_EXT > >> bl __hyp_stub_install @ get into SVC mode, reversibly > >> @@ -167,7 +246,6 @@ not_angel: > >> * by the linker here, but it should preserve r7, r8, and r9. > >> */ > >> > >> - .text > >> > >> #ifdef CONFIG_AUTO_ZRELADDR > >> @ determine final kernel image address > >> -- > >> 1.7.10.4 > >> > >> > >> _______________________________________________ > >> linux-arm-kernel mailing list > >> linux-arm-kernel@xxxxxxxxxxxxxxxxxxx > >> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel > -- To unsubscribe from this list: send the line "unsubscribe linux-efi" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html