On Fri, Jul 14, 2023, at 9:51 AM, Greg Ungerer wrote: > On 14/7/23 00:26, Stefan O'Rear wrote: >> On Thu, Jul 13, 2023, at 9:17 AM, Greg Ungerer wrote: >>> On 13/7/23 01:12, Stefan O'Rear wrote: >>>> On Tue, Jul 11, 2023, at 9:07 AM, Greg Ungerer wrote: >>>>> Add support for enabling and using the binfmt_elf_fdpic program loader >>>>> on RISC-V platforms. The most important change is to setup registers >>>>> during program load to pass the mapping addresses to the new process. >>>>> >>>>> One of the interesting features of the elf-fdpic loader is that it >>>>> also allows appropriately compiled ELF format binaries to be loaded on >>>>> nommu systems. Appropriate being those compiled with -pie. >>>>> >>>>> Signed-off-by: Greg Ungerer <gerg@xxxxxxxxxx> >>>>> --- >>>>> v1->v2: rebase onto linux-6.5-rc1 >>>>> increment PTRACE_GETFDPIC value to keep it unique >>>>> >>>>> arch/riscv/include/asm/elf.h | 11 ++++++++++- >>>>> arch/riscv/include/asm/mmu.h | 4 ++++ >>>>> arch/riscv/include/uapi/asm/ptrace.h | 5 +++++ >>>>> fs/Kconfig.binfmt | 2 +- >>>>> 4 files changed, 20 insertions(+), 2 deletions(-) >>>>> >>>>> diff --git a/arch/riscv/include/asm/elf.h b/arch/riscv/include/asm/elf.h >>>>> index c24280774caf..c33fe923ef6d 100644 >>>>> --- a/arch/riscv/include/asm/elf.h >>>>> +++ b/arch/riscv/include/asm/elf.h >>>>> @@ -41,6 +41,7 @@ extern bool compat_elf_check_arch(Elf32_Ehdr *hdr); >>>>> #define compat_elf_check_arch compat_elf_check_arch >>>>> >>>>> #define CORE_DUMP_USE_REGSET >>>>> +#define ELF_FDPIC_CORE_EFLAGS 0 >>>>> #define ELF_EXEC_PAGESIZE (PAGE_SIZE) >>>>> >>>>> /* >>>>> @@ -69,6 +70,13 @@ extern bool compat_elf_check_arch(Elf32_Ehdr *hdr); >>>>> #define ELF_HWCAP riscv_get_elf_hwcap() >>>>> extern unsigned long elf_hwcap; >>>>> >>>>> +#define ELF_FDPIC_PLAT_INIT(_r, _exec_map_addr, _interp_map_addr, >>>>> dynamic_addr) \ >>>>> + do { \ >>>>> + (_r)->a1 = _exec_map_addr; \ >>>>> + (_r)->a2 = _interp_map_addr; \ >>>>> + (_r)->a3 = dynamic_addr; \ >>>>> + } while (0) >>>>> + >>>> >>>> This should probably be left empty for now; it will be defined by the >>>> ELF FDPIC ABI when that is done, and shouldn't be used by normal ELF >>>> binaries. >>> >>> True, not used by the ELF binaries themselves. But used by an ELF >>> interpreter to do the runtime relocations. >> >> By "normal ELF binaries", I mean binaries (executables and shared libraries >> with nonzero e_entry, including statically linked binaries and the dynamic >> linker itself) that conform to the psABI as it currently exists in >> riscv-elf-psabi-doc. >> >> These binaries don't use load maps because they aren't defined in >> riscv-elf-psabi-doc yet. At some point in the future, riscv-elf-psabi-doc >> may define a FDPIC EI_OSABI value and rules for its usage; "FDPIC ELF >> binaries" will not be "normal" and you will not be able to load them with >> binfmt_elf or binfmt_elf_compat. >> >> "FDPIC elf binaries" which contain their own self-relocation code, including >> the dynamic linker and statically linked binaries, will use the map and >> dynamic addresses. > > Well, yes, none of that is specific to RISC-V though. Those same rules apply to > FDPIC binaries on any architecture that supports it. FDPIC binaries can only > be loaded by binfmt_elf_fdpic. > > The point I was trying to make is that with a normal ELF binary (with obvious > limitations in the noMMU case - needs to be PIE, and not using shared libraries, > etc) if those mappings are present a specially crafted ELF interpreter can > use them to carry out the relocations and start that ELF binary. That ELF > binary is not special and can be loaded and run by the the standard binfmt_elf > ELF loader on an a fully MMU system (of course using the usual ELF interpreter > on that system not the different noMMU one). I agree with everything in these two paragraphs. The point I'm trying to make is that the "specially crafted ELF interpreter" doesn't need to use the load maps - all of the information in exec_loadmap and dynamic can be obtained by scanning AT_PHDR and adding the an offset to all of them such that the address of PT_PHDR is the value of AT_PHDR; all of the information in interp_loadmap can be obtained using AT_BASE. I believe that musl ldso is already designed to operate in this way and it already has fallback code for read instead of failing mmaps, if you add #define DL_NOMMU_SUPPORT 1 to arch/riscv64/reloc.h, although I haven't tested it. I do not understand the glibc or uclibc interpreters well enough to make statements about them. >> elf_check_fdpic will return 1 for binaries that use the FDPIC EI_OSABI and >> use the load maps as part of either their own or their interpreter's >> startup code. For binaries where elf_check_fdpic returns 0, the contents >> of ELF_FDPIC_PLAT_INIT is (harmless) dead stores. > > I don't follow. On other architectures that binfmt_elf-fdpic supoprts (so ARM, > SH, M68K currently; historically some other removed architectures too) the > contents of ELF_FDPIC_PLAT_INIT are always run if defined. Even in the case of > loading a standard ELF binary. Those mappings are loaded into some registers > and those appear in the running process on startup. Dead in the sense that there is no way for a constdisp ELF executable, which must run correctly with binfmt_elf, to make use of them - if I have understood ELF_PLAT_INIT and start_thread correctly, RISC-V executables and interpreters currently start with garbage from the pre-execve state in all integer registers other than sp, so there is no way to distinguish "valid load maps" from "nonzero values from previous image". -s >> We don't have an implementation of elf_check_fdpic yet, which means that >> we cannot load binaries that actually use the load maps and the >> ELF_FDPIC_PLAT_INIT is dead code. Dead code which is unlikely to conform >> to the future ABI, since the RISC-V psABI Task Group has yet to choose >> which registers to use to pass the load maps. Since the code is dead and >> likely wrong, it should be removed so that correct code can be added at >> the correct time. > > I understand your reluctance here. > > Regards > Greg > > >> -s >> >>>> I'd ask if there's a reason it starts at a1 instead of a0, >>>> but it seems idiosyncratic on all arches that have full FDPIC support. >>> >>> This comment in the crt1.S code of uClibc made me think that a0 already had >>> a pre-defined use in the ABI: >>> >>> /* The entry point's job is to call __uClibc_main. Per the ABI, >>> a0 contains the address of a function to be passed to atexit. >>> >>> But I didn't dig any further than that. >>> >>> Regards >>> Greg >>> >>> >>>> -s >>>> >>>>> /* >>>>> * This yields a string that ld.so will use to load implementation >>>>> * specific libraries for optimization. This is more specific in >>>>> @@ -78,7 +86,6 @@ extern unsigned long elf_hwcap; >>>>> >>>>> #define COMPAT_ELF_PLATFORM (NULL) >>>>> >>>>> -#ifdef CONFIG_MMU >>>>> #define ARCH_DLINFO \ >>>>> do { \ >>>>> /* \ >>>>> @@ -115,6 +122,8 @@ do { \ >>>>> else \ >>>>> NEW_AUX_ENT(AT_IGNORE, 0); \ >>>>> } while (0) >>>>> + >>>>> +#ifdef CONFIG_MMU >>>>> #define ARCH_HAS_SETUP_ADDITIONAL_PAGES >>>>> struct linux_binprm; >>>>> extern int arch_setup_additional_pages(struct linux_binprm *bprm, >>>>> diff --git a/arch/riscv/include/asm/mmu.h b/arch/riscv/include/asm/mmu.h >>>>> index 0099dc116168..355504b37f8e 100644 >>>>> --- a/arch/riscv/include/asm/mmu.h >>>>> +++ b/arch/riscv/include/asm/mmu.h >>>>> @@ -20,6 +20,10 @@ typedef struct { >>>>> /* A local icache flush is needed before user execution can resume. */ >>>>> cpumask_t icache_stale_mask; >>>>> #endif >>>>> +#ifdef CONFIG_BINFMT_ELF_FDPIC >>>>> + unsigned long exec_fdpic_loadmap; >>>>> + unsigned long interp_fdpic_loadmap; >>>>> +#endif >>>>> } mm_context_t; >>>>> >>>>> void __init create_pgd_mapping(pgd_t *pgdp, uintptr_t va, phys_addr_t >>>>> pa, >>>>> diff --git a/arch/riscv/include/uapi/asm/ptrace.h >>>>> b/arch/riscv/include/uapi/asm/ptrace.h >>>>> index e17c550986a6..30f6d6537adc 100644 >>>>> --- a/arch/riscv/include/uapi/asm/ptrace.h >>>>> +++ b/arch/riscv/include/uapi/asm/ptrace.h >>>>> @@ -10,6 +10,11 @@ >>>>> >>>>> #include <linux/types.h> >>>>> >>>>> +#define PTRACE_GETFDPIC 33 >>>>> + >>>>> +#define PTRACE_GETFDPIC_EXEC 0 >>>>> +#define PTRACE_GETFDPIC_INTERP 1 >>>>> + >>>>> /* >>>>> * User-mode register state for core dumps, ptrace, sigcontext >>>>> * >>>>> diff --git a/fs/Kconfig.binfmt b/fs/Kconfig.binfmt >>>>> index 93539aac0e5b..f5693164ca9a 100644 >>>>> --- a/fs/Kconfig.binfmt >>>>> +++ b/fs/Kconfig.binfmt >>>>> @@ -58,7 +58,7 @@ config ARCH_USE_GNU_PROPERTY >>>>> config BINFMT_ELF_FDPIC >>>>> bool "Kernel support for FDPIC ELF binaries" >>>>> default y if !BINFMT_ELF >>>>> - depends on ARM || ((M68K || SUPERH || XTENSA) && !MMU) >>>>> + depends on ARM || ((M68K || RISCV || SUPERH || XTENSA) && !MMU) >>>>> select ELFCORE >>>>> help >>>>> ELF FDPIC binaries are based on ELF, but allow the individual load >>>>> -- >>>>> 2.25.1 >>>>> >>>>> >>>>> _______________________________________________ >>>>> linux-riscv mailing list >>>>> linux-riscv@xxxxxxxxxxxxxxxxxxx >>>>> http://lists.infradead.org/mailman/listinfo/linux-riscv >>> >>> _______________________________________________ >>> linux-riscv mailing list >>> linux-riscv@xxxxxxxxxxxxxxxxxxx >>> http://lists.infradead.org/mailman/listinfo/linux-riscv