On Sat, 7 Dec 2024 at 11:46, Xi Ruoyao <xry111@xxxxxxxxxxx> wrote:
>
> On Sat, 2024-12-07 at 10:32 +0100, Greg Kroah-Hartman wrote:
> > On Sat, Dec 07, 2024 at 05:21:00PM +0800, Huacai Chen wrote:
> > > Hi, Greg,
> > >
> > > On Fri, Dec 6, 2024 at 9:04 PM Greg Kroah-Hartman
> > > <gregkh@xxxxxxxxxxxxxxxxxxx> wrote:
> > > >
> > > > On Fri, Dec 06, 2024 at 04:58:07PM +0800, Huacai Chen wrote:
> > > > > Backport this series to 6.1&6.6 because LoongArch gets build errors with
> > > > > latest binutils which has commit 599df6e2db17d1c4 ("ld, LoongArch: print
> > > > > error about linking without -fPIC or -fPIE flag in more detail").
> > > > >
> > > > >   CC      .vmlinux.export.o
> > > > >   UPD     include/generated/utsversion.h
> > > > >   CC      init/version-timestamp.o
> > > > >   LD      .tmp_vmlinux.kallsyms1
> > > > > loongarch64-unknown-linux-gnu-ld: kernel/kallsyms.o:(.text+0): relocation R_LARCH_PCALA_HI20 against `kallsyms_markers` can not be used when making a PIE object; recompile with -fPIE
> > > > > loongarch64-unknown-linux-gnu-ld: kernel/crash_core.o:(.init.text+0x984): relocation R_LARCH_PCALA_HI20 against `kallsyms_names` can not be used when making a PIE object; recompile with -fPIE
> > > > > loongarch64-unknown-linux-gnu-ld: kernel/bpf/btf.o:(.text+0xcc7c): relocation R_LARCH_PCALA_HI20 against `__start_BTF` can not be used when making a PIE object; recompile with -fPIE
> > > > > loongarch64-unknown-linux-gnu-ld: BFD (GNU Binutils) 2.43.50.20241126 assertion fail ../../bfd/elfnn-loongarch.c:2673
> > > > >
> > > > > In theory 5.10&5.15 also need this, but since LoongArch get upstream at
> > > > > 5.19, so I just ignore them because there is no error report about other
> > > > > archs now.
> > > >
> > > > Odd, why doesn't this affect other arches as well using new binutils? I
> > > > hate to have to backport all of this just for one arch, as that feels
> > > > odd.
> > > The related binutils commit is only for LoongArch, so build errors
> > > only occured on LoongArch.
> > > I don't know why other archs have no
> > > problem exactly, but may be related to their CFLAGS (for example, if
> > > we disable CONFIG_RELOCATABLE, LoongArch also has no build errors
> > > because CFLAGS changes).
> >
> > does LoongArch depend on that option?
>
> "That option" is -mdirect-extern-access. Without it we'll use GOT in
> the kernel image to address anything out of the current TU, bloating the
> kernel size and making it slower.
>

An alternative to this might be to add -include
$(srctree)/include/linux/hidden.h to KBUILD_CFLAGS_KERNEL, so that the
compiler understands that all external references are resolved at link
time, not at load/run time.

> The problem is the linker failed to handle a direct access to undefined
> weak symbol on LoongArch.

...

> With Binutils trunk, an error is emitted instead of silently producing
> buggy executable. Still I don't think emitting an error is correct when
> linking a static PIE (our vmlinux is a static PIE). Instead the linker
> should just rewrite
>
>     pcalau12i rd, %pc_hi20(undef_weak)
>
> to
>
>     move rd, $zero
>

Is that transformation even possible at link time? Isn't pc_hi20 part
of a pair?

> Also the "recompile with -fPIE" suggestion in the error message is
> completely misleading. We are *already* compiling relocatable kernel
> with -fPIE.
>

And this is the most important difference between LoongArch and the
other arches - LoongArch already uses PIC code explicitly. Other
architectures use ordinary position dependent codegen and linking, or,
in the case of arm64, use position dependent codegen and PIE linking,
where the fact that this is even possible is a happy accident.

...

> > What happens if it is enabled for other arches? Why doesn't it break
> > them?
>
> The other arches have copy relocation, so their -mdirect-extern-access
> is intended to work with dynamically linked executable, thus it's the
> default and not as strong as ours. On them -mdirect-extern-access still
> uses GOT to address weak symbols.
>
> We don't have copy relocation, thus our default is
> -mno-direct-extern-access, and -mdirect-extern-access is only intended
> for static executables (including OS kernel, embedded firmware, etc).
> So it's designed to be stronger, unfortunately the toolchain failed to
> implement it correctly.
>

This has nothing to do with copy relocations - those are only relevant
when shared libraries come into play. Other architectures don't break
because they either

a) use position dependent codegen with absolute addressing, and simply
   resolve undefined weak references as 0x0, or
b) use GOT indirection, where the reference is a GOT load and the
   address in the GOT is set to 0x0.

So the issue here appears to be that the compiler fails to emit a GOT
entry for this reference, even though it is performing PIC codegen.
This is probably due to -mdirect-extern-access being taken into account
too strictly. The upshot is that a relative reference is emitted to an
undefined symbol, and it is impossible for a relative reference to
[reliably] yield NULL, and so the reference produces a bogus non-NULL
address.

As these patches deal with symbols that are only undefined in the
preliminary first linker pass, and are guaranteed to exist afterwards,
silently emitting a bogus relative reference was not a problem in these
cases. Obviously, throwing an error is.

The patches should be rather harmless in practice, but I know Masahiro
did not like the approach for the kallsyms markers, and made some
subsequent modifications to it. Given that this is relatively new
toolchain behavior, I'd suggest fixing the compiler to emit weak
external references via GOT entries even when -mdirect-extern-access is
in effect.
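
To illustrate why a relative reference cannot reliably yield NULL while
a GOT load can, here is a toy model in plain C. It is only a sketch
under stated assumptions: the function names are made up for this
example, and it models a PC-relative reference as "pc + constant"
without reproducing how LoongArch splits the address across a
pcalau12i pair or how the linker really processes R_LARCH_PCALA_HI20.

```c
#include <stdint.h>

/*
 * Toy model only: every name here is hypothetical and nothing below is
 * real LoongArch or binutils behavior. It just shows the arithmetic.
 */

/*
 * PC-relative reference: the linker bakes a constant displacement into
 * the instruction, and the address is formed at run time as
 * pc + displacement (modulo 2^64).
 */
uint64_t pcrel_address(uint64_t pc, uint64_t displacement)
{
	return pc + displacement;
}

/*
 * Displacement a linker would have to pick for the reference to
 * evaluate to 0 when the code runs at its link-time address.
 */
uint64_t displacement_for_null(uint64_t link_time_pc)
{
	return (uint64_t)0 - link_time_pc;	/* wraps modulo 2^64 */
}

/*
 * GOT-based reference: the code loads whatever the GOT slot contains,
 * so an undefined weak symbol is simply a slot that holds 0.
 */
uint64_t got_address(const uint64_t *got_slot)
{
	return *got_slot;
}
```

Taking a made-up link-time PC of 0x9000000000200000,
pcrel_address(pc, displacement_for_null(pc)) is 0 only as long as the
code runs at that address. Run the same code 0x1000000 higher, as a
relocatable kernel may, and the unchanged displacement yields
0x1000000, i.e. a bogus non-NULL address. got_address() on a slot
holding 0 returns 0 wherever the code runs.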