On 6/19/24 13:01, Sumanth Korikkar wrote: > On Tue, Jun 18, 2024 at 04:37:16PM -0400, Joe Lawrence wrote: >> On Mon, Feb 19, 2024 at 02:27:30PM +0100, Sumanth Korikkar wrote: >>> Hi All, >>> >>> This is a rebased version of Josh's patch series with a few fixups. >>> https://git.kernel.org/pub/scm/linux/kernel/git/jpoimboe/linux.git/log/?h=s390 >>> >>> This introduces the capability to compile the s390 relocatable kernel >>> with and without the -fPIE option. >>> >>> When utilizing the kpatch functionality, it is advisable to compile the >>> kernel without the -fPIE option. This is particularly important if the >>> kernel is built with the -ffunction-sections and -fdata-sections flags. >>> The linker imposes a restriction on the number of sections (limited to >>> 64k), necessitating the omission of -fPIE. >>> >>> [1] https://gcc.gnu.org/pipermail/gcc-patches/2023-June/622872.html >>> [2] https://gcc.gnu.org/pipermail/gcc-patches/2023-August/625986.html >>> >>> Gcc recently implemented an optimization [1] for loading symbols without >>> explicit alignment, aligning with the IBM Z ELF ABI. This ABI mandates >>> symbols to reside on a 2-byte boundary, enabling the use of the larl >>> instruction. However, kernel linker scripts may still generate unaligned >>> symbols. To address this, a new -munaligned-symbols option has been >>> introduced [2] in recent gcc versions. This option has to be used with >>> future gcc versions. >>> >>> Older Clang lacks support for handling unaligned symbols generated >>> by kernel linker scripts when the kernel is built without -fPIE. However, >>> future versions of Clang will include support for the -munaligned-symbols >>> option. When the support is unavailable, compile the kernel with -fPIE >>> to maintain the existing behavior. >>> >>> Patch 1 filters out -munaligned-symbol flag for vdso code. This is beneficial >>> when compiling kernel with -fno-PIE and -munaligned-symbols combination. >>> >>> Patch 2 introduces the 'relocs' tool, which reads the vmlinux file and >>> generates a vmlinux.relocs_64 section, containing offsets for all >>> R_390_64 relocations. >>> >>> Patch 3 enables the compilation of a relocatable kernel with or without >>> the -fPIE option. It allows for building the relocatable kernel without >>> -fPIE. However, if compiler cannot handle unaligned symbols, the kernel >>> is built with -fPIE. >>> >>> Patch 4 handles orphan .rela sections when kernel is built with >>> -fno-PIE. >>> >>> kpatch tools changes: >>> * -mno-pic-data-is-text-relative prevents relative addressing between >>> code and data. This is needed to avoid relocation error when klp text >>> and data are too far apart. kpatch already includes this flag. >>> However, with these changes, ARCH_KFLAGS+="-fPIC" should be added to >>> s390 kpatch tools, As -mno-pic-data-is-text-relative can be used only >>> with -fPIC. The corresponding pull request will be sent to kpatch >>> tools. >> >> Hi Sumanth, >> >> I noticed interesting compiler differences when adding -fPIC build >> option and not. The difference in resulting output can confuse >> kpatch-build when it tries to verify that its reference build (with the >> mentioned options, plus --ffunction-sections and -fdata-sections), >> doesn't line up closely enough with the original vmlinux source (sans >> all these options). > > Hi Joe, > > kpatch for s390 already uses extra compiler flag -mno-pic-data-is-text-relative > inorder to prevent relative addressing between code and data. Also, > includes -ffunction-sections and -fdata-sections along with it to identify > modified functions and its relocations. > > Both the source code and modified code are built with the same > options during kpatch-build (-fPIC added to > -mno-pic-data-is-text-relative). kpatch-build was able to identify > modified functions and its associated relocations and include these > changes in the final kpatch module. > > May be I am missing some info: Does this deviation cause confusion to kpatch? > Hi Sumanth, Yes, in the example I provided, the __mmput() function is only inlined in kpatch builds, but not the builds that create the target vmlinux. Here is a reproducer tarball that you can try against a local create-diff-object binary: https://file.rdu.redhat.com/~jolawren/repro-s390x-shadow-newpid.tar.gz create-diff-object: ERROR: fork.ORIG.o: find_local_syms: 222: couldn't find matching fork.c local symbols in vmlinux symbol table. > Adding -fPIC flag would end up with extra indirections via GOT for > eg: when accessing global vars. But, this is similar in case of previous > kpatch build as well (-fPIE + -mno-pic-data-is-text-relative). > > Rela comparision which I performed on .rela.text.mmput: > 1. with -fno-PIE kernel: > .rela.text.mmput: > 0000000000000022 000001a800000014 R_390_PLT32DBL 0000000000000000 __cond_resched + 2 > 0000000000000040 0000017800000013 R_390_PC32DBL 0000000000000000 __s390_indirect_jump_r14 + 2 > 000000000000004a 000001a000000014 R_390_PLT32DBL 0000000000000000 uprobe_clear_state + 2 > 0000000000000054 000001a100000014 R_390_PLT32DBL 0000000000000000 exit_aio + 2 > 000000000000006c 000001a200000014 R_390_PLT32DBL 0000000000000000 exit_mmap + 2 > 000000000000009a 000001a300000014 R_390_PLT32DBL 0000000000000000 fput + 2 > 00000000000000b8 000001a400000013 R_390_PC32DBL 0000000000000000 mmlist_lock + 2 <<< > 00000000000000cc 000001a500000014 R_390_PLT32DBL 0000000000000000 __list_del_entry_valid_or_report + 2 > 0000000000000100 000001a400000013 R_390_PC32DBL 0000000000000000 mmlist_lock + 4 > 000000000000011e 000001a600000014 R_390_PLT32DBL 0000000000000000 module_put + 2 > 0000000000000140 0000016f00000014 R_390_PLT32DBL 0000000000000000 __mmdrop + 2 > 000000000000014a 000001a700000014 R_390_PLT32DBL 0000000000000000 __ksm_exit + 2 > 0000000000000154 0000019700000014 R_390_PLT32DBL 0000000000000000 arch_spin_lock_wait + 2 > > 2. With -fPIC flag added: > With Relocation section '.rela.text.mmput' at offset 0x84c80 contains 12 entries: > Offset Info Type Symbol's Value Symbol's Name + Addend > 0000000000000022 000001d500000014 R_390_PLT32DBL 0000000000000000 __cond_resched + 2 > 0000000000000040 0000017800000013 R_390_PC32DBL 0000000000000000 __s390_indirect_jump_r14 + 2 > 000000000000004a 000001d600000014 R_390_PLT32DBL 0000000000000000 uprobe_clear_state + 2 > 0000000000000054 000001d700000014 R_390_PLT32DBL 0000000000000000 exit_aio + 2 > 0000000000000070 000001d800000014 R_390_PLT32DBL 0000000000000000 exit_mmap + 2 > 000000000000007e 000001d300000014 R_390_PLT32DBL 0000000000000000 set_mm_exe_file + 2 > 0000000000000090 000001d90000001a R_390_GOTENT 0000000000000000 mmlist_lock + 2 <<< Difference GOTENT > 00000000000000ac 000001da00000014 R_390_PLT32DBL 0000000000000000 __list_del_entry_valid_or_report + 2 > 00000000000000f8 000001db00000014 R_390_PLT32DBL 0000000000000000 module_put + 2 > 000000000000011a 0000016f00000014 R_390_PLT32DBL 0000000000000000 __mmdrop + 2 > 0000000000000124 000001dc00000014 R_390_PLT32DBL 0000000000000000 __ksm_exit + 2 > 0000000000000132 0000019500000014 R_390_PLT32DBL 0000000000000000 arch_spin_lock_wait + 2 > > 3. built with -fPIE commit + -mno-pic-data-is-text-relative (Previous kpatch version): > Relocation section '.rela.text.__mmput' at offset 0x95bc8 contains 13 entries: > Offset Info Type Symbol's Value Symbol's Name + Addend > 000000000000001c 000001c000000014 R_390_PLT32DBL 0000000000000000 uprobe_clear_state + 2 > 0000000000000026 000001c100000014 R_390_PLT32DBL 0000000000000000 exit_aio + 2 > 000000000000004c 000001c200000014 R_390_PLT32DBL 0000000000000000 exit_mmap + 2 > 0000000000000056 000001c300000014 R_390_PLT32DBL 0000000000000000 mm_put_huge_zero_page + 2 > 0000000000000084 000001c400000014 R_390_PLT32DBL 0000000000000000 fput + 2 > 000000000000009a 000001c50000001a R_390_GOTENT 0000000000000000 mmlist_lock + 2 <<<< > 0000000000000106 000001c600000014 R_390_PLT32DBL 0000000000000000 module_put + 2 > 0000000000000124 0000019000000013 R_390_PC32DBL 0000000000000000 __s390_indirect_jump_r14 + 2 > 000000000000012e 000001c700000014 R_390_PLT32DBL 0000000000000000 __khugepaged_exit + 2 > 000000000000013c 000001c800000014 R_390_PLT32DBL 0000000000000000 __ksm_exit + 2 > 000000000000014a 000001c900000014 R_390_PLT32DBL 0000000000000000 __list_del_entry_valid_or_report + 2 > 0000000000000158 000001a200000014 R_390_PLT32DBL 0000000000000000 arch_spin_lock_wait + 2 > 000000000000016c 0000018700000014 R_390_PLT32DBL 0000000000000000 __mmdrop + 2 > > >> >> I don't think a kpatch-build PR was ever opened to add "-fPIC", but the >> compiler now warns that its required. Have you ever seen optimization / >> build output changes when adding this option and if so, were there >> additional kpatch-build implications? > > I missed adding "-fPIC" KCFLAGS in kpatch tool. I will send a PR request. > > I expect and assume similar implications with (-fPIC added) > vs (-fPIE + -mno-pic-data-is-text-relative) build, as shown above in the rela > comparisions. > I don't have much personal mileage with kpatch on s390x, so you may be right that this is not unique to -fPIC. I suppose a workaround could be to force (non)inline in the input kpatch to match the target vmlinux. > Other Note: > The latest kernel is built with -fPIC and linked with -no-pie (reference > commit: ca888b17da9b ("s390: Compile kernel with -fPIC and link with > -no-pie")) which also avoids generation of dynamic symbols and helps > kpatch usecases (when num of sections >=64k sections). Also the build > options would be similar (-fPIC in kernel and -fPIC in kpatch-build) > > For latest kernel, there is no need to add explicit -fPIC again > in kpatch tool. > > But for the intermediate commits, yes, makes sense to add > it in kpatch-build tools and will create one PR. > Interesting! With 00cda11d3b2e ("s390: Compile kernel with -fPIC and link with -no-pie") it sounds like the original vmlinux would be built with -fPIC as well, so the optimization decisions re: __mmput() would likely be the same. I can retry the tests with v6.10-rcX to verify. >> Here is a quick example that I stumbled on while investigating the >> kpatch-build shadow-newpid.patch integration test that modifies >> kernel/fork.c. I couldn't reproduce with the s390 defconfig, but it >> shows up when building with RHEL-10 config. Reproducer steps and >> disassembly examples follows. > > Sorry Joe, I didnt get this question. "I couldnt reproduce with the s390 > defconfig". Could you please clarify it ? > In the email reproducer steps, if I ran `make defconfig` and tried the two builds of fork.o, __mmput() inlining didn't change. Only when I added the RHEL-10 s390x config could I reproduce. I didn't want to spend time hunting down the Red Hat specific CONFIG setting(s) that induced the build differences as I assumed there may be an obvious reason why this was happening. -- Joe