On Wed, Sep 25, 2024 at 10:01 PM Ard Biesheuvel <ardb@xxxxxxxxxx> wrote:
>
> On Wed, 25 Sept 2024 at 21:39, Uros Bizjak <ubizjak@xxxxxxxxx> wrote:
> >
> > On Wed, Sep 25, 2024 at 9:14 PM Ard Biesheuvel <ardb@xxxxxxxxxx> wrote:
> > >
> > > On Wed, 25 Sept 2024 at 20:54, Uros Bizjak <ubizjak@xxxxxxxxx> wrote:
> > > >
> > > > On Wed, Sep 25, 2024 at 5:02 PM Ard Biesheuvel <ardb+git@xxxxxxxxxx> wrote:
> > > > >
> > > > > From: Ard Biesheuvel <ardb@xxxxxxxxxx>
> > > > >
> > > > > Build the kernel as a Position Independent Executable (PIE). This
> > > > > results in more efficient relocation processing for the virtual
> > > > > displacement of the kernel (for KASLR). More importantly, it instructs
> > > > > the linker to generate what is actually needed (a program that can be
> > > > > moved around in memory before execution), which is better than having to
> > > > > rely on the linker to create a position dependent binary that happens to
> > > > > tolerate being moved around after poking it in exactly the right manner.
> > > > >
> > > > > Note that this means that all codegen should be compatible with PIE,
> > > > > including Rust objects, so this needs to switch to the small code model
> > > > > with the PIE relocation model as well.
> > > >
> > > > I think that related to this work is the patch series [1] that
> > > > introduces the changes necessary to build the kernel as Position
> > > > Independent Executable (PIE) on x86_64 [1]. There are some more places
> > > > that need to be adapted for PIE. The patch series also introduces
> > > > objtool functionality to add validation for x86 PIE.
> > > >
> > > > [1] "[PATCH RFC 00/43] x86/pie: Make kernel image's virtual address flexible"
> > > > https://lore.kernel.org/lkml/cover.1682673542.git.houwenlong.hwl@xxxxxxxxxxxx/
> > > >
> > >
> > > Hi Uros,
> > >
> > > I am aware of that discussion, as I took part in it as well.
> > >
> > > I don't think any of those changes are actually needed now - did you
> > > notice anything in particular that is missing?
> >
> > Some time ago I went through the kernel sources and proposed several
> > patches that changed all trivial occurrences of non-RIP addresses to
> > RIP ones. The work was partially based on the mentioned patch series,
> > and I remember, I left some of them out [e.g. 1], because they
> > required a temporary variable.
>
> I have a similar patch in my series, but the DEBUG_ENTRY code just uses
>
>     pushf 1f@GOTPCREL(%rip)
>
> so no temporaries are needed.
>
> > Also, there was discussion about ftrace
> > [2], where no solution was found.
> >
>
> When linking with -z call-nop=suffix-nop, the __fentry__ call via the
> GOT will be relaxed by the linker into a 5 byte call followed by a 1
> byte NOP, so I don't think we need to do anything special here. It
> might mean we currently lose -mnop-mcount until we find a solution for
> that in the compiler. In case you remember, I contributed and you
> merged a GCC patch that makes the __fentry__ emission logic honour
> -fdirect-access-external-data which should help here. This landed in
> GCC 14.
>
> > Looking through your series, I didn't find some of the non-RIP -> RIP
> > changes proposed by the original series (especially the ftrace part),
> > and noticed that there is no objtool validator proposed to ensure that
> > all generated code is indeed PIE compatible.
> >
>
> What would be the point of that? The linker will complain and throw an
> error if the code cannot be converted into a PIE executable, so I
> don't think we need objtool's help for that.

Indeed.

> > Speaking of non-RIP -> RIP changes that require a temporary - would it
> > be beneficial to make a macro that would use the RIP form only when
> > #ifdef CONFIG_X86_PIE? That would avoid code size increase when PIE is
> > not needed.
> >
>
> This series does not make the PIE support configurable. Do you think
> the code size increase is a concern if all GOT based symbol references
> are elided, e.g, via -fdirect-access-external-data?

I was looking at the code size measurement of the original patch
series (perhaps these are not relevant with your series) and I think
2.2% - 2.4% code size increase can be problematic. Can you perhaps
provide new code size increase measurements with your patches applied?

Thanks and BR,
Uros.