Re: [PATCH v10 02/15] livepatch: avoid position-based search if `-z unique-symbol` is available

Alexander Lobakin <alexandr.lobakin@xxxxxxxxx> · Fri, 18 Feb 2022 17:31:11 +0100

From: Miroslav Benes <mbenes@xxxxxxx>
Date: Wed, 16 Feb 2022 16:15:20 +0100 (CET)

> On Fri, 11 Feb 2022, Josh Poimboeuf wrote:
> 
> > On Fri, Feb 11, 2022 at 10:05:02AM -0800, Fāng-ruì Sòng wrote:
> > > On Fri, Feb 11, 2022 at 9:41 AM Josh Poimboeuf <jpoimboe@xxxxxxxxxx> wrote:
> > > >
> > > > On Wed, Feb 09, 2022 at 07:57:39PM +0100, Alexander Lobakin wrote:
> > > > > Position-based search, which means that if there are several symbols
> > > > > with the same name, the user needs to additionally provide the
> > > > > "index" of a desired symbol, is fragile. For example, it breaks
> > > > > when two symbols with the same name are located in different
> > > > > sections.
> > > > >
> > > > > Since a while, LD has a flag `-z unique-symbol` which appends
> > > > > numeric suffixes to the functions with the same name (in symtab
> > > > > and strtab). It can be used to effectively prevent from having
> > > > > any ambiguity when referring to a symbol by its name.
> > > >
> > > > In the patch description can you also give the version of binutils (and
> > > > possibly other linkers) which have the flag?
> > > 
> > > GNU ld>=2.36 supports -z unique-symbol. ld.lld doesn't support -z unique-symbol.
> > > 
> > > I subscribe to llvm@xxxxxxxxxxxxxxx and happen to notice this message
> > > (can't keep up with the changes...)
> > > I am a bit concerned with this option and replied last time on
> > > https://lore.kernel.org/r/20220105032456.hs3od326sdl4zjv4@xxxxxxxxxx
> > > 
> > > My full reasoning is on
> > > https://maskray.me/blog/2020-11-15-explain-gnu-linker-options#z-unique-symbol
> > 
> > Ah, right.  Also discussed here:
> > 
> >   https://lore.kernel.org/all/20210123225928.z5hkmaw6qjs2gu5g@xxxxxxxxxx/T/#u
> >   https://lore.kernel.org/all/20210125172124.awabevkpvq4poqxf@treble/
> > 
> > I'm not qualified to comment on LTO/PGO stability issues, but it doesn't
> > sound good.  And we want to support livepatch for LTO kernels.
> 
> Hm, bear with me, because I am very likely missing something which is 
> clear to everyone else...
> 
> Is the stability really a problem for the live patching (and I am talking 
> about the live patching only here. It may be a problem elsewhere, but I am 
> just trying to understand.)? I understand that two different kernel builds 
> could have a different name mapping between the original symbols and their 
> unique renames. Not nice. But we can prepare two different live patches 
> for these two different kernels. Something one would like to avoid if 
> possible, but it is not impossible. Am I missing something?
>  
> > Also I realized that this flag would have a negative effect on
> > kpatch-build, as it currently does its analysis on .o files.  So it
> > would have to figure out how to properly detect function renames, to
> > avoid patching the wrong function for example.
> 
> Yes, that is unfortunate. And not only for kpatch-build.
> 
> > And if LLD doesn't plan to support the flag then it will be a headache
> > for livepatch (and the kernel in general) to deal with the divergent
> > configs.
> 
> True.
> 
> The position-based approach clearly shows its limits. I like <file+func> 
> approach based on kallsyms tracking, that you proposed elsewhere in the 
> thread, more.

Hmm, same.

For the FG-KASLR part, `-ffunction-sections` takes no options: it only
appends the function name to the name of the section, so the result
can only be ".text.dup".
However, LD scripts allow specifying a particular input file for
the section being described, e.g.:

.text.dup {         .text.file1_dup {
    (.text.dup) ->      file1.o(.text.dup)
}                   }
                    .text.file2_dup {
                        file2.o(.text.dup)
                    }
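
A rough sketch of how such per-file output sections could be
generated for one duplicated function. This is an illustration only,
not part of the patch: per_file_sections() is a made-up helper, and
the output uses the ld form with a colon after the output section
name, which the diagram above leaves out for brevity.

```python
import posixpath


def per_file_sections(func, objs):
    """Emit one output section per object file for a duplicated function.

    Hypothetical helper: the ".text.<stem>_<func>" naming simply follows
    the diagram above and is not an existing kernel convention.
    """
    blocks = []
    for obj in objs:
        # "file1.o" -> "file1"; directories are dropped from the stem
        stem = posixpath.splitext(posixpath.basename(obj))[0]
        blocks.append(
            f".text.{stem}_{func} : {{\n"
            f"    {obj}(.text.{func})\n"
            f"}}"
        )
    return "\n".join(blocks)


if __name__ == "__main__":
    print(per_file_sections("dup", ["file1.o", "file2.o"]))
```

Note that the stem is taken from the basename, so two objects with
the same basename in different directories would still collide.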

But the problem is that currently vmlinux is being linked from
vmlinux.o solely, so there are no input files apart from vmlinux.o.
I could probably (not 100% sure, I'm not deep into the details of
thin archives) create a temporary linker script for vmlinux.o
itself to process duplicates. Then vmlinux.o will always have only
unique section names right from the start.
It may not be worth it: I don't mind random functions with the
same name going into one section, it's not a big deal and/or security
risk, and it doesn't help livepatch, which operates on symbol
names, not sections.

Re livepatch, the best option would probably be storing relative
paths to the object files in kallsyms. By relative I mean starting
from $srctree -- this would keep them portable (no abspaths),
but provide the needed uniqueness:

dup()    main.o:dup()    init/main.o:dup()       /mnt/init/main.o:dup()
dup()    main.o:dup()    foo/bar/main.o:dup()    /mnt/foo/bar/main.o:dup()

                         ^^^^^^ here ^^^^^^
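
To make the comparison concrete, a small sketch (the "path:name" key
format and the sym_key() helper are only illustrative, nothing like
this exists in kallsyms today; /mnt stands in for $srctree as in the
table above):

```python
import posixpath


def sym_key(obj_path, name, srctree):
    """Build a "<path relative to srctree>:<symbol>" key (hypothetical format)."""
    rel = posixpath.relpath(obj_path, srctree)
    return f"{rel}:{name}"


# The two objects from the table above, both defining dup()
objs = ["/mnt/init/main.o", "/mnt/foo/bar/main.o"]

# Basename only: both collapse into the same key, still ambiguous
by_basename = {f"{posixpath.basename(o)}:dup" for o in objs}

# Relative to srctree: unique, and independent of where the tree lives
by_relpath = {sym_key(o, "dup", "/mnt") for o in objs}

if __name__ == "__main__":
    print(sorted(by_basename))
    print(sorted(by_relpath))
```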

The problem is that kallsyms is generated only at the moment of
(re)linking vmlinux and no earlier.
If I could catch STT_FILE (can't say for sure now), it would provide
only bare filenames, so it wouldn't be enough.
...oh wait, kallsyms rely on `nm` output. I checked nm's `-l` which
tries to find a file corresponding to each symbol and got a nice
output:

ffffffff8109ad00 T switch_mm_irqs_off	/home/alobakin/Documents/work/xdp_hints/linux/arch/x86/mm/tlb.c:488

So this could be parsed with no issues into:

name: switch_mm_irqs_off
addr: 0x9ad00 (rel)
file: arch/x86/mm/tlb.c
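
A minimal sketch of that parsing, assuming the `nm -l` line layout
seen above (address, type, name, then a tab and "file:line" when nm
finds a location). parse_nm_line() and the Sym tuple are made-up
names, and text_base (0xffffffff81000000 here) is my assumption for
turning the absolute address into the relative one:

```python
import posixpath
from typing import NamedTuple, Optional


class Sym(NamedTuple):
    name: str
    addr: int            # offset from text_base
    file: Optional[str]  # path relative to srctree, None if nm found none


def parse_nm_line(line, srctree, text_base):
    """Parse one line of `nm -l` output into a Sym (hypothetical helper)."""
    # The source location, if any, follows the symbol name after a tab
    sym_part, _, loc = line.rstrip("\n").partition("\t")
    addr_s, _type, name = sym_part.split(None, 2)
    file = None
    if loc:
        # Drop the trailing ":<line number>" and make the path relative
        path = loc.rsplit(":", 1)[0]
        file = posixpath.relpath(path, srctree)
    return Sym(name, int(addr_s, 16) - text_base, file)


if __name__ == "__main__":
    line = ("ffffffff8109ad00 T switch_mm_irqs_off\t"
            "/home/alobakin/Documents/work/xdp_hints/linux/arch/x86/mm/tlb.c:488")
    sym = parse_nm_line(line,
                        "/home/alobakin/Documents/work/xdp_hints/linux",
                        0xffffffff81000000)
    print(sym.name, hex(sym.addr), sym.file)
```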

This solves a lot. One problem is that

$ time nm -ln vmlinux > ~/Documents/tmp/nml
nm -ln vmlinux > ~/Documents/tmp/nml  120.80s user 1.77s system 99% cpu 2:02.94 total

it took 2 minutes to generate the whole map (instead of a split
second) (on a 64-core CPU, but I guess nm runs in one thread).
I guess it can be optimized? I'm not a binutils master (will take a
look after sending this), is there a way to do it manually, skipping
this nm lag, or maybe to make nm emit filenames without such delays?

> 
> Miroslav

Thanks,
Al