On Mon, Sep 2, 2024 at 10:57 PM Viktor Malik <vmalik@xxxxxxxxxx> wrote: > > On 9/2/24 19:01, Alan Maguire wrote: > > On 02/09/2024 07:58, Viktor Malik wrote: > >> TL;DR > >> > >> This adds libbpf support for creating multiple BPF programs having the > >> same instructions using symbol aliases. > >> > >> Context > >> ======= > >> > >> bpftrace has so-called "wildcarded" probes which allow to attach the > >> same program to multple different attach points. For k(u)probes, this is > >> easy to do as we can leverage k(u)probe_multi, however, other program > >> types (fentry/fexit, tracepoints) don't have such features. > >> > >> Currently, what bpftrace does is that it creates a copy of the program > >> for each attach point. This naturally results in a lot of redundant code > >> in the produced BPF object. > >> > >> Proposal > >> ======== > >> > >> One way to address this problem would be to use *symbol aliases*. In > >> short, they allow to have multiple symbol table entries for the same > >> address. In bpftrace, we would create them using llvm::GlobalAlias. In > >> C, it can be achieved using compiler __attribute__((alias(...))): > >> > >> int BPF_PROG(prog) > >> { > >> [...] > >> } > >> int prog_alias() __attribute__((alias("prog"))); > >> > >> When calling bpf_object__open, libbpf is currently able to discover all > >> the programs and internally does a separate copy of the instructions for > >> each aliased program. What libbpf cannot do, is perform relocations b/c > >> it assumes that each instruction belongs to a single program only. The > >> second patch of this series changes relocation collection such that it > >> records relocations for each aliased program. With that, bpftrace can > >> emit just one copy of the full program and an alias for each target > >> attach point. > >> > >> For example, considering the following bpftrace script collecting the > >> number of hits of each VFS function using fentry over a one second > >> period: > >> > >> $ bpftrace -e 'kfunc:vfs_* { @[func] = count() } i:s:1 { exit() }' > >> [...] > >> > >> this change will allow to reduce the size of the in-memory BPF object > >> that bpftrace generates from 60K to 9K. > >> > >> For reference, the bpftrace PoC is in [1]. > >> > >> The advantage of this change is that for BPF objects without aliases, it > >> doesn't introduce any overhead. > >> > > > > A few high-level questions - apologies in advance if I'm missing the > > point here. > > > > Could bpftrace use program linking to solve this issue instead? So we'd > > have separate progs for the various attach points associated with vfs_* > > functions, but they would all call the same global function. That > > _should_ reduce the memory footprint of the object I think - or are > > there issues with doing that? > > That's a good suggestion, thanks! We added subprograms to bpftrace only > relatively recently so I didn't really think about this option. I'll > definitely give it a try as it could be even more efficient. > > > I also wonder if aliasing helps memory > > footprint fully, especially if we end up with separate copies of the > > program for relocation purposes; won't we have separate copies in-kernel > > then too? So I _think_ the memory utilization you're concerned about is > > not what's running in the kernel, but the BPF object representation in > > bpftrace; is that right? > > Yes, that is correct. libbpf will create a copy of the program for each > symbol in PROGBITS section that it discovers (including aliases) and the > copies will be loaded into kernel. > > It's mainly the footprint of the BPF object produced by bpftrace that I > was concerned about. (The reason is that we work on ahead-of-time > compilation so it will directly affect the size of the pre-compiled > binaries). But the above solution using global subprograms should reduce > the in-kernel footprint, too, so I'll try to add it and see if it would > work for bpftrace. I think it's a half solution, since prog will be duplicated anyway and loaded multiple times into the kernel. I think it's better to revive this patch sets: v1: https://lore.kernel.org/bpf/20240220035105.34626-1-dongmenglong.8@xxxxxxxxxxxxx/ v2: https://lore.kernel.org/bpf/20240311093526.1010158-1-dongmenglong.8@xxxxxxxxxxxxx/ Unfortunately it went off rails, because we went deep into trying to save bpf trampoline memory too. I think it can still land. We can load fentry program once and attach it in multiple places with many bpf links. That is still worth doing. I think it should solve retsnoop and bpftrace use cases. Not perfect, since there will be many trampolines, but still an improvement.