On 19/05/2023 10:44, Yafang Shao wrote: > On Thu, May 18, 2023 at 12:18 AM Alan Maguire <alan.maguire@xxxxxxxxxx> wrote: >> >> As a means to continue the discussion in [1], which is >> concerned with finding the best long-term solution to >> having a BPF Type Format (BTF) representation of >> functions that is usable for tracing of edge cases, this >> proof-of-concept series is intended to explore one approach >> to adding information to help make tracing more accurate. >> >> A key problem today is that there is no matching from function >> description to the actual instances of a function. >> >> When that function only has one description, that is >> not an issue, but if we have multiple inconsistent >> static functions in different CUs such as >> >> From kernel/irq/irqdesc.c >> >> static ssize_t wakeup_show(struct kobject *kobj, >> struct kobj_attribute *attr, char *buf) >> >> ...and from drivers/base/power/sysfs.c >> >> static ssize_t wakeup_show(struct device *dev, struct device_attribute *attr, >> char *buf); >> >> ...this becomes a problem. If I am attaching, >> which do I want? And even if I know which one >> I want, which instance in kallsyms is which? >> > > As you described in the above example, it is natural to attach a > *function* defined in a specific *file_path*. So why not encoding the > file path instead ? What's the problem in it? > > If we expose the addr and let the user select which address to attach, > it will be a trouble to deploy a bpf program across multiple kernels. > While the file path will have a lower chance to be conflict between > different kernel versions. So I think it would be better if we use the > file path instead and let the kernel find the address automatically. > In the old days, when we wanted to deploy a kprobe kernel module or a > systemtap script across multiple kernels, we had to use if-else in the > code, which was really troublesome as it is not scalable. I don't > think we want to do it the same way in the bpf program. > I think it's important to distinguish between what we do in libbpf/kernel to ensure safe attach and what sorts of tracing capabilities a tracer supports. For example, a tracer (or indeed libbpf) can take the set of different instances of cpumask_weight() and because they all have the same BTF description, can attach to all instances. What we lose though by not having address-level accuracy is the ability to do more precise tracing. I also don't think matching addresses -> BTF ids stops us having file information in the mix; a "struct btf_func" could potentially contain a name offset to a string specifying the file too. My concern though is that relying on _just_ the file/line will not be enough in some cases. There are some weird edge cases in the kernel. I'll give an example where file/line isn't enough to figure out the mapping. Consider 'struct elf_note_info'; there are two copies of this in the vmlinux BTF: [16819] STRUCT 'elf_note_info' size=248 vlen=8 'thread' type_id=16817 bits_offset=0 'psinfo' type_id=16815 bits_offset=64 'signote' type_id=16815 bits_offset=256 'auxv' type_id=16815 bits_offset=448 'files' type_id=16815 bits_offset=640 'csigdata' type_id=6083 bits_offset=832 'size' type_id=74 bits_offset=1856 'thread_notes' type_id=21 bits_offset=1920 [16851] STRUCT 'elf_note_info' size=248 vlen=8 'thread' type_id=16850 bits_offset=0 'psinfo' type_id=16815 bits_offset=64 'signote' type_id=16815 bits_offset=256 'auxv' type_id=16815 bits_offset=448 'files' type_id=16815 bits_offset=640 'csigdata' type_id=6507 bits_offset=832 'size' type_id=74 bits_offset=1856 'thread_notes' type_id=21 bits_offset=1920 ...and here's the structure itself: struct elf_note_info { struct elf_thread_core_info *thread; struct memelfnote psinfo; struct memelfnote signote; struct memelfnote auxv; struct memelfnote files; user_siginfo_t csigdata; size_t size; int thread_notes; }; The reason there are two copies is not a dedup failure; in the first case user_siginfo_t is defined as [6083] TYPEDEF 'siginfo_t' type_id=6082 while in the second case it gets defined as [6507] TYPEDEF 'compat_siginfo_t' type_id=6506 The reason is include/linux/compat.h was included in one place when fs/binfmt_elf.c was built and not in another when fs/binfmt_elf.c was built the second time. Because the object was built into the kernel twice, as well as having two different instances of BTF descriptions for the same structure, we also have multiple different BTF function descriptions for functions that use a "struct elf_note_info *", and we need to figure out which maps to which kallsyms instance of the functions like this: ffffffff81489380 t fill_thread_core_info ffffffff8148ca90 t fill_thread_core_info Now, which BTF description goes with which instance? The file/line number is identical for each BTF description, but the BTF is (correctly) different because the object has been built into the kernel twice and is slightly different in each case. The difference isn't significant in practice here, but that doesn't mean it couldn't be in other cases. Imagine a case where one instance of the structure contained a field and the other didn't; we'd be interpreting the tracing data incorrectly. I realize it's a weird edge case, but that's just one I found from digging a bit. My worry is if we go with the file/line as a way of identifying the function, these sorts of edge cases will pile up and we'll need to go more fine-grained and use addresses in the end anyway. Again, that doesn't stop us applying higher-level semantics at a tracer level, but we need to build those on solid foundations. Alan >> This series is a proof-of-concept that supports encoding >> function addresses and associating them with BTF FUNC >> descriptions using BTF declaration tags. >> >> More work would need to be done on the kernel side >> to _use_ this representation, but hopefully having a >> rough approach outlined will help make that more feasible. >> >> [1] https://lore.kernel.org/bpf/ZF61j8WJls25BYTl@krava/ >> >> Alan Maguire (6): >> btf_encoder: record function address and if it is local >> dwarf_loader: store address in function low_pc if available >> dwarf_loader: transfer low_pc info from subtroutine to its abstract >> origin >> btf_encoder: add "addr=0x<addr>" function declaration tag if >> --btf_gen_func_addr specified >> btf_encoder: store ELF function representations sorted by name _and_ >> address >> pahole: document --btf_gen_func_addr >> >> btf_encoder.c | 64 +++++++++++++++++++++++++++++++++++----------- >> btf_encoder.h | 4 +-- >> dwarf_loader.c | 16 +++++++++--- >> dwarves.h | 3 +++ >> man-pages/pahole.1 | 8 ++++++ >> pahole.c | 12 +++++++-- >> 6 files changed, 85 insertions(+), 22 deletions(-) >> >> -- >> 2.31.1 >> > >