On Fri, May 5, 2023 at 6:33 AM Arnaldo Carvalho de Melo <acme@xxxxxxxxxx> wrote:
>
> On Fri, May 05, 2023 at 01:03:14AM +0200, Jiri Olsa wrote:
> > On Thu, May 04, 2023 at 03:03:42PM -0700, Ian Rogers wrote:
> > > On Thu, May 4, 2023 at 2:48 PM Arnaldo Carvalho de Melo <acme@xxxxxxxxxx> wrote:
> > > >
> > > > On Thu, May 04, 2023 at 04:07:29PM -0300, Arnaldo Carvalho de Melo wrote:
> > > > > On Thu, May 04, 2023 at 11:50:07AM -0700, Andrii Nakryiko wrote:
> > > > > > On Thu, May 4, 2023 at 10:52 AM Arnaldo Carvalho de Melo <acme@xxxxxxxxxx> wrote:
> > > > > > > Andrii, can you add some more information about the usage of vmlinux.h
> > > > > > > instead of using kernel headers?
> > > > > >
> > > > > > I'll just say that vmlinux.h is not a hard requirement to build BPF
> > > > > > programs, it's more a convenience allowing easy access to definitions
> > > > > > of both UAPI and kernel-internal structures for tracing needs and
> > > > > > marking them relocatable using BPF CO-RE machinery. Lots of real-world
> > > > > > applications just check-in pregenerated vmlinux.h to avoid build-time
> > > > > > dependency on up-to-date host kernel and such.
> > > > > >
> > > > > > If vmlinux.h generation and usage is causing issues, though, given
> > > > > > that perf's BPF programs don't seem to be using many different kernel
> > > > > > types, it might be a better option to just use UAPI headers for public
> > > > > > kernel type definitions, and just define CO-RE-relocatable minimal
> > > > > > definitions locally in perf's BPF code for the other types necessary.
> > > > > > E.g., if perf needs only pid and tgid from task_struct, this would
> > > > > > suffice:
> > > > > >
> > > > > > struct task_struct {
> > > > > >     int pid;
> > > > > >     int tgid;
> > > > > > } __attribute__((preserve_access_index));
> > > > >
> > > > > Yeah, that seems like a way better approach, no vmlinux involved, libbpf
> > > > > CO-RE notices that task_struct changed from this two-integer version
> > > > > (of course) and does the relocation to where it is in the running kernel
> > > > > by using /sys/kernel/btf/vmlinux.
> > > >
> > > > Doing it for one of the skels, build tested, runtime untested, but not
> > > > using any vmlinux, BTF to help, not that bad, more verbose, but at least
> > > > we state what are the fields we actually use, have those attribute
> > > > documenting that those offsets will be recorded for future use, etc.
> > > >
> > > > Namhyung, can you please check that this works?
> > > >
> > > > Thanks,
> > > >
> > > > - Arnaldo
> > > >
> > > > diff --git a/tools/perf/util/bpf_skel/bperf_cgroup.bpf.c b/tools/perf/util/bpf_skel/bperf_cgroup.bpf.c
> > > > index 6a438e0102c5a2cb..f376d162549ebd74 100644
> > > > --- a/tools/perf/util/bpf_skel/bperf_cgroup.bpf.c
> > > > +++ b/tools/perf/util/bpf_skel/bperf_cgroup.bpf.c
> > > > @@ -1,11 +1,40 @@
> > > >  // SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
> > > >  // Copyright (c) 2021 Facebook
> > > >  // Copyright (c) 2021 Google
> > > > -#include "vmlinux.h"
> > > > +#include <linux/types.h>
> > > > +#include <linux/bpf.h>
> > >
> > > Compared to vmlinux.h here be dragons. It is easy to start dragging in
> > > all of libc and that may not work due to missing #ifdefs, etc. Could
> > > we check in a vmlinux.h like libbpf-tools does?
> > > https://github.com/iovisor/bcc/tree/master/libbpf-tools#vmlinuxh-generation
> > > https://github.com/iovisor/bcc/tree/master/libbpf-tools/arm64
> > >
> > > This would also remove some of the errors that could be introduced by
> > > copy+pasting enums, etc. and also highlight issues with things being
> > > renamed at build time rather than as runtime failures.
> >
> > we already have to deal with that, right? doing checks on fields in
> > structs like mm_struct___old
> >
> > > Could this be some shared resource for the different linux tools
> > > projects using a vmlinux.h? e.g. tools/lib/vmlinuxh with an
> > > install_headers target that builds a vmlinux.h.
> >
> > I tried to do the minimal header and it's not too big,
> > I pushed it in here:
> >   https://git.kernel.org/pub/scm/linux/kernel/git/jolsa/perf.git/log/?h=perf/vmlinux_h
> >
> > compile tested so far
>
> I see it and it makes the change to be minimal, which is good at the
> current stage, but I wonder if it wouldn't be better for us to define
> just the ones not in UAPI and use the #include <linux/bpf.h>,
> <linux/perf_event.h> as I did in the patches I posted here and Namhyung
> tested at least one, this way the added vmlinux.h file gets even smaller
> by not including things like:
>
> [acme@quaco perf-tools]$ egrep -w '(perf_event_sample_format|bpf_perf_event_value|perf_sample_weight|perf_mem_data_src) {' include/uapi/linux/*.h
> include/uapi/linux/bpf.h:struct bpf_perf_event_value {
> include/uapi/linux/perf_event.h:enum perf_event_sample_format {
> include/uapi/linux/perf_event.h:union perf_mem_data_src {
> include/uapi/linux/perf_event.h:union perf_mem_data_src {
> include/uapi/linux/perf_event.h:union perf_sample_weight {
> [acme@quaco perf-tools]$
>
> Also why do we need these:
>
> +struct mm_struct {
> +} __attribute__((preserve_access_index));
> +
> +struct raw_spinlock {
> +} __attribute__((preserve_access_index));
> +
> +typedef struct raw_spinlock raw_spinlock_t;
> +
> +struct spinlock {
> +} __attribute__((preserve_access_index));
> +
> +typedef struct spinlock spinlock_t;
> +
> +struct sighand_struct {
> +	spinlock_t siglock;
> +} __attribute__((preserve_access_index));
>
> We don't use them, they're just pointers you kept on:
>
> +struct task_struct {
> +	struct css_set *cgroups;
> +	pid_t pid;
> +	pid_t tgid;
> +	char comm[16];
> +	struct mm_struct *mm;
> +	struct sighand_struct *sighand;
> +	unsigned int flags;
> +} __attribute__((preserve_access_index));
>
> That with the preserve_access_index isn't needed, we need just the
> fields that we access in the tools, right?

Aside from that, you probably want to take a look at BTFgen.
Old doc:
https://github.com/aquasecurity/btfhub/blob/main/docs/btfgen-internals.md
It landed as "bpftool gen min_core_btf"; see man bpftool-gen.
It addresses the use case for kernels _without_ CONFIG_DEBUG_INFO_BTF.
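
For reference, a rough sketch of the "define only what you access" approach
discussed above, assuming only pid and tgid are needed. CORE_RELOC and
task_pid_tgid() are made-up names for illustration, not anything in perf.
The __bpf__ guard is only there so the same file also compiles as plain C,
where preserve_access_index would be an unknown attribute. In a real BPF
program the task pointer would have to be a BTF-typed one (for instance from
bpf_get_current_task_btf()) for the direct field reads to be accepted.

```c
/*
 * preserve_access_index is a BPF-target attribute; clang defines __bpf__
 * when compiling for BPF. Elsewhere the macro expands to nothing.
 * CORE_RELOC is a hypothetical name used only in this sketch.
 */
#if defined(__bpf__)
#define CORE_RELOC __attribute__((preserve_access_index))
#else
#define CORE_RELOC
#endif

/*
 * Minimal local view of the kernel's task_struct: only the fields read
 * below. With the attribute, each field access is recorded as a CO-RE
 * relocation and libbpf fixes up the real offsets at load time from the
 * running kernel's BTF.
 */
struct task_struct {
	int pid;
	int tgid;
} CORE_RELOC;

/* Hypothetical helper: tgid in the upper 32 bits, pid in the lower 32. */
static inline unsigned long long
task_pid_tgid(const struct task_struct *task)
{
	return ((unsigned long long)(unsigned int)task->tgid << 32) |
	       (unsigned int)task->pid;
}
```

Fields that are never accessed generate no relocations, so leaving them out
of the local definition loses nothing.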