On Mon, Jul 11, 2022 at 11:25 AM Ilya Leoshkevich <iii@xxxxxxxxxxxxx> wrote:
>
> On Thu, 2022-07-07 at 13:59 -0700, Andrii Nakryiko wrote:
> > On Thu, Jul 7, 2022 at 8:51 AM Ilya Leoshkevich <iii@xxxxxxxxxxxxx> wrote:
> > >
> > > On Wed, 2022-07-06 at 17:41 -0700, Andrii Nakryiko wrote:
> > > > This RFC patch set is to gather feedback about new SEC("ksyscall")
> > > > and SEC("kretsyscall") section definitions meant to simplify life
> > > > of BPF users that want to trace Linux syscalls without having to
> > > > know or care about things like CONFIG_ARCH_HAS_SYSCALL_WRAPPER and
> > > > related arch-specific vs arch-agnostic __<arch>_sys_xxx vs
> > > > __se_sys_xxx function names, calling convention woes ("nested"
> > > > pt_regs), etc. All this is quite annoying to remember and care
> > > > about as BPF user, especially if the goal is to write
> > > > architecture- and kernel version-agnostic BPF code (e.g., things
> > > > like libbpf-tools, etc).
> > > >
> > > > By using SEC("ksyscall/xxx")/SEC("kretsyscall/xxx") user clearly
> > > > communicates the desire to kprobe/kretprobe kernel function that
> > > > corresponds to the specified syscall. Libbpf will take care of all
> > > > the details of determining correct function name and calling
> > > > conventions.
> > > >
> > > > This patch set also improves BPF_KPROBE_SYSCALL (and renames it to
> > > > BPF_KSYSCALL to match SEC("ksyscall")) macro to take into account
> > > > CONFIG_ARCH_HAS_SYSCALL_WRAPPER instead of hard-coding whether host
> > > > architecture is expected to use syscall wrapper or not (which is
> > > > less reliable and can change over time).
> > > >
> > > > It would be great to get feedback about the overall feature, but
> > > > also I'd appreciate help with testing this, especially for
> > > > non-x86_64 architectures.
> > > >
> > > > Cc: Ilya Leoshkevich <iii@xxxxxxxxxxxxx>
> > > > Cc: Kenta Tada <kenta.tada@xxxxxxxx>
> > > > Cc: Hengqi Chen <hengqi.chen@xxxxxxxxx>
> > > >
> > > > Andrii Nakryiko (3):
> > > >   libbpf: improve and rename BPF_KPROBE_SYSCALL
> > > >   libbpf: add ksyscall/kretsyscall sections support for syscall kprobes
> > > >   selftests/bpf: use BPF_KSYSCALL and SEC("ksyscall") in selftests
> > > >
> > > >  tools/lib/bpf/bpf_tracing.h                   |  44 +++++--
> > > >  tools/lib/bpf/libbpf.c                        | 109 ++++++++++++++++++
> > > >  tools/lib/bpf/libbpf.h                        |  16 +++
> > > >  tools/lib/bpf/libbpf.map                      |   1 +
> > > >  tools/lib/bpf/libbpf_internal.h               |   2 +
> > > >  .../selftests/bpf/progs/bpf_syscall_macro.c   |   6 +-
> > > >  .../selftests/bpf/progs/test_attach_probe.c   |   6 +-
> > > >  .../selftests/bpf/progs/test_probe_user.c     |  27 +----
> > > >  8 files changed, 172 insertions(+), 39 deletions(-)
> > >
> > > Hi Andrii,
> > >
> > > Looks interesting, I will give it a try on s390x a bit later.
> > >
> > > In the meantime just one remark: if we want to create a truly seamless
> > > solution, we might need to take care of quirks associated with the
> > > following kernel #defines:
> > >
> > > * __ARCH_WANT_SYS_OLD_MMAP (real arguments are in memory)
> > > * CONFIG_CLONE_BACKWARDS (child_tidptr/tls swapped)
> > > * CONFIG_CLONE_BACKWARDS2 (newsp/clone_flags swapped)
> > > * CONFIG_CLONE_BACKWARDS3 (extra arg: stack_size)
> > >
> > > or at least document that users need to be careful with mmap() and
> > > clone() probes. Also, there might be more of that out there, but
> > > that's what I'm constantly running into on s390x.
> > >
> >
> > Tbh, this space seems so messy, that I don't think it's realistic to
> > try to have a completely seamless solution. As I replied to Alexei, I
> > didn't have an intention to support compat and 32-bit syscalls, for
> > example. This seems to be also a quirk that users will have to
> > discover and handle on their own.
> > In my mind there is always plain SEC("kprobe") if SEC("ksyscall")
> > gets in a way to handle compat/32-bit/quirks like the ones you
> > mentioned.
> >
> > But maybe the right answer is just to not add SEC("ksyscall") at all?
>
> I think it's a valuable feature, even if it doesn't handle compat
> syscalls and all the other calling convention quirks. IMHO these things
> just need to be clearly spelled in the documentation.
>
> In order to keep the possibility to handle them in the future, I would
> write something like:
>
>     At the moment SEC("ksyscall") does not handle all the calling
>     convention quirks for mmap(), clone() and compat syscalls. This may
>     or may not change in the future. Therefore it is recommended to use
>     SEC("kprobe") for these syscalls.
>
> What do you think?

Sounds good! I'll add that to bpf_program__attach_ksyscall() doc
comment (and to commit message). I'll implement those new virtual
__kconfig variables that I mentioned in another thread and post it as
v1, hopefully some time this week.
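
For reference, the naming choice described in the cover letter
(__<arch>_sys_xxx when CONFIG_ARCH_HAS_SYSCALL_WRAPPER is set,
__se_sys_xxx otherwise) can be sketched roughly as below. This is only
an illustration under stated assumptions, not the code from the patch
set: syscall_kprobe_target() is a hypothetical helper, and the "x64"
prefix used in the usage note is an assumed spelling of <arch> for
x86_64.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not a libbpf API): format the kernel function
 * name that a syscall kprobe would attach to.
 *
 *   has_syscall_wrapper == true  -> "__<arch_pfx>_sys_<syscall>"
 *   has_syscall_wrapper == false -> "__se_sys_<syscall>"
 *
 * Returns what snprintf() returns: the length the full name needs,
 * which is >= size if the output was truncated.
 */
int syscall_kprobe_target(char *buf, size_t size, const char *arch_pfx,
                          bool has_syscall_wrapper, const char *syscall)
{
    assert(buf != NULL && syscall != NULL);
    assert(strlen(syscall) > 0);

    if (has_syscall_wrapper) {
        /* The arch-specific name needs an arch prefix. */
        assert(arch_pfx != NULL);
        return snprintf(buf, size, "__%s_sys_%s", arch_pfx, syscall);
    }

    /* No syscall wrapper: the arch-agnostic name is used. */
    return snprintf(buf, size, "__se_sys_%s", syscall);
}
```

Calling syscall_kprobe_target(buf, sizeof(buf), "x64", true,
"nanosleep") fills buf with "__x64_sys_nanosleep"; passing false
instead gives "__se_sys_nanosleep". None of the mmap(), clone() or
compat quirks discussed above are covered by this sketch.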