Re: [PATCH RFC bpf-next 0/3] libbpf: add better syscall kprobing support

On Thu, 2022-07-07 at 13:59 -0700, Andrii Nakryiko wrote:
> On Thu, Jul 7, 2022 at 8:51 AM Ilya Leoshkevich <iii@xxxxxxxxxxxxx>
> wrote:
> > 
> > On Wed, 2022-07-06 at 17:41 -0700, Andrii Nakryiko wrote:
> > > This RFC patch set is to gather feedback about new
> > > SEC("ksyscall") and SEC("kretsyscall") section definitions meant
> > > to simplify the life of BPF users that want to trace Linux
> > > syscalls without having to know or care about things like
> > > CONFIG_ARCH_HAS_SYSCALL_WRAPPER and related arch-specific vs
> > > arch-agnostic __<arch>_sys_xxx vs __se_sys_xxx function names,
> > > calling convention woes ("nested" pt_regs), etc. All this is
> > > quite annoying to remember and care about as a BPF user,
> > > especially if the goal is to write architecture- and kernel
> > > version-agnostic BPF code (e.g., things like libbpf-tools, etc).
> > > 
> > > By using SEC("ksyscall/xxx")/SEC("kretsyscall/xxx") user clearly
> > > communicates the desire to kprobe/kretprobe kernel function that
> > > corresponds to the specified syscall. Libbpf will take care of
> > > all the details of determining correct function name and calling
> > > conventions.
> > > 
> > > This patch set also improves BPF_KPROBE_SYSCALL (and renames it
> > > to BPF_KSYSCALL to match SEC("ksyscall")) macro to take into
> > > account CONFIG_ARCH_HAS_SYSCALL_WRAPPER instead of hard-coding
> > > whether host architecture is expected to use syscall wrapper or
> > > not (which is less reliable and can change over time).
> > > 
> > > It would be great to get feedback about the overall feature, but
> > > also I'd appreciate help with testing this, especially for
> > > non-x86_64 architectures.
> > > 
> > > Cc: Ilya Leoshkevich <iii@xxxxxxxxxxxxx>
> > > Cc: Kenta Tada <kenta.tada@xxxxxxxx>
> > > Cc: Hengqi Chen <hengqi.chen@xxxxxxxxx>
> > > 
> > > Andrii Nakryiko (3):
> > >   libbpf: improve and rename BPF_KPROBE_SYSCALL
> > >   libbpf: add ksyscall/kretsyscall sections support for syscall kprobes
> > >   selftests/bpf: use BPF_KSYSCALL and SEC("ksyscall") in selftests
> > > 
> > >  tools/lib/bpf/bpf_tracing.h                   |  44 +++++--
> > >  tools/lib/bpf/libbpf.c                        | 109 ++++++++++++++++++
> > >  tools/lib/bpf/libbpf.h                        |  16 +++
> > >  tools/lib/bpf/libbpf.map                      |   1 +
> > >  tools/lib/bpf/libbpf_internal.h               |   2 +
> > >  .../selftests/bpf/progs/bpf_syscall_macro.c   |   6 +-
> > >  .../selftests/bpf/progs/test_attach_probe.c   |   6 +-
> > >  .../selftests/bpf/progs/test_probe_user.c     |  27 +----
> > >  8 files changed, 172 insertions(+), 39 deletions(-)
> > 
> > Hi Andrii,
> > 
> > Looks interesting, I will give it a try on s390x a bit later.
> > 
> > In the meantime just one remark: if we want to create a truly
> > seamless solution, we might need to take care of quirks associated
> > with the following kernel #defines:
> > 
> > * __ARCH_WANT_SYS_OLD_MMAP (real arguments are in memory)
> > * CONFIG_CLONE_BACKWARDS (child_tidptr/tls swapped)
> > * CONFIG_CLONE_BACKWARDS2 (newsp/clone_flags swapped)
> > * CONFIG_CLONE_BACKWARDS3 (extra arg: stack_size)
> > 
> > or at least document that users need to be careful with mmap() and
> > clone() probes. Also, there might be more of that out there, but
> > that's what I'm constantly running into on s390x.
> > 
> 
> Tbh, this space seems so messy, that I don't think it's realistic to
> try to have a completely seamless solution. As I replied to Alexei, I
> didn't have an intention to support compat and 32-bit syscalls, for
> example. This also seems to be a quirk that users will have to
> discover and handle on their own. In my mind there is always plain
> SEC("kprobe") if SEC("ksyscall") gets in the way of handling
> compat/32-bit/quirks like the ones you mentioned.
> 
> But maybe the right answer is just to not add SEC("ksyscall") at all?

I think it's a valuable feature, even if it doesn't handle compat
syscalls and all the other calling convention quirks. IMHO these things
just need to be clearly spelled out in the documentation.

To keep open the possibility of handling them in the future, I would
write something like:

    At the moment SEC("ksyscall") does not handle all the calling
    convention quirks for mmap(), clone() and compat syscalls. This may
    or may not change in the future. Therefore it is recommended to use
    SEC("kprobe") for these syscalls.
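
To make the clone() part of that caveat concrete, here is a rough
sketch in plain C (not BPF, and not anything from the patch set) of
what a probe would have to do to normalize the raw syscall arguments.
The argument orders follow the four clone() definitions in
kernel/fork.c as I understand them; the enum, struct and function
names are made up for illustration only.

```c
/* Hypothetical labels for the clone() argument layouts selected by
 * CONFIG_CLONE_BACKWARDS{,2,3}; these names are illustrative only. */
enum clone_layout {
	CLONE_LAYOUT_DEFAULT,    /* flags, newsp, ptid, ctid, tls */
	CLONE_LAYOUT_BACKWARDS,  /* flags, newsp, ptid, tls, ctid */
	CLONE_LAYOUT_BACKWARDS2, /* newsp, flags, ptid, ctid, tls */
	CLONE_LAYOUT_BACKWARDS3, /* flags, newsp, stack_size, ptid, ctid, tls */
};

/* Arguments in one fixed, layout-independent form. */
struct clone_args_view {
	unsigned long clone_flags;
	unsigned long newsp;
	unsigned long parent_tidptr;
	unsigned long child_tidptr;
	unsigned long tls;
};

/* Map up to six raw syscall arguments to the normalized view. */
static struct clone_args_view
decode_clone_args(enum clone_layout layout, const unsigned long raw[6])
{
	/* Start from the default order, then undo each quirk. */
	struct clone_args_view v = {
		.clone_flags = raw[0],
		.newsp = raw[1],
		.parent_tidptr = raw[2],
		.child_tidptr = raw[3],
		.tls = raw[4],
	};

	switch (layout) {
	case CLONE_LAYOUT_BACKWARDS:  /* child_tidptr/tls swapped */
		v.tls = raw[3];
		v.child_tidptr = raw[4];
		break;
	case CLONE_LAYOUT_BACKWARDS2: /* newsp/clone_flags swapped */
		v.newsp = raw[0];
		v.clone_flags = raw[1];
		break;
	case CLONE_LAYOUT_BACKWARDS3: /* extra stack_size in raw[2] */
		v.parent_tidptr = raw[3];
		v.child_tidptr = raw[4];
		v.tls = raw[5];
		break;
	default:
		break;
	}
	return v;
}
```

A probe that blindly reads the first argument as clone_flags would get
newsp instead on a CONFIG_CLONE_BACKWARDS2 kernel, which is the kind of
surprise the documentation note is meant to warn about.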

What do you think?


