Re: [PATCH bpf-next 22/24] s390/bpf: Implement arch_prepare_bpf_trampoline()

Andrii Nakryiko <andrii.nakryiko@xxxxxxxxx> · Thu, 26 Jan 2023 11:06:45 -0800

On Thu, Jan 26, 2023 at 6:30 AM Ilya Leoshkevich <iii@xxxxxxxxxxxxx> wrote:
>
> On Wed, 2023-01-25 at 17:15 -0800, Andrii Nakryiko wrote:
> > On Wed, Jan 25, 2023 at 1:39 PM Ilya Leoshkevich <iii@xxxxxxxxxxxxx> wrote:
> > >
> > > arch_prepare_bpf_trampoline() is used for direct attachment of eBPF
> > > programs to various places, bypassing kprobes. It's responsible for
> > > calling a number of eBPF programs before, instead of, and/or after
> > > whatever they are attached to.
> > >
> > > Add an s390x implementation, paying attention to the following:
> > >
> > > - Reuse the existing JIT infrastructure, where possible.
> > > - Like the existing JIT, prefer making multiple passes instead of
> > >   backpatching. Currently 2 passes are enough. If a literal pool is
> > >   introduced, this needs to be raised to 3. However, at the moment
> > >   adding a literal pool only makes the code larger. If branch
> > >   shortening is introduced, the number of passes needs to be
> > >   increased even further.
> > > - Support both regular and ftrace calling conventions, depending on
> > >   the trampoline flags.
> > > - Use expolines for indirect calls.
> > > - Handle the mismatch between the eBPF and the s390x ABIs.
> > > - Sign-extend fmod_ret return values.
> > >
> > > invoke_bpf_prog() produces about 120 bytes; it might be possible to
> > > slightly optimize this, but reaching 50 bytes, like on x86_64, looks
> > > unrealistic: just loading cookie, __bpf_prog_enter, bpf_func, insnsi
> > > and __bpf_prog_exit as literals already takes at least 5 * 12 = 60
> > > bytes, and we can't use relative addressing for most of them.
> > > Therefore, lower BPF_MAX_TRAMP_LINKS on s390x.
> > >
> > > Signed-off-by: Ilya Leoshkevich <iii@xxxxxxxxxxxxx>
> > > ---
> > >  arch/s390/net/bpf_jit_comp.c | 535 +++++++++++++++++++++++++++++++++--
> > >  include/linux/bpf.h          |   4 +
> > >  2 files changed, 517 insertions(+), 22 deletions(-)
> > >
> >
> > [...]
> >
> > > diff --git a/include/linux/bpf.h b/include/linux/bpf.h
> > > index cf89504c8dda..52ff43bbf996 100644
> > > --- a/include/linux/bpf.h
> > > +++ b/include/linux/bpf.h
> > > @@ -943,7 +943,11 @@ struct btf_func_model {
> > >  /* Each call __bpf_prog_enter + call bpf_func + call __bpf_prog_exit is ~50
> > >   * bytes on x86.
> > >   */
> > > +#if defined(__s390x__)
> > > +#define BPF_MAX_TRAMP_LINKS 27
> > > +#else
> > >  #define BPF_MAX_TRAMP_LINKS 38
> > > +#endif
> >
> > if we turn this into enum definition, then on selftests side we can
> > just discover this from vmlinux BTF, instead of hard-coding
> > arch-specific constants. Thoughts?
>
> This seems to work. I can replace 3/24 and 4/24 with that in v2.
> Some random notes:
>
> - It doesn't seem to be possible to #include "vmlinux.h" into tests,
>   so one has to go through the btf__load_vmlinux_btf() dance and
>   allocate the fd arrays dynamically.

yes, you can't include vmlinux.h into user-space code, of course. And
yes it's true about needing to use btf__load_vmlinux_btf().

But I didn't get what you are saying about fd arrays, tbh. Can you
please elaborate?

>
> - One has to give this enum an otherwise unnecessary name, so that
>   it's easy to find. This doesn't seem like a big deal though:
>
> enum bpf_max_tramp_links {

Not really, you can keep it an anonymous enum. We do that in
include/uapi/linux/bpf.h for a lot of constants.
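
To illustrate, here is a minimal user-space sketch (not the actual
kernel header) of the same constant as an anonymous enum. It still
works as an array bound, and unlike a #define, an enum value is
recorded in BTF under its own name, so no type name is needed to find
it. max_tramp_links() is a hypothetical helper that exists only for
this example.

```c
#include <stddef.h>

/* Anonymous enum: the constant has a name, the enum type does not. */
enum {
#if defined(__s390x__)
	BPF_MAX_TRAMP_LINKS = 27,
#else
	BPF_MAX_TRAMP_LINKS = 38,
#endif
};

struct bpf_tramp_link;	/* opaque in this sketch */

struct bpf_tramp_links {
	/* An enum constant is an integer constant expression. */
	struct bpf_tramp_link *links[BPF_MAX_TRAMP_LINKS];
};

/* Hypothetical helper: number of slots in the links[] array. */
static size_t max_tramp_links(void)
{
	struct bpf_tramp_links tl;

	return sizeof(tl.links) / sizeof(tl.links[0]);
}
```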

> #if defined(__s390x__)
>         BPF_MAX_TRAMP_LINKS = 27,
> #else
>         BPF_MAX_TRAMP_LINKS = 38,
> #endif
> };
>
> - An alternative might be to expose this via /proc, since the users
>   might be interested in it too.

I'd say let's not. There is no need: having it in BTF is more than
enough for testing purposes.

>
> > >
> > >  struct bpf_tramp_links {
> > >         struct bpf_tramp_link *links[BPF_MAX_TRAMP_LINKS];
> > > --
> > > 2.39.1
> > >
>