Re: [PATCH bpf-next 22/24] s390/bpf: Implement arch_prepare_bpf_trampoline()

Andrii Nakryiko <andrii.nakryiko@xxxxxxxxx> · Fri, 27 Jan 2023 09:30:42 -0800



On Fri, Jan 27, 2023 at 3:15 AM Ilya Leoshkevich <iii@xxxxxxxxxxxxx> wrote:
>
> On Thu, 2023-01-26 at 11:06 -0800, Andrii Nakryiko wrote:
> > On Thu, Jan 26, 2023 at 6:30 AM Ilya Leoshkevich <iii@xxxxxxxxxxxxx>
> > wrote:
> > >
> > > On Wed, 2023-01-25 at 17:15 -0800, Andrii Nakryiko wrote:
> > > > On Wed, Jan 25, 2023 at 1:39 PM Ilya Leoshkevich
> > > > <iii@xxxxxxxxxxxxx>
> > > > wrote:
> > > > >
> > > > > arch_prepare_bpf_trampoline() is used for direct attachment of
> > > > > eBPF
> > > > > programs to various places, bypassing kprobes. It's responsible
> > > > > for
> > > > > calling a number of eBPF programs before, instead and/or after
> > > > > whatever they are attached to.
> > > > >
> > > > > Add a s390x implementation, paying attention to the following:
> > > > >
> > > > > - Reuse the existing JIT infrastructure, where possible.
> > > > > - Like the existing JIT, prefer making multiple passes instead
> > > > > of
> > > > >   backpatching. Currently 2 passes is enough. If literal pool
> > > > > is
> > > > >   introduced, this needs to be raised to 3. However, at the
> > > > > moment
> > > > >   adding literal pool only makes the code larger. If branch
> > > > >   shortening is introduced, the number of passes needs to be
> > > > >   increased even further.
> > > > > - Support both regular and ftrace calling conventions,
> > > > > depending on
> > > > >   the trampoline flags.
> > > > > - Use expolines for indirect calls.
> > > > > - Handle the mismatch between the eBPF and the s390x ABIs.
> > > > > - Sign-extend fmod_ret return values.
> > > > >
> > > > > invoke_bpf_prog() produces about 120 bytes; it might be
> > > > > possible to
> > > > > slightly optimize this, but reaching 50 bytes, like on x86_64,
> > > > > looks
> > > > > unrealistic: just loading cookie, __bpf_prog_enter, bpf_func,
> > > > > insnsi
> > > > > and __bpf_prog_exit as literals already takes at least 5 * 12 =
> > > > > 60
> > > > > bytes, and we can't use relative addressing for most of them.
> > > > > Therefore, lower BPF_MAX_TRAMP_LINKS on s390x.
> > > > >
> > > > > Signed-off-by: Ilya Leoshkevich <iii@xxxxxxxxxxxxx>
> > > > > ---
> > > > >  arch/s390/net/bpf_jit_comp.c | 535
> > > > > +++++++++++++++++++++++++++++++++--
> > > > >  include/linux/bpf.h          |   4 +
> > > > >  2 files changed, 517 insertions(+), 22 deletions(-)
> > > > >
> > > >
> > > > [...]
> > > >
> > > > > diff --git a/include/linux/bpf.h b/include/linux/bpf.h
> > > > > index cf89504c8dda..52ff43bbf996 100644
> > > > > --- a/include/linux/bpf.h
> > > > > +++ b/include/linux/bpf.h
> > > > > @@ -943,7 +943,11 @@ struct btf_func_model {
> > > > >  /* Each call __bpf_prog_enter + call bpf_func + call
> > > > > __bpf_prog_exit is ~50
> > > > >   * bytes on x86.
> > > > >   */
> > > > > +#if defined(__s390x__)
> > > > > +#define BPF_MAX_TRAMP_LINKS 27
> > > > > +#else
> > > > >  #define BPF_MAX_TRAMP_LINKS 38
> > > > > +#endif
> > > >
> > > > if we turn this into enum definition, then on selftests side we
> > > > can
> > > > just discover this from vmlinux BTF, instead of hard-coding
> > > > arch-specific constants. Thoughts?
> > >
> > > This seems to work. I can replace 3/24 and 4/24 with that in v2.
> > > Some random notes:
> > >
> > > - It doesn't seem to be possible to #include "vmlinux.h" into tests,
> > >   so one has to go through the btf__load_vmlinux_btf() dance and
> > >   allocate the fd arrays dynamically.
> >
> > yes, you can't include vmlinux.h into user-space code, of course. And
> > yes it's true about needing to use btf__load_vmlinux_btf().
> >
> > But I didn't get what you are saying about fd arrays, tbh. Can you
> > please elaborate?
>
> That's a really minor thing; fexit_fd and link_fd in fexit_stress
> now need to be allocated dynamically.
>
> > > - One has to give this enum an otherwise unnecessary name, so that
> > >   it's easy to find. This doesn't seem like a big deal though:
> > >
> > > enum bpf_max_tramp_links {
> >
> > not really, you can keep it anonymous enum. We do that in
> > include/uapi/linux/bpf.h for a lot of constants
>
> How would you find it then? My current code is:
>
> int get_bpf_max_tramp_links_from(struct btf *btf)
> {
>         const struct btf_enum *e;
>         const struct btf_type *t;
>         const char *name;
>         int id;
>
>         id = btf__find_by_name_kind(btf, "bpf_max_tramp_links",
>                                     BTF_KIND_ENUM);
>         if (!ASSERT_GT(id, 0, "bpf_max_tramp_links id"))
>                 return -1;
>         t = btf__type_by_id(btf, id);
>         if (!ASSERT_OK_PTR(t, "bpf_max_tramp_links type"))
>                 return -1;
>         if (!ASSERT_EQ(btf_vlen(t), 1, "bpf_max_tramp_links vlen"))
>                 return -1;
>         e = btf_enum(t);
>         if (!ASSERT_OK_PTR(e, "bpf_max_tramp_links[0]"))
>                 return -1;
>         name = btf__name_by_offset(btf, e->name_off);
>         if (!ASSERT_OK_PTR(name, "bpf_max_tramp_links[0].name_off") ||
>             !ASSERT_STREQ(name, "BPF_MAX_TRAMP_LINKS",
>                           "BPF_MAX_TRAMP_LINKS"))
>                 return -1;
>
>         return e->val;
> }
>
> Is there a way to bypass looking up the enum, and go straight for the
> named member?


don't use btf__find_by_name_kind, just iterate all types and look at
all anonymous enums and their values, roughly

for (i = 1; i < btf__type_cnt(btf); i++) {
    const struct btf_type *t = btf__type_by_id(btf, i);

    /* skip anything that is not an enum, and any enum that has a name */
    if (!btf_is_enum(t) || t->name_off)
        continue;
    for (j = 0; j < btf_vlen(t); j++) {
        if (strcmp(btf__str_by_offset(btf, btf_enum(t)[j].name_off),
                   "BPF_MAX_TRAMP_LINKS") != 0)
            continue;
        /* found it, the value is btf_enum(t)[j].val */
    }
}

but cleaner :)
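
To make the control flow concrete, here is a small self-contained model
of that loop. It is only a sketch: the model_* structs, sample_types and
find_anon_enum_val() are made-up stand-ins so that it builds without
libbpf, they are not libbpf types or APIs. In the real selftest the
same shape would sit on top of btf__type_cnt(), btf__type_by_id(),
btf_is_enum(), btf_vlen() and btf_enum().

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for BTF data; these are NOT libbpf types. */
struct model_enum_val {
	const char *name;
	int val;
};

struct model_type {
	int is_enum;                        /* models btf_is_enum(t) */
	const char *name;                   /* NULL models t->name_off == 0 */
	size_t vlen;                        /* models btf_vlen(t) */
	const struct model_enum_val *vals;  /* models btf_enum(t) */
};

/*
 * Scan all types, consider only anonymous enums, and look for a member
 * called @name. Returns 0 and stores the value in *out on success,
 * -1 if no anonymous enum has such a member.
 */
static int find_anon_enum_val(const struct model_type *types, size_t cnt,
			      const char *name, int *out)
{
	size_t i, j;

	for (i = 0; i < cnt; i++) {
		const struct model_type *t = &types[i];

		/* skip non-enums and named enums */
		if (!t->is_enum || t->name)
			continue;
		for (j = 0; j < t->vlen; j++) {
			if (strcmp(t->vals[j].name, name) != 0)
				continue;
			*out = t->vals[j].val;
			return 0;
		}
	}
	return -1;
}

/* Sample data: a non-enum, a named enum (must be ignored), an anonymous enum. */
static const struct model_enum_val named_vals[] = {
	{ "BPF_MAX_TRAMP_LINKS", 1 },
};
static const struct model_enum_val anon_vals[] = {
	{ "SOMETHING_ELSE", 7 },
	{ "BPF_MAX_TRAMP_LINKS", 38 },
};
static const struct model_type sample_types[] = {
	{ 0, "int", 0, NULL },
	{ 1, "named_enum", 1, named_vals },
	{ 1, NULL, 2, anon_vals },
};
```

Returning the value through an out parameter keeps "not found" distinct
from any legitimate enum value, which a bare "return e->val" can't do.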


>
> > > #if defined(__s390x__)
> > >         BPF_MAX_TRAMP_LINKS = 27,
> > > #else
> > >         BPF_MAX_TRAMP_LINKS = 38,
> > > #endif
> > > };
> > >
> > > - An alternative might be to expose this via /proc, since the users
> > >   might be interested in it too.
> >
> > I'd say let's not, there is no need, having it in BTF is more than
> > enough for testing purposes
>
> Fair enough.
> >
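
For reference, the anonymous variant of the enum from earlier in the
thread would look roughly like the sketch below. The values 27 and 38
are the ones from the patch; bpf_max_tramp_links() is a hypothetical
helper that only exists here so the constant can be checked from plain
C, it is not something the kernel provides.

```c
/* Anonymous enum: no tag, so its BTF type has name_off == 0. */
enum {
#if defined(__s390x__)
	BPF_MAX_TRAMP_LINKS = 27,
#else
	BPF_MAX_TRAMP_LINKS = 38,
#endif
};

/* Hypothetical helper, for illustration only. */
static inline int bpf_max_tramp_links(void)
{
	return BPF_MAX_TRAMP_LINKS;
}
```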