Re: [PATCH bpf-next 22/24] s390/bpf: Implement arch_prepare_bpf_trampoline()

Ilya Leoshkevich <iii@xxxxxxxxxxxxx> · Fri, 27 Jan 2023 12:15:15 +0100

On Thu, 2023-01-26 at 11:06 -0800, Andrii Nakryiko wrote:
> On Thu, Jan 26, 2023 at 6:30 AM Ilya Leoshkevich <iii@xxxxxxxxxxxxx>
> wrote:
> > 
> > On Wed, 2023-01-25 at 17:15 -0800, Andrii Nakryiko wrote:
> > > On Wed, Jan 25, 2023 at 1:39 PM Ilya Leoshkevich
> > > <iii@xxxxxxxxxxxxx>
> > > wrote:
> > > > 
> > > > arch_prepare_bpf_trampoline() is used for direct attachment of
> > > > eBPF
> > > > programs to various places, bypassing kprobes. It's responsible
> > > > for
> > > > calling a number of eBPF programs before, instead and/or after
> > > > whatever they are attached to.
> > > > 
> > > > Add a s390x implementation, paying attention to the following:
> > > > 
> > > > - Reuse the existing JIT infrastructure, where possible.
> > > > - Like the existing JIT, prefer making multiple passes instead
> > > > of
> > > >   backpatching. Currently 2 passes is enough. If literal pool
> > > > is
> > > >   introduced, this needs to be raised to 3. However, at the
> > > > moment
> > > >   adding literal pool only makes the code larger. If branch
> > > >   shortening is introduced, the number of passes needs to be
> > > >   increased even further.
> > > > - Support both regular and ftrace calling conventions,
> > > > depending on
> > > >   the trampoline flags.
> > > > - Use expolines for indirect calls.
> > > > - Handle the mismatch between the eBPF and the s390x ABIs.
> > > > - Sign-extend fmod_ret return values.
> > > > 
> > > > invoke_bpf_prog() produces about 120 bytes; it might be
> > > > possible to
> > > > slightly optimize this, but reaching 50 bytes, like on x86_64,
> > > > looks
> > > > unrealistic: just loading cookie, __bpf_prog_enter, bpf_func,
> > > > insnsi
> > > > and __bpf_prog_exit as literals already takes at least 5 * 12 =
> > > > 60
> > > > bytes, and we can't use relative addressing for most of them.
> > > > Therefore, lower BPF_MAX_TRAMP_LINKS on s390x.
> > > > 
> > > > Signed-off-by: Ilya Leoshkevich <iii@xxxxxxxxxxxxx>
> > > > ---
> > > >  arch/s390/net/bpf_jit_comp.c | 535
> > > > +++++++++++++++++++++++++++++++++--
> > > >  include/linux/bpf.h          |   4 +
> > > >  2 files changed, 517 insertions(+), 22 deletions(-)
> > > > 
> > > 
> > > [...]
> > > 
> > > > diff --git a/include/linux/bpf.h b/include/linux/bpf.h
> > > > index cf89504c8dda..52ff43bbf996 100644
> > > > --- a/include/linux/bpf.h
> > > > +++ b/include/linux/bpf.h
> > > > @@ -943,7 +943,11 @@ struct btf_func_model {
> > > >  /* Each call __bpf_prog_enter + call bpf_func + call
> > > > __bpf_prog_exit is ~50
> > > >   * bytes on x86.
> > > >   */
> > > > +#if defined(__s390x__)
> > > > +#define BPF_MAX_TRAMP_LINKS 27
> > > > +#else
> > > >  #define BPF_MAX_TRAMP_LINKS 38
> > > > +#endif
> > > 
> > > if we turn this into enum definition, then on selftests side we
> > > can
> > > just discover this from vmlinux BTF, instead of hard-coding
> > > arch-specific constants. Thoughts?
> > 
> > This seems to work. I can replace 3/24 and 4/24 with that in v2.
> > Some random notes:
> > 
> > - It doesn't seem to be possible to #include "vmlinux.h" into tests,
> >   so one has to go through the btf__load_vmlinux_btf() dance and
> >   allocate the fd arrays dynamically.
> 
> yes, you can't include vmlinux.h into user-space code, of course. And
> yes it's true about needing to use btf__load_vmlinux_btf().
> 
> But I didn't get what you are saying about fd arrays, tbh. Can you
> please elaborate?

That's a really minor thing; fexit_fd and link_fd in fexit_stress
now need to be allocated dynamically.
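
For illustration only, the dynamic allocation could look roughly like the
sketch below. alloc_fd_arrays() is a hypothetical helper name, not something
from the patch; the sketch assumes the link count has already been read at
runtime and that initializing every slot to -1 is useful for cleanup.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: allocate two fd arrays of `cnt` entries each and
 * set every slot to -1, so cleanup code can tell unused slots from fds.
 * Returns 0 on success, -1 on a bad count or allocation failure. */
static int alloc_fd_arrays(int cnt, int **fexit_fd, int **link_fd)
{
	int i;

	*fexit_fd = NULL;
	*link_fd = NULL;
	if (cnt <= 0)
		return -1;

	*fexit_fd = malloc((size_t)cnt * sizeof(**fexit_fd));
	*link_fd = malloc((size_t)cnt * sizeof(**link_fd));
	if (!*fexit_fd || !*link_fd) {
		/* free(NULL) is a no-op, so both calls are safe */
		free(*fexit_fd);
		free(*link_fd);
		*fexit_fd = NULL;
		*link_fd = NULL;
		return -1;
	}

	for (i = 0; i < cnt; i++) {
		(*fexit_fd)[i] = -1;
		(*link_fd)[i] = -1;
	}
	return 0;
}
```

The caller passes the value discovered from BTF as `cnt` and frees both
arrays once the links and programs have been closed.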

> > - One has to give this enum an otherwise unnecessary name, so that
> >   it's easy to find. This doesn't seem like a big deal though:
> > 
> > enum bpf_max_tramp_links {
> 
> not really, you can keep it anonymous enum. We do that in
> include/uapi/linux/bpf.h for a lot of constants

How would you find it then? My current code is:

int get_bpf_max_tramp_links_from(struct btf *btf)
{
        const struct btf_enum *e;
        const struct btf_type *t;
        const char *name;
        int id;

        id = btf__find_by_name_kind(btf, "bpf_max_tramp_links",
                                    BTF_KIND_ENUM);
        if (!ASSERT_GT(id, 0, "bpf_max_tramp_links id"))
                return -1;
        t = btf__type_by_id(btf, id);
        if (!ASSERT_OK_PTR(t, "bpf_max_tramp_links type"))
                return -1;
        if (!ASSERT_EQ(btf_vlen(t), 1, "bpf_max_tramp_links vlen"))
                return -1;
        e = btf_enum(t);
        if (!ASSERT_OK_PTR(e, "bpf_max_tramp_links[0]"))
                return -1;
        name = btf__name_by_offset(btf, e->name_off);
        if (!ASSERT_OK_PTR(name, "bpf_max_tramp_links[0].name_off") ||
            !ASSERT_STREQ(name, "BPF_MAX_TRAMP_LINKS",
                          "BPF_MAX_TRAMP_LINKS"))
                return -1;

        return e->val;
}

Is there a way to bypass looking up the enum, and go straight for the
named member?
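
One possible direction, sketched here under assumptions and not confirmed
anywhere in this thread, is a linear scan over all enum types that matches on
the member name, which would work for an anonymous enum too. The structures
below (enum_member, enum_type) and find_enum_value() are hypothetical
stand-ins so that the sketch is self-contained; with libbpf the same loop
would presumably go over ids 1 .. btf__type_cnt(btf) - 1 and use
btf__type_by_id(), btf_is_enum(), btf_vlen(), btf_enum() and
btf__name_by_offset().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for BTF enum data; these are not libbpf types. */
struct enum_member {
	const char *name;
	int val;
};

struct enum_type {
	const char *name;			/* "" for an anonymous enum */
	size_t vlen;				/* number of members */
	const struct enum_member *members;
};

/* Scan every enum, named or anonymous, for a member called `name`.
 * Returns 0 and stores the value in *val on success, -1 if not found. */
static int find_enum_value(const struct enum_type *types, size_t cnt,
			   const char *name, int *val)
{
	size_t i, j;

	for (i = 0; i < cnt; i++) {
		for (j = 0; j < types[i].vlen; j++) {
			if (strcmp(types[i].members[j].name, name) == 0) {
				*val = types[i].members[j].val;
				return 0;
			}
		}
	}
	return -1;
}
```

The cost is a walk over every type instead of a single name lookup, which
should not matter much for a one-off query in a test.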

> > #if defined(__s390x__)
> >         BPF_MAX_TRAMP_LINKS = 27,
> > #else
> >         BPF_MAX_TRAMP_LINKS = 38,
> > #endif
> > };
> > 
> > - An alternative might be to expose this via /proc, since the users
> >   might be interested in it too.
> 
> I'd say let's not, there is no need, having it in BTF is more than
> enough for testing purposes

Fair enough.