On Tue, May 10, 2022 at 5:39 PM Yonghong Song <yhs@xxxxxx> wrote:
>
>
>
> On 5/10/22 4:38 PM, Andrii Nakryiko wrote:
> > On Tue, May 10, 2022 at 3:40 PM Yonghong Song <yhs@xxxxxx> wrote:
> >>
> >>
> >>
> >> On 5/9/22 4:25 PM, Andrii Nakryiko wrote:
> >>> On Sun, May 1, 2022 at 12:00 PM Yonghong Song <yhs@xxxxxx> wrote:
> >>>>
> >>>> Add BTF_KIND_ENUM64 support. Deprecated btf__add_enum() and
> >>>> btf__add_enum_value() and introduced the following new APIs
> >>>>    btf__add_enum32()
> >>>>    btf__add_enum32_value()
> >>>>    btf__add_enum64()
> >>>>    btf__add_enum64_value()
> >>>> due to new kind and introduction of kflag.
> >>>>
> >>>> To support old kernel with enum64, the sanitization is
> >>>> added to replace BTF_KIND_ENUM64 with a bunch of
> >>>> pointer-to-void types.
> >>>>
> >>>> The enum64 value relocation is also supported. The enum64
> >>>> forward resolution, with enum type as forward declaration
> >>>> and enum64 as the actual definition, is also supported.
> >>>>
> >>>> Signed-off-by: Yonghong Song <yhs@xxxxxx>
> >>>> ---
> >>>>  tools/lib/bpf/btf.c                           | 226 +++++++++++++++++-
> >>>>  tools/lib/bpf/btf.h                           |  21 ++
> >>>>  tools/lib/bpf/btf_dump.c                      |  94 ++++++--
> >>>>  tools/lib/bpf/libbpf.c                        |  64 ++++-
> >>>>  tools/lib/bpf/libbpf.map                      |   4 +
> >>>>  tools/lib/bpf/libbpf_internal.h               |   2 +
> >>>>  tools/lib/bpf/linker.c                        |   2 +
> >>>>  tools/lib/bpf/relo_core.c                     |  93 ++++---
> >>>>  .../selftests/bpf/prog_tests/btf_dump.c       |  10 +-
> >>>>  .../selftests/bpf/prog_tests/btf_write.c      |   6 +-
> >>>>  10 files changed, 450 insertions(+), 72 deletions(-)
> >>>>
> >>>
> >
> > [...]
> >
> >>>
> >>>
> >>>> +       t->size = tsize;
> >>>> +
> >>>> +       return btf_commit_type(btf, sz);
> >>>> +}
> >>>> +
> >>>> +/*
> >>>> + * Append new BTF_KIND_ENUM type with:
> >>>> + *   - *name* - name of the enum, can be NULL or empty for anonymous enums;
> >>>> + *   - *is_unsigned* - whether the enum values are unsigned or not;
> >>>> + *
> >>>> + * Enum initially has no enum values in it (and corresponds to enum forward
> >>>> + * declaration). Enumerator values can be added by btf__add_enum64_value()
> >>>> + * immediately after btf__add_enum() succeeds.
> >>>> + *
> >>>> + * Returns:
> >>>> + *   - >0, type ID of newly added BTF type;
> >>>> + *   - <0, on error.
> >>>> + */
> >>>> +int btf__add_enum32(struct btf *btf, const char *name, bool is_unsigned)
> >>>
> >>> given it's still BTF_KIND_ENUM in UAPI, let's keep 32-bit ones as just
> >>> btf__add_enum()/btf__add_enum_value() and not deprecate anything.
> >>> ENUM64 can be thought about as more of a special case, so I think it's
> >>> ok.
> >>
> >> The current btf__add_enum api:
> >> LIBBPF_API int btf__add_enum(struct btf *btf, const char *name, __u32
> >> bytes_sz);
> >>
> >> The issue is it doesn't have signedness parameter. if the user input
> >> is
> >>    enum { A = -1, B = 0, C = 1 };
> >> the actual printout btf format will be
> >>    enum { A 4294967295, B = 0, C = 1}
> >> does not match the original source.
> >
> > Oh, I didn't realize that's the reason. I still like btf__add_enum()
> > name much better, can you please do the same macro trick that I did
> > for bpf_prog_load() based on the number of arguments? We'll be able to
> > preserve good API name and add extra argument. Once this lands we'll
> > need to update pahole to added signedness bit, but otherwise I don't
> > think there are many other users of these APIs currently (I might be
> > wrong, but macro magic gives us backwards compat anyway).
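
For reference, the kind of argument-count dispatch being suggested could look
roughly like this (a minimal sketch with hypothetical helper names, not the
actual libbpf macros):

  /* pick the implementation based on how many arguments the caller passed:
   * 3 args keep the existing signature, 4 args add the signedness flag */
  #define ___btf_add_enum_nth(_1, _2, _3, _4, NAME, ...) NAME
  #define btf__add_enum(...)                                                  \
          ___btf_add_enum_nth(__VA_ARGS__,                                    \
                              btf__add_enum_signed, btf__add_enum_sz, 0, 0)   \
                             (__VA_ARGS__)

  /* hypothetical implementations the macro dispatches to */
  LIBBPF_API int btf__add_enum_sz(struct btf *btf, const char *name,
                                  __u32 bytes_sz);
  LIBBPF_API int btf__add_enum_signed(struct btf *btf, const char *name,
                                      __u32 bytes_sz, bool is_signed);

That way existing 3-argument callers keep compiling unchanged while new callers
can pass the signedness explicitly.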
> >
> >>
> >>>
> >>>> +{
> >>>> +       return btf_add_enum_common(btf, name, is_unsigned, BTF_KIND_ENUM, 4);
> >>>> +}
> >>>> +
> >>>
> >>> [...]
> >>>
> >>>>  /*
> >
> > [...]
> >
> >>>> @@ -764,8 +792,13 @@ static int bpf_core_calc_enumval_relo(const struct bpf_core_relo *relo,
> >>>>                 if (!spec)
> >>>>                         return -EUCLEAN; /* request instruction poisoning */
> >>>>                 t = btf_type_by_id(spec->btf, spec->spec[0].type_id);
> >>>> -               e = btf_enum(t) + spec->spec[0].idx;
> >>>> -               *val = e->val;
> >>>> +               if (btf_is_enum(t)) {
> >>>> +                       e = btf_enum(t) + spec->spec[0].idx;
> >>>> +                       *val = e->val;
> >>>> +               } else {
> >>>> +                       e64 = btf_enum64(t) + spec->spec[0].idx;
> >>>> +                       *val = btf_enum64_value(e64);
> >>>> +               }
> >>>
> >>> I think with sign bit we now have further complication: for 32-bit
> >>> enums we need to sign extend 32-bit values to s64 and then cast as
> >>> u64, no? Seems like a helper to abstract that is good to have here.
> >>> Otherwise relocating enum ABC { D = -1 } will produce invalid ldimm64
> >>> instruction, right?
> >>
> >> We should be fine here. For enum32, we have
> >>    struct btf_enum {
> >>            __u32   name_off;
> >>            __s32   val;
> >>    };
> >> So above *val = e->val will first sign extend from __s32 to __s64
> >> and then the __u64. Let me have a helper with additional comments
> >> to make it clear.
> >>
> >
> > Ok, great! Let's just shorten this as I suggested below?
>
> The
> >>>     *val = btf_is_enum(t)
> >>>        ? btf_enum(t)[spec->spec[0].idx]
> >>>        : btf_enum64(t)[spec->spec[0].idx];
> won't work, but the following should work:
>     *val = btf_is_enum(t)
>        ? btf_enum(t)[spec->spec[0].idx].val
>        : btf_enum64_value(btf_enum64(t) + spec->spec[0].idx);

yep, for consistency it should be btf_enum64(t)[spec->spec[0].idx], but
it's very minor, of course

> >
> >>>
> >>> Also keep in mind that you can use btf_enum()/btf_enum64() as an
> >>> array, so above you can write just as
> >>>
> >>>     *val = btf_is_enum(t)
> >>>        ? btf_enum(t)[spec->spec[0].idx]
> >>>        : btf_enum64(t)[spec->spec[0].idx];
> >>>
> >>> But we need sign check and extension, so better to have a separate helper.
> >>>
> >>>>                 break;
> >>>>         default:
> >>>>                 return -EOPNOTSUPP;
> >>>> @@ -1034,7 +1067,7 @@ int bpf_core_patch_insn(const char *prog_name, struct bpf_insn *insn,
> >>>>         }
> >>>>
> >>>>         insn[0].imm = new_val;
> >>>> -       insn[1].imm = 0; /* currently only 32-bit values are supported */
> >>>> +       insn[1].imm = new_val >> 32;
> >>>
> >>> for 32-bit instructions (ALU/ALU32, etc) we need to make sure that
> >>> new_val fits in 32 bits. And we need to be careful about
> >>> signed/unsigned, because for signed case all-zero or all-one upper 32
> >>> bits are ok (sign extension). Can we know the expected signed/unsigned
> >>> operation from bpf_insn itself? We should be, right?
> >>
> >> The core relocation insn for constant is
> >>    move r1, <32bit value>
> >> or
> >>    ldimm_64 r1, <64bit value>
> >> and there are no signedness information.
> >> So the 64bit value (except sign extension) can only from
> >> ldimm_64. We should be okay here, but I can double check.
> >
> > not sure how full 64-bit -1 should be loaded into register then. Does
> > compiler generate extra sign-extending bit shifts or embedded constant
> > is considered to be a signed constant always?
>
> For ldimm64 r1, -1,
> the first insn imm will be 0xffffffff, and the second insn will also be
> 0xffffffff. The final value will be
> ((u64)(u32)0xffffffff << 32) | (u32)0xffffffff

yeah, I get it for ldimm64, but I was specifically curious about move
instruction that only has 32-bit immediate value but assigns to full
64-bit r1? Is it treated as signed unconditionally?

>
>
> >
> >>
> >>>
> >>>>                         pr_debug("prog '%s': relo #%d: patched insn #%d (LDIMM64) imm64 %llu -> %llu\n",
> >>>>                                  prog_name, relo_idx, insn_idx,
> >>>>                                  (unsigned long long)imm, new_val);
> >>>> @@ -1056,6 +1089,7 @@ int bpf_core_patch_insn(const char *prog_name, struct bpf_insn *insn,
> >>>>          */
> >
> > [...]
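
To make the two points above concrete, here is a rough sketch (simplified and
hypothetical, not the actual relo_core.c code) of the sign-extension behavior
for 32-bit enumerators and of how the 64-bit relocated constant is split
across the ldimm64 instruction pair and reassembled:

  #include <stdint.h>

  /* illustrative helpers only, not libbpf API */

  /* a 32-bit enumerator is widened as signed first, then reinterpreted as
   * u64, so -1 becomes 0xffffffffffffffff rather than 0x00000000ffffffff;
   * this is what the implicit __s32 -> __s64 -> __u64 conversion above does */
  static uint64_t enum32_to_u64(int32_t v)
  {
          return (uint64_t)(int64_t)v;
  }

  /* ldimm64 carries the constant in two instructions: low 32 bits in
   * insn[0].imm, high 32 bits in insn[1].imm */
  static void split_imm64(uint32_t imm[2], uint64_t new_val)
  {
          imm[0] = (uint32_t)new_val;
          imm[1] = (uint32_t)(new_val >> 32);
  }

  /* the loader side puts it back together, matching the
   * ((u64)(u32)hi << 32) | (u32)lo expression quoted above, so -1
   * round-trips to all-ones in the destination register */
  static uint64_t join_imm64(const uint32_t imm[2])
  {
          return ((uint64_t)imm[1] << 32) | imm[0];
  }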