Re: [PATCH bpf-next 04/12] libbpf: Add btf enum64 support

Yonghong Song <yhs@xxxxxx> · Wed, 11 May 2022 11:56:09 -0700






On 5/11/22 10:43 AM, Andrii Nakryiko wrote:
On Tue, May 10, 2022 at 5:39 PM Yonghong Song <yhs@xxxxxx> wrote:



On 5/10/22 4:38 PM, Andrii Nakryiko wrote:
On Tue, May 10, 2022 at 3:40 PM Yonghong Song <yhs@xxxxxx> wrote:



On 5/9/22 4:25 PM, Andrii Nakryiko wrote:
On Sun, May 1, 2022 at 12:00 PM Yonghong Song <yhs@xxxxxx> wrote:

Add BTF_KIND_ENUM64 support. Deprecated btf__add_enum() and
btf__add_enum_value() and introduced the following new APIs
     btf__add_enum32()
     btf__add_enum32_value()
     btf__add_enum64()
     btf__add_enum64_value()
due to new kind and introduction of kflag.

To support old kernel with enum64, the sanitization is
added to replace BTF_KIND_ENUM64 with a bunch of
pointer-to-void types.

The enum64 value relocation is also supported. The enum64
forward resolution, with enum type as forward declaration
and enum64 as the actual definition, is also supported.

Signed-off-by: Yonghong Song <yhs@xxxxxx>
---
    tools/lib/bpf/btf.c                           | 226 +++++++++++++++++-
    tools/lib/bpf/btf.h                           |  21 ++
    tools/lib/bpf/btf_dump.c                      |  94 ++++++--
    tools/lib/bpf/libbpf.c                        |  64 ++++-
    tools/lib/bpf/libbpf.map                      |   4 +
    tools/lib/bpf/libbpf_internal.h               |   2 +
    tools/lib/bpf/linker.c                        |   2 +
    tools/lib/bpf/relo_core.c                     |  93 ++++---
    .../selftests/bpf/prog_tests/btf_dump.c       |  10 +-
    .../selftests/bpf/prog_tests/btf_write.c      |   6 +-
    10 files changed, 450 insertions(+), 72 deletions(-)



[...]



+       t->size = tsize;
+
+       return btf_commit_type(btf, sz);
+}
+
+/*
+ * Append new BTF_KIND_ENUM type with:
+ *   - *name* - name of the enum, can be NULL or empty for anonymous enums;
+ *   - *is_unsigned* - whether the enum values are unsigned or not;
+ *
+ * Enum initially has no enum values in it (and corresponds to enum forward
+ * declaration). Enumerator values can be added by btf__add_enum64_value()
+ * immediately after btf__add_enum() succeeds.
+ *
+ * Returns:
+ *   - >0, type ID of newly added BTF type;
+ *   - <0, on error.
+ */
+int btf__add_enum32(struct btf *btf, const char *name, bool is_unsigned)

given it's still BTF_KIND_ENUM in UAPI, let's keep 32-bit ones as just
btf__add_enum()/btf__add_enum_value() and not deprecate anything.
ENUM64 can be thought about as more of a special case, so I think it's
ok.

The current btf__add_enum api:
LIBBPF_API int btf__add_enum(struct btf *btf, const char *name, __u32
bytes_sz);

The issue is it doesn't have signedness parameter. if the user input
is
      enum { A = -1, B = 0, C = 1 };
the actual printout btf format will be
      enum { A 4294967295, B = 0, C = 1}
does not match the original source.

Oh, I didn't realize that's the reason. I still like btf__add_enum()
name much better, can you please do the same macro trick that I did
for bpf_prog_load() based on the number of arguments? We'll be able to
preserve good API name and add extra argument. Once this lands we'll
need to update pahole to added signedness bit, but otherwise I don't
think there are many other users of these APIs currently (I might be
wrong, but macro magic gives us backwards compat anyway).



+{
+       return btf_add_enum_common(btf, name, is_unsigned, BTF_KIND_ENUM, 4);
+}
+

[...]

    /*

[...]

@@ -764,8 +792,13 @@ static int bpf_core_calc_enumval_relo(const struct bpf_core_relo *relo,
                   if (!spec)
                           return -EUCLEAN; /* request instruction poisoning */
                   t = btf_type_by_id(spec->btf, spec->spec[0].type_id);
-               e = btf_enum(t) + spec->spec[0].idx;
-               *val = e->val;
+               if (btf_is_enum(t)) {
+                       e = btf_enum(t) + spec->spec[0].idx;
+                       *val = e->val;
+               } else {
+                       e64 = btf_enum64(t) + spec->spec[0].idx;
+                       *val = btf_enum64_value(e64);
+               }

I think with sign bit we now have further complication: for 32-bit
enums we need to sign extend 32-bit values to s64 and then cast as
u64, no? Seems like a helper to abstract that is good to have here.
Otherwise relocating enum ABC { D = -1 } will produce invalid ldimm64
instruction, right?

We should be fine here. For enum32, we have
struct btf_enum {
           __u32   name_off;
           __s32   val;
};
So above *val = e->val will first sign extend from __s32 to __s64
and then the __u64. Let me have a helper with additional comments
to make it clear.


Ok, great! Let's just shorten this as I suggested below?

The
  >>> *val = btf_is_enum(t)
  >>>       ? btf_enum(t)[spec->spec[0].idx]
  >>>       : btf_enum64(t)[spec->spec[0].idx];
won't work, but the following should work:
     *val = btf_is_enum(t)
         ? btf_enum(t)[spec->spec[0].idx].val
         : btf_enum64_value(btf_enum64(t) + spec->spec[0].idx);

yep, for consistency it should be btf_enum64(t)[spec->spec[0].idx],
but it's very minor, of course



Also keep in mind that you can use btf_enum()/btf_enum64() as an
array, so above you can write just as

*val = btf_is_enum(t)
       ? btf_enum(t)[spec->spec[0].idx]
       : btf_enum64(t)[spec->spec[0].idx];

But we need sign check and extension, so better to have a separate helper.

                   break;
           default:
                   return -EOPNOTSUPP;
@@ -1034,7 +1067,7 @@ int bpf_core_patch_insn(const char *prog_name, struct bpf_insn *insn,
                   }

                   insn[0].imm = new_val;
-               insn[1].imm = 0; /* currently only 32-bit values are supported */
+               insn[1].imm = new_val >> 32;

for 32-bit instructions (ALU/ALU32, etc) we need to make sure that
new_val fits in 32 bits. And we need to be careful about
signed/unsigned, because for signed case all-zero or all-one upper 32
bits are ok (sign extension). Can we know the expected signed/unsigned
operation from bpf_insn itself? We should be, right?

The core relocation insn for constant is
     move r1, <32bit value>
or
     ldimm_64 r1, <64bit value>
and there are no signedness information.
So the 64bit value (except sign extension) can only from
ldimm_64. We should be okay here, but I can double check.

not sure how full 64-bit -1 should be loaded into register then. Does
compiler generate extra sign-extending bit shifts or embedded constant
is considered to be a signed constant always?

For ldimm64 r1, -1,
the first insn imm will be 0xffffffff, and the second insn will also be
0xffffffff. The final value will be
    ((u64)(u32)0xffffffff << 32) | (u32)0xffffffff

yeah, I get it for ldimm64, but I was specifically curious about move
instruction that only has 32-bit immediate value but assigns to full
64-bit r1? Is it treated as signed unconditionally?

Yes, it is treated as 32-bit signed int and will do sign extension
if needed.







                   pr_debug("prog '%s': relo #%d: patched insn #%d (LDIMM64) imm64 %llu -> %llu\n",
                            prog_name, relo_idx, insn_idx,
                            (unsigned long long)imm, new_val);
@@ -1056,6 +1089,7 @@ int bpf_core_patch_insn(const char *prog_name, struct bpf_insn *insn,
     */

[...]