[PATCH v1 bpf-next 0/4] Descend into struct, array types when searching for fields

Dave Marchevsky <davemarchevsky@xxxxxx> · Mon, 23 Oct 2023 15:00:26 -0700

Consider the following BPF program:

  struct val {
    int d;
    struct some_kptr __kptr *ref_ptr;
  };

  struct array_map {
          __uint(type, BPF_MAP_TYPE_ARRAY);
          __type(key, int);
          __type(value, struct val);
          __uint(max_entries, 10);
  } array_map SEC(".maps");

  __hidden struct val global_array[25];

  void bpf_prog(void *ctx)
  {
    struct val *map_val;
    struct some_kptr *p;
    int idx = 0;

    map_val = bpf_map_lookup_elem(&array_map, &idx);
    if (!map_val) { /* omitted */ }
    p = acquire_some_kptr();
    if (!p) { /* omitted */ }

    bpf_kptr_xchg(&map_val->ref_ptr, p);
    bpf_kptr_xchg(&global_array[20].ref_ptr, p);
  }

One would expect both bpf_kptr_xchg's to be possible, but currently
only the first one works. From BPF program writer's perspective this
is unexpected - the array map is an array with struct val elements, so
is global_array, so why the difference in behavior? The confusion is
not hypothetical - we stumbled onto this confusing situation while
developing scheduling BPF programs for the sched_ext project [0].

The key difference is that libbpf-style global vars like global_array
are implemented as single-elem MAP_TYPE_ARRAYs with the value type
holding the vars. e.g. from the verifier's perspective, our global
vars have a value type like

  struct bss_map_value {
    struct var global_array[25];
  };

Note: In BTF it's actually a DATASEC with VARs instead of a STRUCT
with some members. For brevity's sake, when this series refers to the
toplevel type being searched for fields, 'struct' is shorthand for
{struct, datasec} and similarly 'struct members' for {struct members,
datasec vars}.

When creating a map, the verifier looks for special fields in the
value type. Currently, only the type's direct members are examined.
For example, struct val above has ref_ptr field with a special tag,
while bss_map_value's global_array field is has the uninteresting
BTF_KIND_ARRAY type. The verifier doesn't currently examine the
array's elem type - struct var - to look for special fields. The same
issue occurs when examining struct fields as well, e.g. if struct val
were instead:

  struct val2 {
    int d;
    struct some_kptr __kptr *ref_ptr;
  };

  struct val {
    int x;
    struct val2 v;
  };

then xchg'ing with array_map would look like

  bpf_kptr_xchg(&mapval->v.ref_ptr, p);

and would fail for the same reason: as far as verifier is concerned,
struct val doesn't have a kptr field at that offset.

This series adds logic to descend into aggregate types while searching
for special fields. The meat of said search logic occurs in
btf_find_field and its helpers. btf_find_field returns a list of
struct btf_field_info's with information about each field's properties
and location. If the search finds btf_field_infos while descending
into an aggregate type, they're "flattened" into the btf_field_info
list. "Flattening" here means that these two types will have the same
btf_field_info list:

  struct x {
    int a;
    struct some_kptr __kptr *inner[100];
  };

  struct y {
    int a;
    struct some_kptr __kptr *inner_1;
    struct some_kptr __kptr *inner_2;
    /* inner 3-99 snipped */
    struct some_kptr __kptr *inner_100;
  };

Namely, struct x's btf_find_field search will produce btf_field_info
list as if it were struct y. There's no indication of hierarchy or
nestedness in struct x's btf_field_info list, just (offset, type)
pairs as if struct x directly contained 100 kptrs.

Multi-dimensional arrays are supported as well e.g.

  struct z {
    int a;
    struct some_kptr __kptr *inner[10][10];
  }

will have the same btf_field_infos as struct y above.

This flattening approach allows us to find fields in nested aggregate
types without recreating the BTF type graph, which is something we
want to avoid. If we instead modified btf_find_field to return a
hierarchical representation of field info, we'd also have to add
functionality similar to btf_struct_walk to the query path for
btf_records. The goal of btf_record is to answer questions about
special fields more efficiently than walking BTF type graph, so
keeping things simple and flat is preferable.

Some field types expect to be in the same struct as another field,
which doesn't play well with a flattened btf_field_info list. Consider
this example:

  struct root_and_lock {
    struct bpf_rb_root r;
    struct bpf_spin_lock l;
  };

  __hidden struct root_and_lock global_array[10];

global_array's btf_field_infos and btf_record will indicate that it
has 10 bpf_spin_lock fields, which will cause btf_parse_fields to fail
and no btf_record to be associated with global_array. In the future we
can modify struct btf_field_info to explicitly tie a specific
bpf_spin_lock field to a bpf_rb_root.

For fields without such expectations, though, this series' approach
works. Our current usecase for this is to find nested kptrs only, so
nested struct digging is only enabled to look for kptr fields in this
series. We can also consider enabling digging for bpf_{rb,list}_node
fields.

  [0]: https://lore.kernel.org/bpf/20230711011412.100319-1-tj@xxxxxxxxxx/

Dave Marchevsky (4):
  bpf: Fix btf_get_field_type to fail for multiple bpf_refcount fields
  bpf: Refactor btf_find_field with btf_field_info_search
  btf: Descend into structs and arrays during special field search
  selftests/bpf: Add tests exercising aggregate type BTF field search

 include/linux/bpf.h                           |   4 +-
 kernel/bpf/btf.c                              | 483 +++++++++++++-----
 .../selftests/bpf/prog_tests/array_kptr.c     |  12 +
 .../testing/selftests/bpf/progs/array_kptr.c  | 179 +++++++
 4 files changed, 550 insertions(+), 128 deletions(-)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/array_kptr.c
 create mode 100644 tools/testing/selftests/bpf/progs/array_kptr.c

-- 
2.34.1