Consider the following BPF program: struct val { int d; struct some_kptr __kptr *ref_ptr; }; struct array_map { __uint(type, BPF_MAP_TYPE_ARRAY); __type(key, int); __type(value, struct val); __uint(max_entries, 10); } array_map SEC(".maps"); __hidden struct val global_array[25]; void bpf_prog(void *ctx) { struct val *map_val; struct some_kptr *p; int idx = 0; map_val = bpf_map_lookup_elem(&array_map, &idx); if (!map_val) { /* omitted */ } p = acquire_some_kptr(); if (!p) { /* omitted */ } bpf_kptr_xchg(&map_val->ref_ptr, p); bpf_kptr_xchg(&global_array[20].ref_ptr, p); } One would expect both bpf_kptr_xchg's to be possible, but currently only the first one works. From BPF program writer's perspective this is unexpected - the array map is an array with struct val elements, so is global_array, so why the difference in behavior? The confusion is not hypothetical - we stumbled onto this confusing situation while developing scheduling BPF programs for the sched_ext project [0]. The key difference is that libbpf-style global vars like global_array are implemented as single-elem MAP_TYPE_ARRAYs with the value type holding the vars. e.g. from the verifier's perspective, our global vars have a value type like struct bss_map_value { struct var global_array[25]; }; Note: In BTF it's actually a DATASEC with VARs instead of a STRUCT with some members. For brevity's sake, when this series refers to the toplevel type being searched for fields, 'struct' is shorthand for {struct, datasec} and similarly 'struct members' for {struct members, datasec vars}. When creating a map, the verifier looks for special fields in the value type. Currently, only the type's direct members are examined. For example, struct val above has ref_ptr field with a special tag, while bss_map_value's global_array field is has the uninteresting BTF_KIND_ARRAY type. The verifier doesn't currently examine the array's elem type - struct var - to look for special fields. The same issue occurs when examining struct fields as well, e.g. if struct val were instead: struct val2 { int d; struct some_kptr __kptr *ref_ptr; }; struct val { int x; struct val2 v; }; then xchg'ing with array_map would look like bpf_kptr_xchg(&mapval->v.ref_ptr, p); and would fail for the same reason: as far as verifier is concerned, struct val doesn't have a kptr field at that offset. This series adds logic to descend into aggregate types while searching for special fields. The meat of said search logic occurs in btf_find_field and its helpers. btf_find_field returns a list of struct btf_field_info's with information about each field's properties and location. If the search finds btf_field_infos while descending into an aggregate type, they're "flattened" into the btf_field_info list. "Flattening" here means that these two types will have the same btf_field_info list: struct x { int a; struct some_kptr __kptr *inner[100]; }; struct y { int a; struct some_kptr __kptr *inner_1; struct some_kptr __kptr *inner_2; /* inner 3-99 snipped */ struct some_kptr __kptr *inner_100; }; Namely, struct x's btf_find_field search will produce btf_field_info list as if it were struct y. There's no indication of hierarchy or nestedness in struct x's btf_field_info list, just (offset, type) pairs as if struct x directly contained 100 kptrs. Multi-dimensional arrays are supported as well e.g. struct z { int a; struct some_kptr __kptr *inner[10][10]; } will have the same btf_field_infos as struct y above. This flattening approach allows us to find fields in nested aggregate types without recreating the BTF type graph, which is something we want to avoid. If we instead modified btf_find_field to return a hierarchical representation of field info, we'd also have to add functionality similar to btf_struct_walk to the query path for btf_records. The goal of btf_record is to answer questions about special fields more efficiently than walking BTF type graph, so keeping things simple and flat is preferable. Some field types expect to be in the same struct as another field, which doesn't play well with a flattened btf_field_info list. Consider this example: struct root_and_lock { struct bpf_rb_root r; struct bpf_spin_lock l; }; __hidden struct root_and_lock global_array[10]; global_array's btf_field_infos and btf_record will indicate that it has 10 bpf_spin_lock fields, which will cause btf_parse_fields to fail and no btf_record to be associated with global_array. In the future we can modify struct btf_field_info to explicitly tie a specific bpf_spin_lock field to a bpf_rb_root. For fields without such expectations, though, this series' approach works. Our current usecase for this is to find nested kptrs only, so nested struct digging is only enabled to look for kptr fields in this series. We can also consider enabling digging for bpf_{rb,list}_node fields. [0]: https://lore.kernel.org/bpf/20230711011412.100319-1-tj@xxxxxxxxxx/ Dave Marchevsky (4): bpf: Fix btf_get_field_type to fail for multiple bpf_refcount fields bpf: Refactor btf_find_field with btf_field_info_search btf: Descend into structs and arrays during special field search selftests/bpf: Add tests exercising aggregate type BTF field search include/linux/bpf.h | 4 +- kernel/bpf/btf.c | 483 +++++++++++++----- .../selftests/bpf/prog_tests/array_kptr.c | 12 + .../testing/selftests/bpf/progs/array_kptr.c | 179 +++++++ 4 files changed, 550 insertions(+), 128 deletions(-) create mode 100644 tools/testing/selftests/bpf/prog_tests/array_kptr.c create mode 100644 tools/testing/selftests/bpf/progs/array_kptr.c -- 2.34.1