RE: [PATCH v2 bpf-next 3/5] bpf: Support access to bpf map fields

John Fastabend <john.fastabend@xxxxxxxxx> · Sat, 20 Jun 2020 20:27:15 -0700

Andrey Ignatov wrote:
> There are multiple use-cases when it's convenient to have access to bpf
> map fields, both `struct bpf_map` and map type specific struct-s such as
> `struct bpf_array`, `struct bpf_htab`, etc.
> 
> For example while working with sock arrays it can be necessary to
> calculate the key based on map->max_entries (some_hash % max_entries).
> Currently this is solved by communicating max_entries via "out-of-band"
> channel, e.g. via additional map with known key to get info about target
> map. That works, but it is inconvenient and error-prone when working
> with many maps.
> 
> In other cases necessary data is dynamic (i.e. unknown at loading time)
> and it's impossible to get it at all. For example while working with a
> hash table it can be convenient to know how much capacity is already
> used (bpf_htab.count.counter for BPF_F_NO_PREALLOC case).
> 
> At the same time kernel knows this info and can provide it to bpf
> program.
> 
> Fill this gap by adding support to access bpf map fields from bpf
> program for both `struct bpf_map` and map type specific fields.
> 
> Support is implemented via btf_struct_access() so that a user can define
> their own `struct bpf_map` or map type specific struct in their program
> with only necessary fields and preserve_access_index attribute, cast a
> map to this struct and use a field.
> 
> For example:
> 
> 	struct bpf_map {
> 		__u32 max_entries;
> 	} __attribute__((preserve_access_index));
> 
> 	struct bpf_array {
> 		struct bpf_map map;
> 		__u32 elem_size;
> 	} __attribute__((preserve_access_index));
> 
> 	struct {
> 		__uint(type, BPF_MAP_TYPE_ARRAY);
> 		__uint(max_entries, 4);
> 		__type(key, __u32);
> 		__type(value, __u32);
> 	} m_array SEC(".maps");
> 
> 	SEC("cgroup_skb/egress")
> 	int cg_skb(void *ctx)
> 	{
> 		struct bpf_array *array = (struct bpf_array *)&m_array;
> 		struct bpf_map *map = (struct bpf_map *)&m_array;
> 
> 		/* .. use map->max_entries or array->map.max_entries .. */
> 	}
> 
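A minimal user-space sketch of the key calculation mentioned above
(some_hash % max_entries). The helper name slot_for_hash and the zero
guard are assumptions for illustration and are not part of the patch; in
a BPF program the second argument would come from map->max_entries.

```c
#include <stdint.h>

/* Hypothetical helper (not in the patch): fold a hash into a valid
 * index for a map that has max_entries slots.
 */
static uint32_t slot_for_hash(uint32_t hash, uint32_t max_entries)
{
	/* Assumed guard: avoid a modulo by zero for an empty map. */
	if (max_entries == 0)
		return 0;

	return hash % max_entries;
}
```

With max_entries read from the map itself, the out-of-band map that
carries the size is no longer needed.
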
> Similarly to other btf_struct_access() use-cases (e.g. struct tcp_sock
> in net/ipv4/bpf_tcp_ca.c) the patch allows access to any fields of
> corresponding struct. Only reading from map fields is supported.
> 
> For btf_struct_access() to work there should be a way to know btf id of
> a struct that corresponds to a map type. To get btf id there should be a
> way to get a stringified name of map-specific struct, such as
> "bpf_array", "bpf_htab", etc for a map type. Two new fields are added to
> `struct bpf_map_ops` to handle it:
> * .map_btf_name keeps a btf name of a struct returned by map_alloc();
> * .map_btf_id is used to cache btf id of that struct.
> 
> To make btf id calculation cheaper, the ids are calculated once while
> preparing btf_vmlinux and cached the same way as is done for the btf_id
> field of `struct bpf_func_proto`.
> 
> While calculating btf ids, struct names are NOT checked for collision.
> Collisions will be checked as part of the work to prepare btf ids used
> in the verifier at compile time, which should land soon. The only known
> collision for `struct bpf_htab` (kernel/bpf/hashtab.c vs
> net/core/sock_map.c) was fixed earlier.
> 
> Both new fields .map_btf_name and .map_btf_id must be set for a map type
> for the feature to work. If neither is set for a map type, the verifier
> will return ENOTSUPP on an attempt to access map_ptr of the
> corresponding type. If just one of them is set, it's a verifier
> misconfiguration.
> 
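The three cases described above (both fields set, neither set, only one
set) can be modeled in user space as follows. This is a sketch under
stated assumptions, not the verifier code from the patch: the struct,
the function name, the field types, and the use of EINVAL for the
misconfiguration case are all hypothetical.

```c
#include <stddef.h>

/* ENOTSUPP is a kernel-internal errno (524) that user-space <errno.h>
 * does not define, so this model carries its own constants.
 */
#define MODEL_ENOTSUPP 524
#define MODEL_EINVAL   22	/* assumed stand-in for "misconfiguration" */

/* Hypothetical model of the two new bpf_map_ops fields. */
struct model_map_ops {
	const char *map_btf_name;	/* e.g. "bpf_array" */
	int *map_btf_id;		/* cache slot for the resolved btf id */
};

/* Returns 0 if map_ptr access is supported for this map type,
 * -MODEL_ENOTSUPP if neither field is set, and -MODEL_EINVAL if
 * exactly one of them is set.
 */
static int model_check_map_ptr_access(const struct model_map_ops *ops)
{
	int has_name = ops->map_btf_name != NULL;
	int has_id = ops->map_btf_id != NULL;

	if (has_name && has_id)
		return 0;
	if (!has_name && !has_id)
		return -MODEL_ENOTSUPP;

	return -MODEL_EINVAL;
}
```
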
> Only `struct bpf_array` for BPF_MAP_TYPE_ARRAY and `struct bpf_htab` for
> BPF_MAP_TYPE_HASH are supported by this patch. Other map types will be
> supported separately.
> 
> The feature is available only for CONFIG_DEBUG_INFO_BTF=y and gated by
> perfmon_capable() so that unpriv programs won't have access to bpf map
> fields.
> 
> Signed-off-by: Andrey Ignatov <rdna@xxxxxx>
> ---
>  include/linux/bpf.h                           |  9 ++
>  include/linux/bpf_verifier.h                  |  1 +
>  kernel/bpf/arraymap.c                         |  3 +
>  kernel/bpf/btf.c                              | 40 +++++++++
>  kernel/bpf/hashtab.c                          |  3 +
>  kernel/bpf/verifier.c                         | 82 +++++++++++++++++--
>  .../selftests/bpf/verifier/map_ptr_mixing.c   |  2 +-
>  7 files changed, 131 insertions(+), 9 deletions(-)
> 
> diff --git a/include/linux/bpf.h b/include/linux/bpf.h
> index 07052d44bca1..1e1501ee53ce 100644

LGTM, but is there any reason not to allow this with bpf_capable()? It
looks useful for building load balancers, which might not be related to
CAP_PERFMON.

Otherwise,

Acked-by: John Fastabend <john.fastabend@xxxxxxxxx>
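
To illustrate the question about the gate: in the kernel,
perfmon_capable() accepts CAP_PERFMON or CAP_SYS_ADMIN, while
bpf_capable() accepts CAP_BPF or CAP_SYS_ADMIN. The user-space model
below is only a sketch; the model_* names and the bitmask representation
of a capability set are assumptions, and the capability numbers are the
ones from include/uapi/linux/capability.h.

```c
#include <stdbool.h>
#include <stdint.h>

/* Capability numbers as defined in the uapi header; prefixed to avoid
 * clashing with the real macros.
 */
#define MODEL_CAP_SYS_ADMIN 21
#define MODEL_CAP_PERFMON   38
#define MODEL_CAP_BPF       39

/* Hypothetical: a capability set is modeled as a 64-bit mask. */
static bool model_capable(uint64_t caps, int cap)
{
	return (caps >> cap) & 1;
}

/* Mirrors the gate used by the patch. */
static bool model_perfmon_capable(uint64_t caps)
{
	return model_capable(caps, MODEL_CAP_PERFMON) ||
	       model_capable(caps, MODEL_CAP_SYS_ADMIN);
}

/* Mirrors the gate suggested in this reply. */
static bool model_bpf_capable(uint64_t caps)
{
	return model_capable(caps, MODEL_CAP_BPF) ||
	       model_capable(caps, MODEL_CAP_SYS_ADMIN);
}
```

A loader that holds only CAP_BPF passes the second check but not the
first, which is the load balancer case raised above.
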