On Tue, Nov 02, 2021 at 02:14:29AM +0000, Joe Burton wrote:
> From: Joe Burton <jevburton@xxxxxxxxxx>
>
> This is the third version of a patch series implementing map tracing.
>
> Map tracing enables executing BPF programs upon BPF map updates. This
> might be useful to perform upgrades of stateful programs; e.g., tracing
> programs can propagate changes to maps that occur during an upgrade
> operation.
>
> This version uses trampoline hooks to provide the capability.
> fentry/fexit/fmod_ret programs can attach to two new functions:
>         int bpf_map_trace_update_elem(struct bpf_map* map, void* key,
>                 void* val, u32 flags);
>         int bpf_map_trace_delete_elem(struct bpf_map* map, void* key);
>
> These hooks work as intended for the following map types:
>         BPF_MAP_TYPE_ARRAY
>         BPF_MAP_TYPE_PERCPU_ARRAY
>         BPF_MAP_TYPE_HASH
>         BPF_MAP_TYPE_PERCPU_HASH
>         BPF_MAP_TYPE_LRU_HASH
>         BPF_MAP_TYPE_LRU_PERCPU_HASH
>
> The only guarantee about the semantics of these hooks is that they execute
> before the operation takes place. We cannot call them with locks held
> because the hooked program might try to acquire the same locks. Thus they
> may be invoked in situations where the traced map is not ultimately
> updated.
>
> The original proposal suggested exposing a function for each
> (map type) x (access type). The problem I encountered is that e.g.
> percpu hashtables use a custom function for some access types
> (htab_percpu_map_update_elem) but a common function for others
> (htab_map_delete_elem). Thus a userspace application would have to
> maintain a unique list of functions to attach to for each map type;
> moreover, this list could change across kernel versions. Map tracing is
> easier to use with fewer functions, at the cost of tracing programs
> being triggered more times.

Good point about htab_percpu.
The patches look good to me.
Few minor bits:
- pls don't use #pragma once.
  There was a discussion not too long ago about it and the conclusion
  was that let's not use it.
  It slipped into a few selftests/bpf, but let's not introduce more users.
- noinline is not needed in prototype.
- bpf_probe_read is deprecated. Pls use bpf_probe_read_kernel.

and thanks for detailed patch 3.

> To prevent the compiler from optimizing out the calls to my tracing
> functions, I use the asm("") trick described in gcc's
> __attribute__((noinline)) documentation. Experimentally, this trick
> works with clang as well.

I think noinline is enough. I don't think you need that asm in there.

In parallel let's figure out how to do:
SEC("fentry/bpf_map_trace_update_elem")
int BPF_PROG(copy_on_write__update,
             struct bpf_map *map,
             struct allow_reads_key__old *key,
             void *value, u64 map_flags)

It kinda sucks that bpf_probe_read_kernel is necessary to read key/values.
It would be much nicer to be able to specify the exact struct for the key and
access it directly.
The verifier does this already for map iterator.
It's 'void *' on the kernel side while iterator prog can cast this pointer
to specific 'struct key *' and access it directly.
See bpf_iter_reg->ctx_arg_info and btf_ctx_access().

For fentry into bpf_map_trace_update_elem it's a bit more challenging,
since it will be called for all maps and there is no way to statically
check that specific_map->key_size is within prog->aux->max_rdonly_access.

Maybe we can do a dynamic cast helper (similar to those that cast sockets)
that will check for key_size at run-time?
Another alternative is to allow 'void *' -> PTR_TO_BTF_ID conversion
and let inlined probe_read do the job.
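
To make the dynamic cast idea a bit more concrete, here is a rough
user-space model of the run-time check. It is only a sketch and not kernel
code: struct model_map, the fields of struct allow_reads_key,
model_key_cast() and trace_update() are all hypothetical names made up for
illustration. No such helper exists today, and a real one would also need
verifier support. The point is just that the "cast" returns NULL unless the
map's key_size covers the access the prog wants to make.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* User-space stand-in for struct bpf_map; only the sizes are modeled. */
struct model_map {
	uint32_t key_size;
	uint32_t value_size;
};

/* Hypothetical key layout that one particular tracing prog expects. */
struct allow_reads_key {
	uint32_t pid;
	uint32_t ino;
};

/*
 * Hypothetical run-time "cast": return the key pointer only when the
 * map's key is at least 'want' bytes, so a typed access through the
 * result stays in bounds. Returns NULL otherwise.
 */
static const void *model_key_cast(const struct model_map *map,
				  const void *key, size_t want)
{
	assert(want != 0);	/* a zero-sized access is a caller bug */
	if (!map || !key || map->key_size < want)
		return NULL;
	return key;
}

/*
 * Models the body of a tracing prog attached to the update hook.
 * Returns 1 if the key was understood and read, 0 if the map was skipped.
 */
static int trace_update(const struct model_map *map, const void *key,
			uint32_t *pid_out)
{
	const struct allow_reads_key *k;

	k = model_key_cast(map, key, sizeof(*k));
	if (!k)
		return 0;	/* key too small for our struct: ignore this map */
	*pid_out = k->pid;	/* direct access, no probe_read-style copy */
	return 1;
}
```

Since the hook fires for every map, a prog written this way would simply
skip maps whose key is too small for its struct instead of reading past the
end of the key. Note the check is on size only; it cannot tell two
different key layouts of the same size apart.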