Ping?

On Thu, Apr 9, 2020 at 12:26 PM Daniel T. Lee <danieltimlee@xxxxxxxxx> wrote:
>
> Currently, a BPF program attached to a BPF map operation (read, write) is
> not called. Specifically, a bpf kprobe program on 'htab_map_get_next_key'
> is not called at all. To reproduce this behavior, try ./tracex6 from
> 'samples/bpf' (it does not work properly at all).
>
> Using 'git bisect', I found the problem derives from the commit below (v5.0-rc3):
> commit 7c4cd051add3 ("bpf: Fix syscall's stackmap lookup potential deadlock")
> The diff below is an excerpt of only the problematic code.
>
> diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
> index b155cd17c1bd..8577bb7f8be6 100644
> --- a/kernel/bpf/syscall.c
> +++ b/kernel/bpf/syscall.c
> @@ -713,8 +713,13 @@ static int map_lookup_elem(union bpf_attr *attr)
>
>  	if (bpf_map_is_dev_bound(map)) {
>  		err = bpf_map_offload_lookup_elem(map, key, value);
>  		goto done;
>  	}
>
>  	preempt_disable();
> +	this_cpu_inc(bpf_prog_active);
>  	if (map->map_type == BPF_MAP_TYPE_PERCPU_HASH ||
>  	    map->map_type == BPF_MAP_TYPE_LRU_PERCPU_HASH) {
>  		err = bpf_percpu_hash_copy(map, key, value);
>  	} else if (map->map_type == BPF_MAP_TYPE_PERCPU_ARRAY) {
>  		err = bpf_percpu_array_copy(map, key, value);
> @@ -744,7 +749,10 @@ static int map_lookup_elem(union bpf_attr *attr)
>  		}
>  		rcu_read_unlock();
>  	}
> +	this_cpu_dec(bpf_prog_active);
>  	preempt_enable();
>
>  done:
>  	if (err)
>  		goto free_value;
>
> As you can see from this snippet, the bpf_prog_active value (a per-CPU
> counter) is incremented before the map lookup and decremented after it.
> This creates a problem where a bpf program attached to a map operation is
> never called.
>
> # kernel/trace/bpf_trace.c:74
> unsigned int trace_call_bpf(struct trace_event_call *call, void *ctx)
> {
> 	...
> 	preempt_disable();
>
> 	if (unlikely(__this_cpu_inc_return(bpf_prog_active) != 1)) {
> 		/*
> 		 * since some bpf program is already running on this cpu,
> 		 * don't call into another bpf program (same or different)
> 		 * and don't send kprobe event into ring-buffer,
> 		 * so return zero here
> 		 */
> 		ret = 0;
> 		goto out;
> 	}
> 	...
> 	ret = BPF_PROG_RUN_ARRAY_CHECK(call->prog_array, ctx, BPF_PROG_RUN);
>
>  out:
> 	__this_cpu_dec(bpf_prog_active);
> 	preempt_enable();
>
> So trace_call_bpf() in kernel/trace/bpf_trace.c increments bpf_prog_active
> and checks whether the result is 1; if it is not (i.e. the counter was
> already raised), it skips execution of the bpf program.
>
> Back in the latest kernel 5.6, this this_cpu_{inc|dec}() pair has been
> wrapped in bpf_{enable|disable}_instrumentation().
>
> # include/linux/bpf.h
> static inline void bpf_enable_instrumentation(void)
> {
> 	if (IS_ENABLED(CONFIG_PREEMPT_RT))
> 		this_cpu_dec(bpf_prog_active);
> 	else
> 		__this_cpu_dec(bpf_prog_active);
> 	migrate_enable();
> }
>
> The functions that use this wrapper are listed below:
>
> bpf_map_update_value
> bpf_map_copy_value
> map_delete_elem
> generic_map_delete_batch
>
> That is basically every map operation.
>
> So, I think this 'unable to attach a bpf program to a BPF map operation
> (read, write)' behavior is a bug. Or is it the desired behavior?
>
> If it is a bug, bpf_{enable|disable}_instrumentation() should only cover
> stackmap, as the commit above intended. I'm not sure, but perhaps adding a
> separate flag for the lock might work?
>
> Or, if this is the desired behavior, it should be documented as a
> limitation, and the tracex6 sample has to be removed.