BPF program attached on BPF map function (read,write) is not working?

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Currently, BPF program attached on BPF map function (read,write) is not called.
To be specific, the bpf kprobe program on 'htab_map_get_next_key'
doesn't called at all. To test this behavior, you can try ./tracex6
from the 'samples/bpf'. (It does not work properly at all)

By using 'git bisect', found the problem is derived from below commit.(v5.0-rc3)
commit 7c4cd051add3 ("bpf: Fix syscall's stackmap lookup potential deadlock")
The code below is an excerpt of only the problematic code from the entire code.

   diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
   index b155cd17c1bd..8577bb7f8be6 100644
   --- a/kernel/bpf/syscall.c
   +++ b/kernel/bpf/syscall.c
   @@ -713,8 +713,13 @@ static int map_lookup_elem(union bpf_attr *attr)

           if (bpf_map_is_dev_bound(map)) {
                   err = bpf_map_offload_lookup_elem(map, key, value);
                   goto done;
           }

           preempt_disable();
   +      this_cpu_inc(bpf_prog_active);
           if (map->map_type == BPF_MAP_TYPE_PERCPU_HASH ||
               map->map_type == BPF_MAP_TYPE_LRU_PERCPU_HASH) {
                   err = bpf_percpu_hash_copy(map, key, value);
           } else if (map->map_type == BPF_MAP_TYPE_PERCPU_ARRAY) {
                   err = bpf_percpu_array_copy(map, key, value);
   @@ -744,7 +749,10 @@ static int map_lookup_elem(union bpf_attr *attr)
                   }
                   rcu_read_unlock();
           }
   +      this_cpu_dec(bpf_prog_active);
           preempt_enable();

   done:
           if (err)
                   goto free_value;

As you can see from this snippet, bpf_prog_active value (flag I guess?)
increases and decreases within the code snippet. And this action create a
problem where bpf program on map is not called.

   # kernel/trace/bpf_trace.c:74
   unsigned int trace_call_bpf(struct trace_event_call *call, void *ctx)
   {
       ...
        preempt_disable();

        if (unlikely(__this_cpu_inc_return(bpf_prog_active) != 1)) {
                /*
                 * since some bpf program is already running on this cpu,
                 * don't call into another bpf program (same or different)
                 * and don't send kprobe event into ring-buffer,
                 * so return zero here
                 */
                ret = 0;
                goto out;
        }
       ...
       ret = BPF_PROG_RUN_ARRAY_CHECK(call->prog_array, ctx, BPF_PROG_RUN);

   out:
       __this_cpu_dec(bpf_prog_active);
       preempt_enable();


So from trace_call_bpf() at kernel/trace/bpf_trace.c check whether
bpf_prog_active is 1, and if it is, it skips the execution of bpf program.

Back to latest Kernel 5.6, this this_cpu_{inc|dec}() has been wrapped with
bpf_{enable|disable}_instrumentation().

   # include/linux/bpf.h
   static inline void bpf_enable_instrumentation(void)
   {
           if (IS_ENABLED(CONFIG_PREEMPT_RT))
                   this_cpu_dec(bpf_prog_active);
           else
                   __this_cpu_dec(bpf_prog_active);
           migrate_enable();
   }

And the functions which uses this wrapper are described below.

   bpf_map_update_value
   bpf_map_copy_value
   map_delete_elem
   generic_map_delete_batch

Which is basically most of the map operation.

So, I think this 'unable to attach bpf program on BPF map function (read,write)'
is a bug. Or is it desired action?

If it is a bug, bpf_{enable|disable}_instrumentation() should only
cover stackmap
as the upper commit intended. Not sure but adding another flag for
lock might work?

Or if this is an desired action, this should be covered at
documentation with a limitation
and tracex6 sample has to be removed.



[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux