On Sat, Apr 15, 2023 at 10:17 AM Steven Rostedt <rostedt@xxxxxxxxxxx> wrote:
>
> From: "Steven Rostedt (Google)" <rostedt@xxxxxxxxxxx>
>
> The ftrace_test_recursion_trylock() also disables preemption. This is not
> required, but was a clean up as every place that called it also disabled
> preemption, and making the two tightly coupled appeared to make the code
> simpler.
>
> But the recursion protection can be used for other purposes that do not
> require disabling preemption. As the recursion bits are attached to the
> task_struct, it follows the task, so there's no need for preemption being
> disabled.
>
> Add test_recursion_try_acquire/release() functions to be used generically,
> and separate it from being associated with ftrace. It also removes the
> "lock" name, as there is no lock happening. Keeping the "lock" for the
> ftrace version is better, as it at least differentiates that preemption is
> being disabled (hence, "locking the CPU").
>
> Link: https://lore.kernel.org/linux-trace-kernel/20230321020103.13494-1-laoar.shao@xxxxxxxxx/
>
> Cc: Yafang Shao <laoar.shao@xxxxxxxxx>
> Signed-off-by: Steven Rostedt (Google) <rostedt@xxxxxxxxxxx>

Acked-by: Yafang Shao <laoar.shao@xxxxxxxxx>

+Alexei, bpf

Thanks Steven. I have almost finished replacing prog->active with
test_recursion_try_{acquire,release}[1], but I still need to do more
tests. During verification I noticed something abnormal: when I ran the
bpf selftests after the replacement, the fentry recursion test failed[2].
The test result is as follows,

main:PASS:skel_open_and_load 0 nsec
main:PASS:skel_attach 0 nsec
main:PASS:pass1 == 0 0 nsec
main:PASS:pass1 == 1 0 nsec
main:PASS:pass1 == 2 0 nsec
main:PASS:pass2 == 0 0 nsec
main:FAIL:pass2 == 1 unexpected pass2 == 1: actual 2 != expected 1   [0]
main:FAIL:pass2 == 2 unexpected pass2 == 2: actual 4 != expected 2
main:PASS:get_prog_info 0 nsec
main:PASS:recursion_misses 0 nsec

The reason is that the bpf_map_delete_elem() in this fentry SEC()[2]
hits the first if-condition[3] but fails to hit the second
if-condition[4], so the first self-call is not considered recursion,
and the value of 'pass2' ends up being 2 [0]. Illustrated as follows,

SEC("fentry/htab_map_delete_elem")
    pass2++;                               <<<< Turns out to be 1 after this operation.
    bpf_map_delete_elem(&hash2, &key);
        SEC("fentry/htab_map_delete_elem") <<<< not recursion
            pass2++;                       <<<< Turns out to be 2 after this operation.
            bpf_map_delete_elem(&hash2, &key);
                SEC("fentry/htab_map_delete_elem") <<<< RECURSION!!

That said, if a function can call itself, the first call is not
considered recursion. Is that expected?

One thing I can be sure of is that prog->active can't handle the
interrupted case[5]: if an interrupt happens, it is considered
recursion. But the bpf_map_delete_elem() in this fentry SEC() seems to
still be in process context.

[1]. https://lore.kernel.org/bpf/CALOAHbACzCwu-VeMczEJw8A4WFgkL-uQDS1NkcVR2pqEMZyAQQ@xxxxxxxxxxxxxx/
[2]. https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git/tree/tools/testing/selftests/bpf/progs/recursion.c#n38
[3]. https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git/tree/include/linux/trace_recursion.h#n166
[4]. https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git/tree/include/linux/trace_recursion.h#n176
[5].
https://lore.kernel.org/bpf/20230409220239.0fcf6738@xxxxxxxxxxxxxxxxxxxx/

> ---
>  include/linux/trace_recursion.h | 47 ++++++++++++++++++++++++---------
>  kernel/trace/ftrace.c           |  2 ++
>  2 files changed, 37 insertions(+), 12 deletions(-)
>
> diff --git a/include/linux/trace_recursion.h b/include/linux/trace_recursion.h
> index d48cd92d2364..80de2ee7b4c3 100644
> --- a/include/linux/trace_recursion.h
> +++ b/include/linux/trace_recursion.h
> @@ -150,9 +150,6 @@ extern void ftrace_record_recursion(unsigned long ip, unsigned long parent_ip);
>  # define trace_warn_on_no_rcu(ip)	false
>  #endif
>
> -/*
> - * Preemption is promised to be disabled when return bit >= 0.
> - */
>  static __always_inline int trace_test_and_set_recursion(unsigned long ip, unsigned long pip,
>  							int start)
>  {
> @@ -182,18 +179,11 @@ static __always_inline int trace_test_and_set_recursion(unsign
>  	val |= 1 << bit;
>  	current->trace_recursion = val;
>  	barrier();
> -
> -	preempt_disable_notrace();
> -
>  	return bit;
>  }
>
> -/*
> - * Preemption will be enabled (if it was previously enabled).
> - */
>  static __always_inline void trace_clear_recursion(int bit)
>  {
> -	preempt_enable_notrace();
>  	barrier();
>  	trace_recursion_clear(bit);
>  }
> @@ -205,12 +195,18 @@ static __always_inline void trace_clear_recursion(int bit)
>   * tracing recursed in the same context (normal vs interrupt),
>   *
>   * Returns: -1 if a recursion happened.
> - *           >= 0 if no recursion.
> + *           >= 0 if no recursion and preemption will be disabled.
>   */
>  static __always_inline int ftrace_test_recursion_trylock(unsigned long ip,
>  							 unsigned long parent_ip)
>  {
> -	return trace_test_and_set_recursion(ip, parent_ip, TRACE_FTRACE_START);
> +	int bit;
> +
> +	bit = trace_test_and_set_recursion(ip, parent_ip, TRACE_FTRACE_START);
> +	if (unlikely(bit < 0))
> +		return bit;
> +	preempt_disable_notrace();
> +	return bit;
>  }
>
>  /**
> @@ -220,6 +216,33 @@ static __always_inline int ftrace_test_recursion_trylock(unsigned long ip,
>   * This is used at the end of a ftrace callback.
>   */
>  static __always_inline void ftrace_test_recursion_unlock(int bit)
> +{
> +	preempt_enable_notrace();
> +	trace_clear_recursion(bit);
> +}
> +
> +/**
> + * test_recursion_try_acquire - tests for recursion in same context
> + *
> + * This will detect recursion of a function.
> + *
> + * Returns: -1 if a recursion happened.
> + *           >= 0 if no recursion
> + */
> +static __always_inline int test_recursion_try_acquire(unsigned long ip,
> +						      unsigned long parent_ip)
> +{
> +	return trace_test_and_set_recursion(ip, parent_ip, TRACE_FTRACE_START);
> +}
> +
> +/**
> + * test_recursion_release - called after a success of test_recursion_try_acquire()
> + * @bit: The return of a successful test_recursion_try_acquire()
> + *
> + * This releases the recursion lock taken by a non-negative return call
> + * by test_recursion_try_acquire().
> + */
> +static __always_inline void test_recursion_release(int bit)
>  {
>  	trace_clear_recursion(bit);
>  }
>
> diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
> index db8532a4d5c8..1b117ab057ed 100644
> --- a/kernel/trace/ftrace.c
> +++ b/kernel/trace/ftrace.c
> @@ -7299,6 +7299,7 @@ __ftrace_ops_list_func(unsigned long ip, unsigned long parent_ip,
>  	if (bit < 0)
>  		return;
>
> +	preempt_disable();
>  	do_for_each_ftrace_op(op, ftrace_ops_list) {
>  		/* Stub functions don't need to be called nor tested */
>  		if (op->flags & FTRACE_OPS_FL_STUB)
> @@ -7320,6 +7321,7 @@ __ftrace_ops_list_func(unsigned long ip, unsigned long parent_ip,
>  		}
>  	} while_for_each_ftrace_op(op);
>  out:
> +	preempt_enable();
>  	trace_clear_recursion(bit);
>  }
>
> --
> 2.39.2

--
Regards
Yafang