On Sat, Apr 15, 2023 at 9:33 PM Yafang Shao <laoar.shao@xxxxxxxxx> wrote: > > On Sat, Apr 15, 2023 at 10:17 AM Steven Rostedt <rostedt@xxxxxxxxxxx> wrote: > > > > From: "Steven Rostedt (Google)" <rostedt@xxxxxxxxxxx> > > > > The ftrace_test_recursion_trylock() also disables preemption. This is not > > required, but was a clean up as every place that called it also disabled > > preemption, and making the two tightly coupled appeared to make the code > > simpler. > > > > But the recursion protection can be used for other purposes that do not > > require disabling preemption. As the recursion bits are attached to the > > task_struct, it follows the task, so there's no need for preemption being > > disabled. > > > > Add test_recursion_try_acquire/release() functions to be used generically, > > and separate it from being associated with ftrace. It also removes the > > "lock" name, as there is no lock happening. Keeping the "lock" for the > > ftrace version is better, as it at least differentiates that preemption is > > being disabled (hence, "locking the CPU"). > > > > Link: https://lore.kernel.org/linux-trace-kernel/20230321020103.13494-1-laoar.shao@xxxxxxxxx/ > > > > Cc: Yafang Shao <laoar.shao@xxxxxxxxx> > > Signed-off-by: Steven Rostedt (Google) <rostedt@xxxxxxxxxxx> > > Acked-by: Yafang Shao <laoar.shao@xxxxxxxxx> > > +Alexei, bpf > > Thanks Steven. > I almost finished replacing prog->active with > test_recursion_try_{acquire,release}[1], and yet I need to do more > tests. > > In my verification, I find that something abnormal happens. When I ran > bpf self tests after the replacement, I found the fentry recursion > test failed[2]. The test result as follows, > > main:PASS:skel_open_and_load 0 nsec > main:PASS:skel_attach 0 nsec > main:PASS:pass1 == 0 0 nsec > main:PASS:pass1 == 1 0 nsec > main:PASS:pass1 == 2 0 nsec > main:PASS:pass2 == 0 0 nsec > main:FAIL:pass2 == 1 unexpected pass2 == 1: actual 2 != expected 1 [0] > main:FAIL:pass2 == 2 unexpected pass2 == 2: actual 4 != expected 2 > main:PASS:get_prog_info 0 nsec > main:PASS:recursion_misses 0 nsec > I don't have a clear idea why TRACE_CTX_TRANSITION must be needed, but it seems we have to do below code for the fentry test case, diff --git a/include/linux/trace_recursion.h b/include/linux/trace_recursion.h index d48cd92d2364..f6b4601dd604 100644 --- a/include/linux/trace_recursion.h +++ b/include/linux/trace_recursion.h @@ -163,21 +163,8 @@ static __always_inline int trace_test_and_set_recursion(unsigned long ip, unsign return -1; bit = trace_get_context_bit() + start; - if (unlikely(val & (1 << bit))) { - /* - * If an interrupt occurs during a trace, and another trace - * happens in that interrupt but before the preempt_count is - * updated to reflect the new interrupt context, then this - * will think a recursion occurred, and the event will be dropped. - * Let a single instance happen via the TRANSITION_BIT to - * not drop those events. - */ - bit = TRACE_CTX_TRANSITION + start; - if (val & (1 << bit)) { - do_ftrace_record_recursion(ip, pip); - return -1; - } - } + if (unlikely(val & (1 << bit))) + return -1; val |= 1 << bit; current->trace_recursion = val; > The reason is that the bpf_map_delete_elem() in this fentry SEC()[2] > will hit the first if-condition[3] but fail to hit the second > if-condition[4], so it won't be considered as recursion at the first > time. So the value 'pass2' will be 2[0]. Illustrated as follows, > > SEC("fentry/htab_map_delete_elem") > pass2++; <<<< Turn out to be 1 after this operation. > bpf_map_delete_elem(&hash2, &key); > SEC("fentry/htab_map_delete_elem") <<<< not recursion > pass2++; <<<< Turn out to be 2 after this operation. > bpf_map_delete_elem(&hash2, &key); > SEC("fentry/htab_map_delete_elem") <<<< RECURSION!! > > That said, if a function can call itself, the first call won't be > considered as recursion. Is that expected ? > > One thing I can be sure of is that prog->active can't handle the > interrupted case[5]. If the interrupt happens, it will be considered > as a recursion. > But it seems that bpf_map_delete_elem() in this fentry SEC() is still > in the process context. > > [1]. https://lore.kernel.org/bpf/CALOAHbACzCwu-VeMczEJw8A4WFgkL-uQDS1NkcVR2pqEMZyAQQ@xxxxxxxxxxxxxx/ > [2]. https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git/tree/tools/testing/selftests/bpf/progs/recursion.c#n38 > [3]. https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git/tree/include/linux/trace_recursion.h#n166 > [4]. https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git/tree/include/linux/trace_recursion.h#n176 > [5]. https://lore.kernel.org/bpf/20230409220239.0fcf6738@xxxxxxxxxxxxxxxxxxxx/ > > > --- > > include/linux/trace_recursion.h | 47 ++++++++++++++++++++++++--------- > > kernel/trace/ftrace.c | 2 ++ > > 2 files changed, 37 insertions(+), 12 deletions(-) > > > > diff --git a/include/linux/trace_recursion.h b/include/linux/trace_recursion.h > > index d48cd92d2364..80de2ee7b4c3 100644 > > --- a/include/linux/trace_recursion.h > > +++ b/include/linux/trace_recursion.h > > @@ -150,9 +150,6 @@ extern void ftrace_record_recursion(unsigned long ip, unsigned long parent_ip); > > # define trace_warn_on_no_rcu(ip) false > > #endif > > > > -/* > > - * Preemption is promised to be disabled when return bit >= 0. > > - */ > > static __always_inline int trace_test_and_set_recursion(unsigned long ip, unsigned long pip, > > int start) > > { > > @@ -182,18 +179,11 @@ static __always_inline int trace_test_and_set_recursion(unsigned long ip, unsign > > val |= 1 << bit; > > current->trace_recursion = val; > > barrier(); > > - > > - preempt_disable_notrace(); > > - > > return bit; > > } > > > > -/* > > - * Preemption will be enabled (if it was previously enabled). > > - */ > > static __always_inline void trace_clear_recursion(int bit) > > { > > - preempt_enable_notrace(); > > barrier(); > > trace_recursion_clear(bit); > > } > > @@ -205,12 +195,18 @@ static __always_inline void trace_clear_recursion(int bit) > > * tracing recursed in the same context (normal vs interrupt), > > * > > * Returns: -1 if a recursion happened. > > - * >= 0 if no recursion. > > + * >= 0 if no recursion and preemption will be disabled. > > */ > > static __always_inline int ftrace_test_recursion_trylock(unsigned long ip, > > unsigned long parent_ip) > > { > > - return trace_test_and_set_recursion(ip, parent_ip, TRACE_FTRACE_START); > > + int bit; > > + > > + bit = trace_test_and_set_recursion(ip, parent_ip, TRACE_FTRACE_START); > > + if (unlikely(bit < 0)) > > + return bit; > > + preempt_disable_notrace(); > > + return bit; > > } > > > > /** > > @@ -220,6 +216,33 @@ static __always_inline int ftrace_test_recursion_trylock(unsigned long ip, > > * This is used at the end of a ftrace callback. > > */ > > static __always_inline void ftrace_test_recursion_unlock(int bit) > > +{ > > + preempt_enable_notrace(); > > + trace_clear_recursion(bit); > > +} > > + > > +/** > > + * test_recursion_try_acquire - tests for recursion in same context > > + * > > + * This will detect recursion of a function. > > + * > > + * Returns: -1 if a recursion happened. > > + * >= 0 if no recursion > > + */ > > +static __always_inline int test_recursion_try_acquire(unsigned long ip, > > + unsigned long parent_ip) > > +{ > > + return trace_test_and_set_recursion(ip, parent_ip, TRACE_FTRACE_START); > > +} > > + > > +/** > > + * test_recursion_release - called after a success of test_recursion_try_acquire() > > + * @bit: The return of a successful test_recursion_try_acquire() > > + * > > + * This releases the recursion lock taken by a non-negative return call > > + * by test_recursion_try_acquire(). > > + */ > > +static __always_inline void test_recursion_release(int bit) > > { > > trace_clear_recursion(bit); > > } > > diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c > > index db8532a4d5c8..1b117ab057ed 100644 > > --- a/kernel/trace/ftrace.c > > +++ b/kernel/trace/ftrace.c > > @@ -7299,6 +7299,7 @@ __ftrace_ops_list_func(unsigned long ip, unsigned long parent_ip, > > if (bit < 0) > > return; > > > > + preempt_disable(); > > do_for_each_ftrace_op(op, ftrace_ops_list) { > > /* Stub functions don't need to be called nor tested */ > > if (op->flags & FTRACE_OPS_FL_STUB) > > @@ -7320,6 +7321,7 @@ __ftrace_ops_list_func(unsigned long ip, unsigned long parent_ip, > > } > > } while_for_each_ftrace_op(op); > > out: > > + preempt_enable(); > > trace_clear_recursion(bit); > > } > > > > -- > > 2.39.2 > > > > > -- > Regards > Yafang -- Regards Yafang