On Tue, Mar 23, 2021 at 07:50:55AM -0700, Alexei Starovoitov wrote: > On 3/23/21 5:59 AM, Steven Rostedt wrote: > > On Mon, 22 Mar 2021 19:53:10 +0100 > > Jiri Olsa <jolsa@xxxxxxxxxx> wrote: > > > > > On Mon, Mar 22, 2021 at 05:24:26PM +0100, Jiri Olsa wrote: > > > > On Mon, Mar 22, 2021 at 12:32:05AM +0100, Jiri Olsa wrote: > > > > > On Tue, Mar 16, 2021 at 02:00:07PM -0700, Alexei Starovoitov wrote: > > > > > > From: Alexei Starovoitov <ast@xxxxxxxxxx> > > > > > > > > > > > > The fexit/fmod_ret programs can be attached to kernel functions that can sleep. > > > > > > The synchronize_rcu_tasks() will not wait for such tasks to complete. > > > > > > In such case the trampoline image will be freed and when the task > > > > > > wakes up the return IP will point to freed memory causing the crash. > > > > > > Solve this by adding percpu_ref_get/put for the duration of trampoline > > > > > > and separate trampoline vs its image life times. > > > > > > The "half page" optimization has to be removed, since > > > > > > first_half->second_half->first_half transition cannot be guaranteed to > > > > > > complete in deterministic time. Every trampoline update becomes a new image. > > > > > > The image with fmod_ret or fexit progs will be freed via percpu_ref_kill and > > > > > > call_rcu_tasks. Together they will wait for the original function and > > > > > > trampoline asm to complete. The trampoline is patched from nop to jmp to skip > > > > > > fexit progs. They are freed independently from the trampoline. The image with > > > > > > fentry progs only will be freed via call_rcu_tasks_trace+call_rcu_tasks which > > > > > > will wait for both sleepable and non-sleepable progs to complete. > > > > > > > > > > > > Reported-by: Andrii Nakryiko <andrii@xxxxxxxxxx> > > > > > > Fixes: fec56f5890d9 ("bpf: Introduce BPF trampoline") > > > > > > Signed-off-by: Alexei Starovoitov <ast@xxxxxxxxxx> > > > > > > Acked-by: Paul E. McKenney <paulmck@xxxxxxxxxx> # for RCU > > > > > > --- > > > > > > Without ftrace fix: > > > > > > https://patchwork.kernel.org/project/netdevbpf/patch/20210316195815.34714-1-alexei.starovoitov@xxxxxxxxx/ > > > > > > this patch will trigger warn in ftrace. > > > > > > > > > > > > arch/x86/net/bpf_jit_comp.c | 26 ++++- > > > > > > include/linux/bpf.h | 24 +++- > > > > > > kernel/bpf/bpf_struct_ops.c | 2 +- > > > > > > kernel/bpf/core.c | 4 +- > > > > > > kernel/bpf/trampoline.c | 218 +++++++++++++++++++++++++++--------- > > > > > > 5 files changed, 213 insertions(+), 61 deletions(-) > > > > > > > > > > hi, > > > > > I'm on bpf/master and I'm triggering warnings below when running together: > > > > > > > > > > # while :; do ./test_progs -t fentry_test ; done > > > > > # while :; do ./test_progs -t module_attach ; done > > > > > > > > hum, is it possible that we don't take module ref and it can get > > > > unloaded even if there's trampoline attach to it..? I can't see > > > > that in the code.. ftrace_release_mod can't fail ;-) > > > > > > when I get the module for each module trampoline, > > > I can no longer see those warnings (link for Steven): > > > https://lore.kernel.org/bpf/YFfXcqnksPsSe0Bv@krava/ > > > > > > Steven, > > > I might be missing something, but it looks like module > > > can be unloaded even if the trampoline (direct function) > > > is registered in it.. is that right? > > > > > > > Not with your patch below ;-) > > > > But yes, ftrace does not currently manage module text for direct calls, > > it's assumed that whoever attaches to the module text would do that. But > > I'm not adverse to the patch below. > > Jiri, > > could you please refactor your patch to do the same in bpf trampoline? right, we need to take care about bpf_arch_text_poke interface as well.. I'll resend > The selftest/bpf would be great as well. It can come as a follow up. > Let's fix the issue for bpf tree first. ok, will send test later jirka