I performed a bisection and this issue starts with commit a363d27cdbc2 ("tracing: Allow system call tracepoints to handle page faults") which introduces this change. > + * > + * With @syscall=0, the tracepoint callback array dereference is > + * protected by disabling preemption. > + * With @syscall=1, the tracepoint callback array dereference is > + * protected by Tasks Trace RCU, which allows probes to handle page > + * faults. > */ > #define __DO_TRACE(name, args, cond, syscall) \ > do { \ > @@ -204,11 +212,17 @@ static inline struct tracepoint *tracepoint_ptr_deref(tracepoint_ptr_t *p) > if (!(cond)) \ > return; \ > \ > - preempt_disable_notrace(); \ > + if (syscall) \ > + rcu_read_lock_trace(); \ > + else \ > + preempt_disable_notrace(); \ > \ > __DO_TRACE_CALL(name, TP_ARGS(args)); \ > \ > - preempt_enable_notrace(); \ > + if (syscall) \ > + rcu_read_unlock_trace(); \ > + else \ > + preempt_enable_notrace(); \ > } while (0) Link: https://lore.kernel.org/bpf/20241009010718.2050182-6-mathieu.desnoyers@xxxxxxxxxxxx/ I reproduced the bug locally by running syz-execprog inside a QEMU VM. > ./syz-execprog -repeat=0 -procs=5 ./repro.syz.txt I /think/ what is happening is that with this change preemption may now occur leading to a scenario where the RCU grace period is insufficient in a few places where call_rcu() is used. In other words, there are a few scenarios where call_rcu_tasks_trace() should be used instead to prevent a use-after-free bug when a preempted tracepoint call tries to access a program, link, etc. that was freed. It seems the syzkaller program induces page faults while attaching raw tracepoints to sys_enter making preemption more likely to occur. kernel/tracepoint.c =================== > ... > static inline void release_probes(struct tracepoint_func *old) > { > ... > call_rcu(&tp_probes->rcu, rcu_free_old_probes); <-- Here > ... > } > ... kernel/bpf/syscall.c ==================== > static void __bpf_prog_put_noref(struct bpf_prog *prog, bool deferred) > { > bpf_prog_kallsyms_del_all(prog); > btf_put(prog->aux->btf); > module_put(prog->aux->mod); > kvfree(prog->aux->jited_linfo); > kvfree(prog->aux->linfo); > kfree(prog->aux->kfunc_tab); > if (prog->aux->attach_btf) > btf_put(prog->aux->attach_btf); > > if (deferred) { > if (prog->sleepable) <------ HERE: New condition needed? > call_rcu_tasks_trace(&prog->aux->rcu, __bpf_prog_put_rcu); > else > call_rcu(&prog->aux->rcu, __bpf_prog_put_rcu); > } else { > __bpf_prog_put_rcu(&prog->aux->rcu); > } > } > > static void bpf_link_free(struct bpf_link *link) > { > const struct bpf_link_ops *ops = link->ops; > bool sleepable = false; > > bpf_link_free_id(link->id); > if (link->prog) { > sleepable = link->prog->sleepable; > /* detach BPF program, clean up used resources */ > ops->release(link); > bpf_prog_put(link->prog); > } > if (ops->dealloc_deferred) { > /* schedule BPF link deallocation; if underlying BPF program > * is sleepable, we need to first wait for RCU tasks trace > * sync, then go through "classic" RCU grace period > */ > if (prog->sleepable) <------ HERE: New condition needed? > call_rcu_tasks_trace(&link->rcu, bpf_link_defer_dealloc_mult_rcu_gp); > else > call_rcu(&link->rcu, bpf_link_defer_dealloc_rcu_gp); > } else if (ops->dealloc) > ops->dealloc(link); > } After patching things locally to ensure that call_rcu_tasks_trace() is always used in these three places I was unable to induce a KASAN bug to occur whereas before it happened pretty much every time I ran ./sys-execprog within a minute or so. I'm a bit unsure about the actual conditions under which call_rcu_tasks_trace() should be used here though. Should there perhaps be another condition such as `preemptable` which is used to determine if call_rcu_tasks_trace() or call_rcu() should be used to free links/programs? Is there any harm in just using call_rcu_tasks_trace() every time in combination with rcu_trace_implies_rcu_gp() like it is in bpf_link_defer_dealloc_mult_rcu_gp()? > static void bpf_link_defer_dealloc_mult_rcu_gp(struct rcu_head *rcu)? > { > if (rcu_trace_implies_rcu_gp()) > bpf_link_defer_dealloc_rcu_gp(rcu); > else > call_rcu(rcu, bpf_link_defer_dealloc_rcu_gp); > } - Jordan