Exactly, and rethook_trampoline_handler suffers from the same problem. I've
posted two patches that fix fprobe+rethook by switching kprobe and rethook
to the notrace versions of preempt_{disable,enable}:

[1] https://lore.kernel.org/all/20230513081656.375846-1-zegao@xxxxxxxxxxx/T/#u
[2] https://lore.kernel.org/all/20230513090548.376522-1-zegao@xxxxxxxxxxx/T/#u

Even worse, the bpf callback introduces more such use cases; it is
typically organized as follows to guard the lifetime of bpf-related
resources (per-cpu access or the trampoline):

    migrate_disable()
    rcu_read_lock()
    ...
    bpf_prog_run()
    ...
    rcu_read_unlock()
    migrate_enable()

Solving such bugs for good may require introducing a fprobe_blacklist and
a bpf_kprobe_blacklist, as Jiri and Yonghong suggested: bpf kprobe works
at a different (higher and more constrained) level than fprobe and ftrace,
and we cannot blindly mark functions that external subsystems use in their
tracer callbacks (migrate_disable, __rcu_read_lock, etc.) as notrace
without risking semantic breakage. I will try to implement these ideas
later.

Thanks,
Ze

On Sat, May 13, 2023 at 12:18 PM Steven Rostedt <rostedt@xxxxxxxxxxx> wrote:
>
> On Fri, 12 May 2023 07:29:02 -0700
> Yonghong Song <yhs@xxxxxxxx> wrote:
>
> > A fprobe_blacklist might make sense indeed as fprobe and kprobe are
> > quite different... Thanks for working on this.
>
> Hmm, I think I see the problem:
>
> fprobe_kprobe_handler() {
>   kprobe_busy_begin() {
>     preempt_disable() {
>       preempt_count_add() {  <-- trace
>         fprobe_kprobe_handler() {
>           [ wash, rinse, repeat, CRASH!!! ]
>
> Either kprobe_busy_begin() needs to use the preempt_disable_notrace()
> version, or fprobe_kprobe_handler() needs a
> ftrace_test_recursion_trylock() call.
>
> -- Steve
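
For concreteness, a minimal sketch of the direction [1] and [2] take on
the kprobe side (see the lore links above for the actual patches; this
mirrors kprobe_busy_begin/end() from kernel/kprobes.c with the notrace
swap applied):

	#include <linux/kprobes.h>
	#include <linux/preempt.h>

	void kprobe_busy_begin(void)
	{
		struct kprobe_ctlblk *kcb;

		/*
		 * The notrace variant updates the preempt count inline,
		 * so the function tracer cannot hook the update and
		 * recurse back into the fprobe handler.
		 */
		preempt_disable_notrace();
		__this_cpu_write(current_kprobe, &kprobe_busy);
		kcb = get_kprobe_ctlblk();
		kcb->kprobe_status = KPROBE_HIT_ACTIVE;
	}

	void kprobe_busy_end(void)
	{
		__this_cpu_write(current_kprobe, NULL);
		preempt_enable_notrace();
	}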
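
To make the bpf guard pattern above concrete, here is a hedged sketch of
a caller-side helper (run_bpf_prog_guarded is a made-up name for
illustration; migrate_disable/enable(), rcu_read_lock/unlock() and
bpf_prog_run() are the real kernel APIs, and each of them is a traceable
function, which is exactly what makes attaching a kprobe-mode fprobe to
them dangerous):

	#include <linux/bpf.h>
	#include <linux/preempt.h>
	#include <linux/rcupdate.h>

	static u32 run_bpf_prog_guarded(const struct bpf_prog *prog,
					const void *ctx)
	{
		u32 ret;

		migrate_disable();	/* pin to this CPU for per-cpu data */
		rcu_read_lock();	/* keep prog and trampoline alive */

		ret = bpf_prog_run(prog, ctx);

		rcu_read_unlock();
		migrate_enable();

		return ret;
	}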
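
As for the second option Steve mentions, a hedged sketch of what a
ftrace_test_recursion_trylock()-based guard could look like in
fprobe_kprobe_handler() (simplified from kernel/trace/fprobe.c, not a
complete patch; the trylock/unlock helpers come from
<linux/trace_recursion.h>):

	#include <linux/fprobe.h>
	#include <linux/ftrace.h>
	#include <linux/trace_recursion.h>

	static void fprobe_kprobe_handler(unsigned long ip,
					  unsigned long parent_ip,
					  struct ftrace_ops *ops,
					  struct ftrace_regs *fregs)
	{
		int bit;

		/* Returns < 0 if this context already holds the lock. */
		bit = ftrace_test_recursion_trylock(ip, parent_ip);
		if (bit < 0)
			return;

		kprobe_busy_begin();
		fprobe_handler(ip, parent_ip, ops, fregs);	/* real work */
		kprobe_busy_end();

		ftrace_test_recursion_unlock(bit);
	}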