On Mon, Dec 9, 2024 at 2:45 PM Andrii Nakryiko <andrii.nakryiko@xxxxxxxxx> wrote: > > Ok, weeding through the perf/uprobe plumbing for BPF, I think we avoid > this problem with uprobe BPF link because uprobe_unregister_sync() > waits for RCU Tasks Trace GP, and so once we finish uprobe > unregistration, we have a guarantee that there is no more uprobe that > might dereference our BPF program. (I might have thought about this > problem when fixing BPF link for sleepable tracepoints, but I missed > the legacy perf_event attach/detach case). > > With legacy perf event perf_event_detach_bpf_prog() we don't do any of > that, we just NULL out pointer and do bpf_prog_put(), not caring > whether this is uprobe, kprobe, or tracepoint... > > So one way to solve this is to either teach > perf_event_detach_bpf_prog() to delay bpf_prog_put() until after RCU > Tasks Trace GP (which is what we do with bpf_prog_array, but not the > program itself), since this is a legacy prog detach api I would just add sync_rcu_tt there. It's a backportable one line change. > or add prog->aux->sleepable_hook flag in addition to > prog->aux->sleepable, and then inside bpf_prog_put() check > (prog->aux->sleepable || prog->aux->sleepable_hook) and do RCU Tasks > Trace delay (in addition to normal call_rcu()). That sounds like more work and if we introduce sleepable_hook we would better generalize it and rely on it everywhere. Which makes it even more work and certainly not for stable. > Third alternative would be to have something like > bpf_prog_put_sleepable() (just like we have > bpf_prog_array_free_sleepable()), where this would do additional > call_rcu_tasks_trace() even if BPF program itself isn't sleepable. Sounds like less work than 2, but conceptually it's the same as 1. Just call_rcu_tt vs sync_rcu_tt. And we can wait just fine in that code path. So I'd go with 1.