On Mon, Nov 11, 2024 at 5:29 PM Sergey Senozhatsky <senozhatsky@xxxxxxxxxxxx> wrote: > > On (24/11/11 09:49), Andrii Nakryiko wrote: > > > On (24/08/29 10:42), Andrii Nakryiko wrote: > > > > Now that build ID related internals in kernel/bpf/stackmap.c can be used > > > > both in sleepable and non-sleepable contexts, we need to add additional > > > > rcu_read_lock()/rcu_read_unlock() protection around fetching > > > > perf_callchain_entry, but with the refactoring in previous commit it's > > > > now pretty straightforward. We make sure to do rcu_read_unlock (in > > > > sleepable mode only) right before stack_map_get_build_id_offset() call > > > > which can sleep. By that time we don't have any more use of > > > > perf_callchain_entry. > > > > > > Shouldn't this be backported to stable kernels? It seems that those still > > > do suspicious-RCU deference: > > > > > > __bpf_get_stack() > > > get_perf_callchain() > > > perf_callchain_user() > > > perf_get_guest_cbs() > > > > Do you see this issue in practice or have some repro? > > __bpf_get_stack() shouldn't be callable from sleepable BPF programs > > until my patch set, so I don't think there is anything to be > > backported. But maybe I'm missing something, which is why I'm asking > > whether this is a conclusion drawn from source code analysis, or there > > was actually a report somewhere. > > I see a syzkaller report (internal) which triggers this call chain > and RCU-usage error. Not sure how practical that is, but syzkaller > was able to hit it (the report I'm looking at is against 5.15, but > __bpf_get_stack()-wise I don't see any differences between 5.15, > 6.1 and 6.6) Hmm.. thinking about this some more, I suspect we do allow bpf_get_stack() and bpf_get_stackid() from sleepable uprobes, so yeah, it's possible to run into this. But for backporting this into older kernels, we'd need to prepare a separate patch that would fix the RCU issue, but wouldn't add sleepable build ID parts.