On Sat, Apr 15, 2023 at 9:45 PM Kumar Kartikeya Dwivedi <memxor@xxxxxxxxx> wrote:
>
> > I think the check after bpf_call insn has to be no more than LD + JMP.
> > I was thinking whether we can do static_key like patching of the code.
> > bpf_throw will know all locations that should be converted from nop into check
> > and will do text_poke_bp before throwing.
> > Maybe we can consider offline unwind and release too. The verifier will prep
> > release tables and throw will execute them. BPF progs always have frame pointers,
> > so walking the stack back is relatively easy. Release per callsite is hard.
> >
>
> After some thought, I think offline unwinding is the way to go. That means no
> rewrites for the existing code, and we just offload all the cost to the slow
> path (bpf_throw call) as it should be. There would be no cost at runtime (except
> the conditional branch, which should be well predicted). The current approach
> was a bit simpler so I gave it a shot first but I think it's not the way to go.
> I will rework the set.

It seems so indeed. Offline unwinding is more complex for sure.
The challenge is to make it mostly arch independent.
Something like get_perf_callchain() followed by a lookup of IP->release_table
and a final setjmp() to the bpf callback in the last bpf frame.
is_bpf_text_address() will help.
We shouldn't gate it by HAVE_RELIABLE_STACKTRACE, since bpf prog stack
walking is reliable on all archs where JIT is enabled.
Unwinding won't work reliably in interpreted mode, though, and that's ok.
bpf_throw is a kfunc and it needs prog->jit_requested anyway.