On Mon, 17 Jul 2023 at 23:15, Alexei Starovoitov <alexei.starovoitov@xxxxxxxxx> wrote: > > On Mon, Jul 17, 2023 at 9:36 AM Kumar Kartikeya Dwivedi > <memxor@xxxxxxxxx> wrote: > > > > On Sat, 15 Jul 2023 at 04:09, Alexei Starovoitov > > <alexei.starovoitov@xxxxxxxxx> wrote: > > > > > > On Thu, Jul 13, 2023 at 08:02:29AM +0530, Kumar Kartikeya Dwivedi wrote: > > > > Now that we allow exception throwing using bpf_throw kfunc, it can > > > > appear as the final instruction in a prog. When this happens, and we > > > > begin to unwind the stack using arch_bpf_stack_walk, the instruction > > > > pointer (IP) may appear to lie outside the JITed instructions. This > > > > happens because the return address is the instruction following the > > > > call, but the bpf_throw never returns to the program, so the JIT > > > > considers instruction ending at the bpf_throw call as the final JITed > > > > instruction and end of the jited_length for the program. > > > > > > > > This becomes a problem when we search the IP using is_bpf_text_address > > > > and bpf_prog_ksym_find, both of which use bpf_ksym_find under the hood, > > > > and it rightfully considers addr == ksym.end to be outside the program's > > > > boundaries. > > > > > > > > Insert a dummy 'int3' instruction which will never be hit to bump the > > > > jited_length and allow us to handle programs with their final > > > > isntruction being a call to bpf_throw. > > > > > > > > Signed-off-by: Kumar Kartikeya Dwivedi <memxor@xxxxxxxxx> > > > > --- > > > > arch/x86/net/bpf_jit_comp.c | 11 +++++++++++ > > > > include/linux/bpf.h | 2 ++ > > > > 2 files changed, 13 insertions(+) > > > > > > > > diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c > > > > index 8d97c6a60f9a..052230cc7f50 100644 > > > > --- a/arch/x86/net/bpf_jit_comp.c > > > > +++ b/arch/x86/net/bpf_jit_comp.c > > > > @@ -1579,6 +1579,17 @@ st: if (is_imm8(insn->off)) > > > > } > > > > if (emit_call(&prog, func, image + addrs[i - 1] + offs)) > > > > return -EINVAL; > > > > + /* Similar to BPF_EXIT_INSN, call for bpf_throw may be > > > > + * the final instruction in the program. Insert an int3 > > > > + * following the call instruction so that we can still > > > > + * detect pc to be part of the bpf_prog in > > > > + * bpf_ksym_find, otherwise when this is the last > > > > + * instruction (as allowed by verifier, similar to exit > > > > + * and jump instructions), pc will be == ksym.end, > > > > + * leading to bpf_throw failing to unwind the stack. > > > > + */ > > > > + if (func == (u8 *)&bpf_throw) > > > > + EMIT1(0xCC); /* int3 */ > > > > > > Probably worth explaining that this happens because bpf_throw is marked > > > __attribute__((noreturn)) and compiler can emit it last without BPF_EXIT insn. > > > Meaing the program might not have BPF_EXIT at all. > > > > Yes, sorry about omitting that. I will add it to the commit message in v2. > > > > > > > > I wonder though whether this self-inflicted pain is worth it. > > > May be it shouldn't be marked as noreturn. > > > What do we gain by marking? > > > > It felt like the obvious thing to do to me. The cost on the kernel > > side is negligible (atleast in my opinion), we just have to allow it > > as final instruction in the program. If it's heavily used it allows > > the compiler to better optimize the code (marking anything after it > > unreachable, no need to save registers etc., although this may not be > > a persuasive point for you). > > "no need to save registers"... "optimize"... that's the thing that worries me. > I think it's better to drop noreturn attribute. > bpf has implicit prolog/epilogue that only apply to bpf_exit insn. > bpf_call insn that doesn't return is exploiting undefined logic > in the compiler, since we never fully clarified our hidden prologue/epilogue > rules. Saying it differently, bpf_tail_call is also noreturn, > but if we mark it as such all kinds of things will break. > We still need to add alloca(). It doesn't play well with the current BPF ISA. > I think it's better to treat 'noreturn' as broken in the compiler, > since its behavior may change. > Ok, I think then let's drop this patch and the noreturn attribute on bpf_throw. > > Regardless of this noreturn attribute, I was thinking whether we > > should always emit an extra instruction so that any IP (say one past > > last instruction) we get for a BPF prog can always be seen as > > belonging to it. It probably is only a problem surfaced by this > > bpf_throw call at the end, but I was wondering whether doing it > > unconditionally makes sense. > > I think it's a corner case of this 'noreturn' from bpf_call logic. > The bpf prog is padded with 0xcc before and after already. > What you're suggesting is to add one of 0xcc to the body of the prog. > That doesn't sound right. Actually this patch was added because I caught the case where the reported ip during unwinding was lying outside jited_length (== ksym.end). So including an extra instruction would prevent that. But it's true it's a side effect of the noreturn attribute, it wouldn't occur otherwise. Let's drop this patch then.