Re: [PATCH bpf-next v1 07/10] bpf: Ensure IP is within prog->jited_length for bpf_throw calls

Kumar Kartikeya Dwivedi <memxor@xxxxxxxxx> · Tue, 18 Jul 2023 00:39:31 +0530

On Mon, 17 Jul 2023 at 23:15, Alexei Starovoitov
<alexei.starovoitov@xxxxxxxxx> wrote:
>
> On Mon, Jul 17, 2023 at 9:36 AM Kumar Kartikeya Dwivedi
> <memxor@xxxxxxxxx> wrote:
> >
> > On Sat, 15 Jul 2023 at 04:09, Alexei Starovoitov
> > <alexei.starovoitov@xxxxxxxxx> wrote:
> > >
> > > On Thu, Jul 13, 2023 at 08:02:29AM +0530, Kumar Kartikeya Dwivedi wrote:
> > > > Now that we allow exception throwing using bpf_throw kfunc, it can
> > > > appear as the final instruction in a prog. When this happens, and we
> > > > begin to unwind the stack using arch_bpf_stack_walk, the instruction
> > > > pointer (IP) may appear to lie outside the JITed instructions. This
> > > > happens because the return address is the instruction following the
> > > > call, but the bpf_throw never returns to the program, so the JIT
> > > > considers instruction ending at the bpf_throw call as the final JITed
> > > > instruction and end of the jited_length for the program.
> > > >
> > > > This becomes a problem when we search the IP using is_bpf_text_address
> > > > and bpf_prog_ksym_find, both of which use bpf_ksym_find under the hood,
> > > > and it rightfully considers addr == ksym.end to be outside the program's
> > > > boundaries.
> > > >
> > > > Insert a dummy 'int3' instruction which will never be hit to bump the
> > > > jited_length and allow us to handle programs with their final
> > > > isntruction being a call to bpf_throw.
> > > >
> > > > Signed-off-by: Kumar Kartikeya Dwivedi <memxor@xxxxxxxxx>
> > > > ---
> > > >  arch/x86/net/bpf_jit_comp.c | 11 +++++++++++
> > > >  include/linux/bpf.h         |  2 ++
> > > >  2 files changed, 13 insertions(+)
> > > >
> > > > diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
> > > > index 8d97c6a60f9a..052230cc7f50 100644
> > > > --- a/arch/x86/net/bpf_jit_comp.c
> > > > +++ b/arch/x86/net/bpf_jit_comp.c
> > > > @@ -1579,6 +1579,17 @@ st:                    if (is_imm8(insn->off))
> > > >                       }
> > > >                       if (emit_call(&prog, func, image + addrs[i - 1] + offs))
> > > >                               return -EINVAL;
> > > > +                     /* Similar to BPF_EXIT_INSN, call for bpf_throw may be
> > > > +                      * the final instruction in the program. Insert an int3
> > > > +                      * following the call instruction so that we can still
> > > > +                      * detect pc to be part of the bpf_prog in
> > > > +                      * bpf_ksym_find, otherwise when this is the last
> > > > +                      * instruction (as allowed by verifier, similar to exit
> > > > +                      * and jump instructions), pc will be == ksym.end,
> > > > +                      * leading to bpf_throw failing to unwind the stack.
> > > > +                      */
> > > > +                     if (func == (u8 *)&bpf_throw)
> > > > +                             EMIT1(0xCC); /* int3 */
> > >
> > > Probably worth explaining that this happens because bpf_throw is marked
> > > __attribute__((noreturn)) and compiler can emit it last without BPF_EXIT insn.
> > > Meaing the program might not have BPF_EXIT at all.
> >
> > Yes, sorry about omitting that. I will add it to the commit message in v2.
> >
> > >
> > > I wonder though whether this self-inflicted pain is worth it.
> > > May be it shouldn't be marked as noreturn.
> > > What do we gain by marking?
> >
> > It felt like the obvious thing to do to me. The cost on the kernel
> > side is negligible (atleast in my opinion), we just have to allow it
> > as final instruction in the program. If it's heavily used it allows
> > the compiler to better optimize the code (marking anything after it
> > unreachable, no need to save registers etc., although this may not be
> > a persuasive point for you).
>
> "no need to save registers"... "optimize"... that's the thing that worries me.
> I think it's better to drop noreturn attribute.
> bpf has implicit prolog/epilogue that only apply to bpf_exit insn.
> bpf_call insn that doesn't return is exploiting undefined logic
> in the compiler, since we never fully clarified our hidden prologue/epilogue
> rules. Saying it differently, bpf_tail_call is also noreturn,
> but if we mark it as such all kinds of things will break.
> We still need to add alloca(). It doesn't play well with the current BPF ISA.
> I think it's better to treat 'noreturn' as broken in the compiler,
> since its behavior may change.
>

Ok, I think then let's drop this patch and the noreturn attribute on bpf_throw.

> > Regardless of this noreturn attribute, I was thinking whether we
> > should always emit an extra instruction so that any IP (say one past
> > last instruction) we get for a BPF prog can always be seen as
> > belonging to it. It probably is only a problem surfaced by this
> > bpf_throw call at the end, but I was wondering whether doing it
> > unconditionally makes sense.
>
> I think it's a corner case of this 'noreturn' from bpf_call logic.
> The bpf prog is padded with 0xcc before and after already.
> What you're suggesting is to add one of 0xcc to the body of the prog.
> That doesn't sound right.

Actually this patch was added because I caught the case where the
reported ip during unwinding was lying outside jited_length (==
ksym.end).
So including an extra instruction would prevent that. But it's true
it's a side effect of the noreturn attribute, it wouldn't occur
otherwise.

Let's drop this patch then.