Re: [PATCH bpf-next 2/4] bpf, x64: Fix tailcall hierarchy

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Jiri Olsa wrote:
> On Thu, Jan 04, 2024 at 08:15:36PM -0800, Alexei Starovoitov wrote:
> > On Thu, Jan 4, 2024 at 6:23 AM Leon Hwang <hffilwlqm@xxxxxxxxx> wrote:
> > >
> > >
> > > diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
> > > index fe30b9ebb8de4..67fa337fc2e0c 100644
> > > --- a/arch/x86/net/bpf_jit_comp.c
> > > +++ b/arch/x86/net/bpf_jit_comp.c
> > > @@ -259,7 +259,7 @@ struct jit_context {
> > >  /* Number of bytes emit_patch() needs to generate instructions */
> > >  #define X86_PATCH_SIZE         5
> > >  /* Number of bytes that will be skipped on tailcall */
> > > -#define X86_TAIL_CALL_OFFSET   (11 + ENDBR_INSN_SIZE)
> > > +#define X86_TAIL_CALL_OFFSET   (22 + ENDBR_INSN_SIZE)
> > >
> > >  static void push_r12(u8 **pprog)
> > >  {
> > > @@ -406,14 +406,21 @@ static void emit_prologue(u8 **pprog, u32 stack_depth, bool ebpf_from_cbpf,
> > >          */
> > >         emit_nops(&prog, X86_PATCH_SIZE);
> > >         if (!ebpf_from_cbpf) {
> > > -               if (tail_call_reachable && !is_subprog)
> > > +               if (tail_call_reachable && !is_subprog) {
> > >                         /* When it's the entry of the whole tailcall context,
> > >                          * zeroing rax means initialising tail_call_cnt.
> > >                          */
> > > -                       EMIT2(0x31, 0xC0); /* xor eax, eax */
> > > -               else
> > > -                       /* Keep the same instruction layout. */
> > > -                       EMIT2(0x66, 0x90); /* nop2 */
> > > +                       EMIT2(0x31, 0xC0);       /* xor eax, eax */
> > > +                       EMIT1(0x50);             /* push rax */
> > > +                       /* Make rax as ptr that points to tail_call_cnt. */
> > > +                       EMIT3(0x48, 0x89, 0xE0); /* mov rax, rsp */
> > > +                       EMIT1_off32(0xE8, 2);    /* call main prog */
> > > +                       EMIT1(0x59);             /* pop rcx, get rid of tail_call_cnt */
> > > +                       EMIT1(0xC3);             /* ret */
> > > +               } else {
> > > +                       /* Keep the same instruction size. */
> > > +                       emit_nops(&prog, 13);
> > > +               }
> > 
> > I'm afraid the extra call breaks stack unwinding and many other things.
> > The proper frame needs to be setup (push rbp; etc)
> > and 'leave' + emit_return() is used.
> > Plain 'ret' is not ok.
> > x86_call_depth_emit_accounting() needs to be used too.
> > That will make X86_TAIL_CALL_OFFSET adjustment very complicated.
> > Also the fix doesn't address the stack size issue.
> > We shouldn't allow all the extra frames at run-time.
> > 
> > The tail_cnt_ptr approach is interesting but too heavy,
> > since arm64, s390 and other JITs would need to repeat it with equally
> > complicated calculations in TAIL_CALL_OFFSET.
> > 
> > The fix should really be thought through for all JITs. Not just x86.
> > 
> > I'm thinking whether we should do the following instead:
> > 
> > diff --git a/kernel/bpf/arraymap.c b/kernel/bpf/arraymap.c
> > index 0bdbbbeab155..0b45571559be 100644
> > --- a/kernel/bpf/arraymap.c
> > +++ b/kernel/bpf/arraymap.c
> > @@ -910,7 +910,7 @@ static void *prog_fd_array_get_ptr(struct bpf_map *map,
> >         if (IS_ERR(prog))
> >                 return prog;
> > 
> > -       if (!bpf_prog_map_compatible(map, prog)) {
> > +       if (!bpf_prog_map_compatible(map, prog) || prog->aux->func_cnt) {
> >                 bpf_prog_put(prog);
> >                 return ERR_PTR(-EINVAL);
> >         }
> > 
> > This will stop stack growth, but it will break a few existing tests.
> > I feel it's a price worth paying.
> > 
> > John, Daniel,
> > 
> > do you see anything breaking on cilium side if we disallow
> > progs with subprogs to be inserted in prog_array ?
> 
> FWIW tetragon should be ok with this.. we use few subprograms in
> hubble, but most of them are not called from tail called programs

We actually do this in some of the l7 parsers where we try to use
subprogs as much as possible but still use prog_array for calls
that might end up being recursive.

I'll think about it Monday my first thought is some refactoring and
using bpf_loop or other helpers could resolve this.

If its just bpf-next and not backported onto stable we can probably
figure something out.

> 
> jirka
> 
> > 
> > Other alternatives?
> > 
> 




[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux