Re: [PATCH bpf-next 0/7] Fix MAX_TAIL_CALL_CNT handling in eBPF JITs

Johan Almbladh <johan.almbladh@xxxxxxxxxxxxxxxxx> · Mon, 16 Aug 2021 09:17:55 +0200

On Thu, Aug 12, 2021 at 6:37 PM Paul Chaignon <paul.chaignon@xxxxxxxxx> wrote:
> On Mon, Aug 09, 2021 at 11:34:30AM +0200, Johan Almbladh wrote:
> > A new test of tail call count limiting revealed that the interpreter
> > did in fact allow up to MAX_TAIL_CALL_CNT + 1 tail calls, whereas the
> > x86 JITs stopped at the intended MAX_TAIL_CALL_CNT. The interpreter was
> > fixed in commit b61a28cf11d61f512172e673b8f8c4a6c789b425 ("bpf: Fix
> > off-by-one in tail call count limiting"). This patch set fixes all
> > arch-specific JITs except for RISC-V.
>
> I'm a bit surprised by this because I had previously tested the tail
> call limit of several JIT compilers and found it to be 33 (i.e.,
> allowing chains of up to 34 programs). I've just extended a test program
> I had to validate this again on the x86-64 JIT and found a limit of 33
> tail calls again [1].

Hmm, that was surprising. I have been working on a MIPS32 JIT, and as
a part of that I have been extending the in-kernel test suite in
lib/test_bpf.c. The additional tests include a suite for testing tail
calls and associated error paths. The tests were merged to bpf-next
[1].

The tail call limit test is a very simple BPF program that increments
R1, sets R0 to R1, and then calls itself again with a tail call. Since
the program is called with R1=0, the return value R0 will then be 1 +
number of tail calls executed. When I ran this on x86 I got the
following result.

Interpreter: 34
x86_64 JIT: 33
i386 JIT: 33
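
As a rough illustration of what the test measures, here is a small
userspace model. It is only a sketch, not the actual test program or any
interpreter/JIT code: the names run_tail_call_chain, CHECK_GT and
CHECK_GE are made up for this example, and it assumes the limit check
compares the counter against MAX_TAIL_CALL_CNT (32) before incrementing
it. Under that assumption, a "greater than" check yields R0 = 34 and a
"greater or equal" check yields R0 = 33, the same two values as above.

```c
#include <assert.h>

/* Hypothetical userspace model of the tail call limit test.
 * This is NOT kernel code; all names below are illustrative only. */
#define MAX_TAIL_CALL_CNT 32

enum limit_check {
	CHECK_GT,	/* refuse the tail call when cnt >  MAX_TAIL_CALL_CNT */
	CHECK_GE,	/* refuse the tail call when cnt >= MAX_TAIL_CALL_CNT */
};

/* Models the test program: R1 += 1; R0 = R1; tail call to itself.
 * Returns R0 at the point where the tail call is finally refused,
 * i.e. 1 + the number of tail calls actually executed. */
int run_tail_call_chain(enum limit_check check)
{
	int tail_call_cnt = 0;	/* assumed to start at zero */
	int r0 = 0, r1 = 0;	/* the program is entered with R1 = 0 */

	for (;;) {
		r1 += 1;
		r0 = r1;

		/* Limit check happens before the counter is incremented. */
		if (check == CHECK_GT ? tail_call_cnt > MAX_TAIL_CALL_CNT
				      : tail_call_cnt >= MAX_TAIL_CALL_CNT)
			return r0;

		tail_call_cnt++;	/* the tail call is taken */
	}
}

/* Self-check of the model against the two values reported above. */
void check_model(void)
{
	assert(run_tail_call_chain(CHECK_GT) == 34);	/* 33 tail calls */
	assert(run_tail_call_chain(CHECK_GE) == 33);	/* 32 tail calls */
}
```

Calling check_model() from a main() is enough to exercise both variants.
The only difference between the two results is the comparison operator,
which is the kind of off-by-one being discussed here.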

So, the interpreter and the x86 JITs behaved differently. Since R0 is
1 + the number of tail calls, the interpreter allowed 33 tail calls and
the x86 JITs 32. It was then decided to change the interpreter to allow
32 tail calls to match the behaviour of the x86 JITs [2]. As a follow-up
to that, I tested the other JITs except RISC-V in the same way, and
found that they too allowed one more tail call than the now-updated [3]
interpreter. This patch set updates the behaviour of those JITs as well.

[1] https://lore.kernel.org/bpf/20210809091829.810076-1-johan.almbladh@xxxxxxxxxxxxxxxxx/
[2] https://lore.kernel.org/bpf/5afe26c6-7ab1-88ab-a3e0-eb007256a856@xxxxxxxxxxxxx/
[3] b61a28cf1 ("bpf: Fix off-by-one in tail call count limiting")

> Also note we had previously changed the RISC-V and MIPS JITs to allow up
> to 33 tail calls [2, 3], for consistency with other JITs and with the
> interpreter. We had decided to increase these two to 33 rather than
> decrease the other JITs to 32 for backward compatibility, though that
> probably doesn't matter much as I'd expect few people to actually use 33
> tail calls :-)

Right, the backwards compatibility aspect is a valid point. I don't
think anyone would be near that limit, :-) but it is still worth
considering.

Whether the limit is 32 or 33 really doesn't matter. My only concern
here is that the limit should be the same across all JIT
implementations and the interpreter. We could instead change the x86
JITs and revert the interpreter change to let the limit be 33, if that
would be a better solution.

> 1 - https://github.com/pchaigno/tail-call-bench/commit/ae7887482985b4b1745c9b2ef7ff9ae506c82886
> 2 - 96bc4432 ("bpf, riscv: Limit to 33 tail calls")
> 3 - e49e6f6d ("bpf, mips: Limit to 33 tail calls")
>
> >
> > For each of the affected JITs, the incorrect behaviour was verified
> > by running the test_bpf test suite in QEMU. After the fixes, the JITs
> > pass the tail call count limiting test.
>
> If you are referring to test_tailcall_3 and its associated BPF program
> tailcall3, then as far as I can tell, it checks that 33 tail calls are
> allowed. The counter is incremented before each tail call except the
> first one. The last tail call is rejected because we reach the limit, so
> a counter value of 33 (as checked in the test code) means we've
> successfully executed 33 tail calls.

My test setup can build for all architectures included in this patch
set and some more, and then boot the kernel in QEMU with a
statically-linked busybox as userspace. I can easily run the kernel's
BPF test suite on all those architectures, but since I don't have a
full-fledged userspace I have not been able to run the selftests in
the same way.

We need to be able to determine what the tail call limit actually is
for the different implementations. I don't understand why you get
different results when testing from userspace compared to testing the
JIT itself. Either one of the tests is faulty, or there is some other
mechanism at play here.

Johan

> >
> > I have not been able to test the RISC-V JITs due to the lack of a
> > working toolchain and QEMU setup. It is likely that the RISC-V JITs
> > have the off-by-one behaviour too. I have not verified any of the NIC JITs.
> >
> > Link: https://lore.kernel.org/bpf/20210728164741.350370-1-johan.almbladh@xxxxxxxxxxxxxxxxx/
> >
> > Johan Almbladh (7):
> >   arm: bpf: Fix off-by-one in tail call count limiting
> >   arm64: bpf: Fix off-by-one in tail call count limiting
> >   powerpc: bpf: Fix off-by-one in tail call count limiting
> >   s390: bpf: Fix off-by-one in tail call count limiting
> >   sparc: bpf: Fix off-by-one in tail call count limiting
> >   mips: bpf: Fix off-by-one in tail call count limiting
> >   x86: bpf: Fix comments on tail call count limiting
> >
> >  arch/arm/net/bpf_jit_32.c         | 6 +++---
> >  arch/arm64/net/bpf_jit_comp.c     | 4 ++--
> >  arch/mips/net/ebpf_jit.c          | 4 ++--
> >  arch/powerpc/net/bpf_jit_comp32.c | 4 ++--
> >  arch/powerpc/net/bpf_jit_comp64.c | 4 ++--
> >  arch/s390/net/bpf_jit_comp.c      | 6 +++---
> >  arch/sparc/net/bpf_jit_comp_64.c  | 2 +-
> >  arch/x86/net/bpf_jit_comp32.c     | 6 +++---
> >  8 files changed, 18 insertions(+), 18 deletions(-)
> >
> > --
> > 2.25.1
> >