Actual tail call count limits in x86 JITs and interpreter

Johan Almbladh <johan.almbladh@xxxxxxxxxxxxxxxxx> · Fri, 10 Sep 2021 00:15:57 +0200

I have done some investigation into this matter, and this is what I
have found. To recap, the situation is as follows.

MAX_TAIL_CALL_CNT is defined to be 32. Since the interpreter has in
practice allowed 33 tail calls instead, we agreed to use that limit
across all JIT implementations so as not to break any user space
programs. To make sure everything uses the same limit, we must first
understand what the current state actually is, so that we know what
to fix.
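
As an illustration of how a 32 vs 33 discrepancy can arise, here is a
minimal user space sketch. It is not the kernel or JIT code. It assumes
a counter that starts at zero and is checked before being incremented,
and the function names are hypothetical. The only difference between
the two variants is the comparison operator.

```c
#define MAX_TAIL_CALL_CNT 32

/*
 * Hypothetical model: check with '>' and then increment.
 * The call proceeds for cnt = 0..limit, i.e. limit + 1 times.
 */
static int tail_calls_allowed_gt(int limit)
{
	int cnt = 0;        /* assumed to start at zero on entry */
	int performed = 0;  /* number of tail calls actually taken */

	for (;;) {
		if (cnt > limit)
			break;
		cnt++;
		performed++;
	}
	return performed;
}

/*
 * Hypothetical model: check with '>=' and then increment.
 * The call proceeds for cnt = 0..limit-1, i.e. limit times.
 */
static int tail_calls_allowed_ge(int limit)
{
	int cnt = 0;
	int performed = 0;

	for (;;) {
		if (cnt >= limit)
			break;
		cnt++;
		performed++;
	}
	return performed;
}
```

Under these assumptions, tail_calls_allowed_gt(MAX_TAIL_CALL_CNT)
yields 33 and tail_calls_allowed_ge(MAX_TAIL_CALL_CNT) yields 32, so a
single comparison operator is enough to explain a difference of one.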

Me: according to test_bpf.ko the tail call limit is 33 for the
interpreter, and 32 for the x86-64 JIT.
Paul: according to selftests the tail call limit is 33 for both the
interpreter and the x86-64 JIT.

Link: https://lore.kernel.org/bpf/20210809093437.876558-1-johan.almbladh@xxxxxxxxxxxxxxxxx/

I have been able to reproduce the above selftests results using
vmtest.sh. Digging deeper into this, I found that there are actually
two different code paths where the tail call count is checked in the
x86-64 JIT, corresponding to direct and indirect tail calls. By
setting different limits in those two places, I found that the
selftests tailcall_3 test hits the limit in
emit_bpf_tail_call_direct(), whereas test_bpf.ko is limited by
emit_bpf_tail_call_indirect().
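
The idea behind that experiment can be sketched in user space as
follows. This is a simplified model, not the JIT itself: the enum, the
function names and the '<=' check are assumptions made for
illustration. Each path gets its own distinct limit, and the number of
tail calls a test observes then reveals which path enforced it.

```c
/* Hypothetical stand-ins for the two JIT code paths. */
enum tail_call_path { PATH_DIRECT = 0, PATH_INDIRECT = 1 };

/*
 * Run tail calls on the given path until that path's limit stops them.
 * Assumes a zero-initialized counter that is checked, then incremented.
 */
static int run_until_limit(enum tail_call_path path,
			   int limit_direct, int limit_indirect)
{
	int limit = (path == PATH_DIRECT) ? limit_direct : limit_indirect;
	int cnt = 0;
	int performed = 0;

	while (cnt <= limit) {
		cnt++;
		performed++;
	}
	return performed;
}

/*
 * Map an observed tail call count back to the path that enforced it.
 * Returns -1 if the count matches neither limit. The two limits must
 * differ for the result to be unambiguous.
 */
static int identify_path(int observed, int limit_direct, int limit_indirect)
{
	if (observed == limit_direct + 1)
		return PATH_DIRECT;
	if (observed == limit_indirect + 1)
		return PATH_INDIRECT;
	return -1;
}
```

With, say, limits of 10 and 20, a test that observes 11 tail calls must
have gone through the direct path, and one that observes 21 through the
indirect path.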

I am not 100% sure that this is the correct explanation, but it sounds
very reasonable. However, the asm generated in the two cases looks very
similar to me, so by looking at that alone I cannot really see that
the limits would be different. Perhaps someone more versed in x86 asm
could take a closer look.

What are your thoughts?

Johan