I have done some investigation into this matter, and this is what I have
found. To recap, the situation is as follows.

MAX_TAIL_CALL_CNT is defined to be 32. Since the interpreter has used 33
instead, we agreed to use that limit across all JIT implementations so as
not to break any user space programs. To make sure everything uses the same
limit, we must first understand what the current state actually is, so we
know what to fix.

Me: according to test_bpf.ko the tail call limit is 33 for the
interpreter, and 32 for the x86-64 JIT.

Paul: according to selftests the tail call limit is 33 for both the
interpreter and the x86-64 JIT.

Link: https://lore.kernel.org/bpf/20210809093437.876558-1-johan.almbladh@xxxxxxxxxxxxxxxxx/

I have been able to reproduce the above selftests results using vmtest.sh.
Digging deeper into this, I found that there are actually two different
code paths where the tail call count is checked in the x86-64 JIT,
corresponding to direct and indirect tail calls. By setting different
limits in those two places, I found that selftests tailcall_3 hits the
limit in emit_bpf_tail_call_direct(), whereas test_bpf.ko is limited by
emit_bpf_tail_call_indirect().

I am not 100% sure that this is the correct explanation, but it sounds
very reasonable. However, the asm generated in the two cases looks very
similar to me, so from the asm alone I cannot see why the limits would be
different. Perhaps someone more versed in x86 asm could take a closer look.

What are your thoughts?

Johan