On Fri, Nov 13, 2020 at 05:48:31PM -0800, Alexei Starovoitov wrote: > On Fri, Nov 13, 2020 at 12:40 AM Gary Lin <glin@xxxxxxxx> wrote: > > > > The x64 bpf jit expects the bpf images converge within the given passes. > > However there is a corner case: > > > > l0: ldh [4] > > l1: jeq #0x537d, l2, l40 > > l2: ld [0] > > l3: jeq #0xfa163e0d, l4, l40 > > l4: ldh [12] > > l5: ldx #0xe > > l6: jeq #0x86dd, l41, l7 > > l7: jeq #0x800, l8, l41 > > l8: ld [x+16] > > l9: ja 41 > > > > [... repeated ja 41 ] > > > > l40: ja 41 > > l41: ret #0 > > l42: ld #len > > l43: ret a > > > > The bpf program contains 32 "ja 41" and do_jit() only removes one "ja 41" > > right before "l41: ret #0" for offset==0 in each pass, so > > bpf_int_jit_compile() needs to run do_jit() at least 32 times to > > eliminate those JMP instructions. Since the current max number of passes > > is 20, the bpf program couldn't converge within 20 passes and got rejected > > when BPF_JIT_ALWAYS_ON is set even though it's legit as a classic socket > > filter. > > > > A not-converged image may be not optimal but at least the bpf > > instructions are translated into x64 machine code. Maybe we could just > > issue a warning instead so that the program is still loaded and the user > > is also notified. > > Non-convergence is not about being optimal. It's about correctness. > If size is different it likely means that at least one jump has the > wrong offset. > Ah, I see. > Bumping from 20 to 64 also won't solve it. > There could be a case where 64 isn't enough either. True. Increasing the number of passes is just a workaround. > One of the test_bpf.ko tests can hit any limit, iirc. Thanks for the pointer. Will look into the tests. > > Also we've seen a case where JIT might never converge. > The iteration N can have size 40, iteration N+1 size 38, iteration N+2 size 40 > and keep oscillating like this. > > I think the fix could be is to avoid optimality in size when pass > number is getting large. > Like after pass > 10 BPF_JA could always use 32-bit offset regardless > of actual addrs[i + insn->off] - addrs[i]; difference. > There could be other solutions too. > So the size convergence can be ignored if all BPF_JMPs are translated into 32-bit offset jmp? Thanks, Gary Lin