Alexei Starovoitov writes:

> Add two tests to check that sequence of 1024 jumps is verifiable.
>
> Signed-off-by: Alexei Starovoitov <ast@xxxxxxxxxx>
> ---
>  tools/testing/selftests/bpf/test_verifier.c  | 70 ++++++++++++++++++++
>  tools/testing/selftests/bpf/verifier/scale.c | 18 +++++

I am rebasing the 32-bit opt pass on top of the latest bpf-next and found
these new tests take more than 20 minutes to run and still had not
finished after that.

The reason is that the following insn filling inside bpf_fill_scale1
generates nearly 1M insns whose results are recognized as safe to be
poisoned:

  bpf_fill_scale1:
      while (i < MAX_TEST_INSNS - 1025)
          insn[i++] = BPF_ALU64_IMM(BPF_MOV, BPF_REG_0, 42);

For each hi32 poisoning, there is one call to "bpf_patch_insn_data", which
actually is not cheap (adjust jump insns, insn aux info, etc.). Now, 1M
calls to it have exhausted server resources as described: after 20 minutes
of running it still had not finished.

For real world applications, we don't do hi32 poisoning, and there isn't
much lo32 zext. Benchmarking the bpf programs inside Cilium shows the
final zext pass adds about 8% ~ 15% verification time.

The zext pass built on top of "bpf_patch_insn_data" looks more and more
like it is not the best approach to utilize the read32 analysis results.

Previously, in the v1 cover letter, I listed some of my other thoughts on
how to utilize the liveness analysis results:

  1. Minor change to the back-end JIT hook: also pass aux_insn information
     to back-ends so they have per-insn information and could do zero
     extension for the marked insns themselves using the most efficient
     native insn.

  2. Introduce a zero extension insn for eBPF. Then the verifier could
     insert the new zext insn instead of lshift + rshift. zext could be
     JITed more efficiently.

  3. Otherwise, JIT back-ends need to do a peephole pass to catch
     lshift + rshift and turn them into native zext.

Thinking more on this, perhaps we should just go with approach 1, which
actually doesn't require much change, I guess. There is actually no need
to change the back-end JIT hook. After we have finished liveness analysis,
just copy the insn zext info from env->aux_insn_data.zext_dst to newly
allocated storage inside something like "bpf_prog->aux->insn_zext_dst"
(a bool * pointer, with 1 * prog_len storage).

Such optimization information is then available to the JIT back-end
automatically, because the current JIT hook is

  struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *prog)

After bpf_int_jit_compile has finished, the dynamically allocated storage
could be freed; there is no need to keep it for the lifetime of bpf_prog.

Back-ends also get finer control over how to do zero extension, so
concerns like

  https://www.spinics.net/lists/netdev/msg564489.html

could be addressed naturally, with no need to introduce a new BPF
zero-extension instruction.

I intend to send out an updated version using approach 1 so we could see
how it looks. Comments?

Regards,
Jiong
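
P.S. To make approach 1 a bit more concrete, below is a rough sketch of
the kind of change I have in mind. It is only an illustration of the
idea: names such as insn_zext_dst, export_insn_zext_info, maybe_emit_zext
and emit_native_zext are placeholders following the description above,
not actual kernel code, and the real patch may well look different.

/*
 * Rough sketch of approach 1 only -- field and helper names
 * (aux_insn_data, zext_dst, insn_zext_dst, emit_native_zext, ...)
 * are placeholders, not actual kernel code.
 */

/* Verifier side, after liveness analysis: copy the per-insn zext
 * marks somewhere the JIT back-end can reach through the existing
 * bpf_int_jit_compile(struct bpf_prog *prog) hook.
 */
static int export_insn_zext_info(struct bpf_verifier_env *env)
{
	struct bpf_prog *prog = env->prog;
	u32 i;

	prog->aux->insn_zext_dst = kcalloc(prog->len, sizeof(bool),
					   GFP_KERNEL);
	if (!prog->aux->insn_zext_dst)
		return -ENOMEM;

	for (i = 0; i < prog->len; i++)
		prog->aux->insn_zext_dst[i] = env->aux_insn_data[i].zext_dst;

	return 0;
}

/* 32-bit JIT back-end side, while compiling insn i: emit the native
 * zero extension only for insns the verifier marked, instead of
 * relying on inserted lshift + rshift.
 */
static void maybe_emit_zext(struct jit_ctx *ctx, const struct bpf_prog *prog,
			    u32 i, u8 dst_reg)
{
	if (prog->aux->insn_zext_dst[i])
		emit_native_zext(ctx, dst_reg);	/* arch-specific */
}

/* Once bpf_int_jit_compile() has returned, the info can be dropped. */
static void free_insn_zext_info(struct bpf_prog *prog)
{
	kfree(prog->aux->insn_zext_dst);
	prog->aux->insn_zext_dst = NULL;
}

The per-insn array would be written once after liveness analysis,
consulted by the 32-bit back-end while JITing, and freed as soon as
bpf_int_jit_compile() has returned, so nothing new is kept around for
the lifetime of the program.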