On Thu, 4 Nov 2021 at 16:51, Alexei Starovoitov <alexei.starovoitov@xxxxxxxxx> wrote: > > Thanks for flagging! > Could you craft a test case that we can use a repro and future > test case? Yes, I'll give it a shot. > > fp-88=map_value fp-96=mmmmmmmm fp-104=map_value fp-112=inv fp-120=fp > ... > > I've bisected the problem to commit 3e8ce29850f1 ("bpf: Prevent > > pointer mismatch in bpf_timer_init.") The commit seems unrelated to > > loop processing though (it does touch the verifier however). Either I > > got the bisection wrong or there is something subtle going on. > > I stared at that commit and the example asm. > I suspect the bisect went wrong. I tried the parent of the offending commit, and it worked fine. Weird. Could the problem be that there are multiple regressions? See below, we also get hit with the corrupted stack spill. > Could you try reverting a single > commit 354e8f1970f8 ("bpf: Support <8-byte scalar spill and refill") > ? > The above fp-112=inv means that the verifier is tracking scalar spill. > That could be the reason for bounded loop logic seeing different > stack state on every iteration. > But the asm snippet doesn't have the store to stack at [fp-112] > location, so it could be a red herring. > > Are you using the same llvm during bisect? I'm compiling the test case once and then invoke it via git bisect run, so the BPF should be identical. clang-11. > The commit 354e8f1970f8 should be harmless > (when commit f30d4968e9ae ("bpf: Do not reject when the stack read > size is different from the tracked scalar size")) > is also applied. That fix is in bpf tree only, so far. I did some more tests. TL;DR: commit 3e8ce29850f1 ("bpf: Prevent pointer mismatch in bpf_timer_init.") is the first one that fails with "BPF program too large", it's ancestor loads OK. commit 354e8f1970f8 ("bpf: Support <8-byte scalar spill and refill") makes verification fail earlier with "corrupted spill memory". The following solves both issues: git checkout 354e8f1970f8 # "bpf: Support <8-byte scalar spill and refill" git cherry-pick f30d4968e9ae # "bpf: Do not reject when the stack read size is different from the tracked scalar size" I think you're on the money wrt scalar spill tracking. Maybe I misattributed the problem to the wrong bit of code, instead of having found the wrong commit? Details: bpf-next: commit be2f2d1680df ("libbpf: Deprecate bpf_program__load() API"): ; v = *pos++; 1099: (79) r1 = *(u64 *)(r10 -72) corrupted spill memory processed 48649 insns (limit 1000000) max_states_per_insn 4 total_states 1305 peak_states 290 mark_read 53 bpf-next with f30d4968e9ae on top: works! bpf-next with commit 354e8f1970f8 ("bpf: Support <8-byte scalar spill and refill") reverted: 2225: (05) goto pc+13 BPF program is too large. Processed 1000001 insn processed 1000001 insns (limit 1000000) max_states_per_insn 28 total_states 40641 peak_states 1104 mark_read 53 commit 3e8ce29850f1 ("bpf: Prevent pointer mismatch in bpf_timer_init.") (found via bisection): BPF program is too large. Processed 1000001 insn commit 3e8ce29850f1^ ("bpf: Add map side support for bpf timers."): works! commit 3e8ce29850f1 with commit 354e8f1970f8 ("bpf: Support <8-byte scalar spill and refill") reverted: doesn't revert cleanly commit 354e8f1970f8 ("bpf: Support <8-byte scalar spill and refill"): corrupted spill memory commit 354e8f1970f8^ ("bpf: Check the other end of slot_type for STACK_SPILL"): 2225: (05) goto pc+13 BPF program is too large. Processed 1000001 insn processed 1000001 insns (limit 1000000) max_states_per_insn 28 total_states 40641 peak_states 1104 mark_read 53 commit 354e8f1970f8~2 ("selftests/bpf: Fix btf_dump __int128 test failure with clang build kernel"): same as above -- Lorenz Bauer | Systems Engineer 6th Floor, County Hall/The Riverside Building, SE1 7PB, UK www.cloudflare.com