On Fri, May 12, 2023 at 3:16 PM Andrii Nakryiko <andrii.nakryiko@xxxxxxxxx> wrote:
>
> On Fri, May 12, 2023 at 11:55 AM Alexei Starovoitov
> <alexei.starovoitov@xxxxxxxxx> wrote:
> >
> > Andrii,
> >
> > Here is what I see on the latest bpf-next:
> >
> > ./test_progs -t global_funcs
> > [ 7.969549] bpf_testmod: loading out-of-tree module taints kernel.
> > [ 7.979444] ------------[ cut here ]------------
> > [ 7.979812] verifier backtracking bug
> > [ 7.979828] WARNING: CPU: 1 PID: 2026 at kernel/bpf/verifier.c:3500
> > __mark_chain_precision+0xd8d/0xda0
> > [ 7.980818] Modules linked in: bpf_testmod(O)
> > [ 7.981161] CPU: 1 PID: 2026 Comm: test_progs Tainted: G
> > O 6.3.0-07968-g7b99f75942da #4614
> > [ 7.981876] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
> > BIOS rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu.org 04/01/2014
> > [ 7.982732] RIP: 0010:__mark_chain_precision+0xd8d/0xda0
> > [ 7.983140] Code: ff e9 fb f4 ff ff 80 3d e2 c5 50 02 00 0f 85 15
> > fd ff ff 48 c7 c7 fe 5b 5c 82 4c 89 0c 24 c6 05 ca c5 50 02 01 e8 b3
> > ed e8 ff <0f> 0b 4c 8b 0c 24 e9 f3 fc ff ff 0f4
> > [ 7.984523] RSP: 0018:ffffc90002bb78f0 EFLAGS: 00010282
> > [ 7.984918] RAX: 0000000000000019 RBX: ffff88810137c000 RCX: 0000000000000002
> > [ 7.985467] RDX: 0000000080000002 RSI: ffffffff825bda2c RDI: 00000000ffffffff
> > [ 7.986011] RBP: 00000000ffffffff R08: 0000000000000000 R09: c0000000fffeffff
> > [ 7.986553] R10: 0000000000000001 R11: ffffc90002bb77a8 R12: 000000000000001b
> > [ 7.987093] R13: 0000000000000002 R14: 0000000000000010 R15: 000000000000001c
> > [ 7.987641] FS: 00007f7bd27d7400(0000) GS:ffff888237a40000(0000)
> > knlGS:0000000000000000
> > [ 7.988254] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [ 7.988687] CR2: 000000000511e078 CR3: 000000010512f005 CR4: 00000000003706e0
> > [ 7.989228] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > [ 7.989765] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> > [ 7.990306] Call Trace:
> > [ 7.990500] <TASK>
> > [ 7.990668] ? check_helper_mem_access+0xf9/0x2a0
> > [ 7.991035] ? btf_type_name+0x20/0x20
> > [ 7.991329] ? find_kfunc_desc_btf.part.106+0x210/0x210
> > [ 7.991723] check_stack_write_fixed_off+0x437/0x610
> > [ 7.992113] ? lock_acquire+0x15c/0x290
> > [ 7.992416] ? adjust_reg_min_max_vals+0xdf/0x1070
> > [ 7.992778] ? __kmem_cache_alloc_node+0x41/0x530
> > [ 7.993140] ? check_ptr_alignment+0x7d/0x210
> > [ 7.993479] ? lock_release+0x1b7/0x250
> > [ 7.993774] check_mem_access+0x8fc/0x1750
> >
> > Looks like my earlier suggestion to do:
> > WARN_ONCE(idx + 1 != subseq_idx, "verifier backtracking bug");
> >
> > is tripping on something.
>
> Interesting... I did check dmesg after adding this check, strange.
> I'll try to repro later today and see what's up, thanks for heads up!

To close the loop, this was fixed in [0]. The problem was that
subseq_idx is not always preserved correctly when traversing between
states.

[0] https://patchwork.kernel.org/project/netdevbpf/patch/20230515180710.1535018-1-andrii@xxxxxxxxxx/
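
For anyone reading this later, here is a rough toy model of how a check like "idx + 1 != subseq_idx" can trip when subseq_idx is lost between states. This is not the verifier code and not the fix in [0]. The struct, the function names and the "reset to -1 at every state boundary" behaviour are all assumptions made up for illustration. The toy treats every instruction as straight-line code, so the instruction verified right after idx is always expected to be idx + 1.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy type: one "state" covers a run of instruction indices,
 * stored in the order they were verified. Not a kernel structure. */
struct toy_state {
	const int *insn_idx;
	size_t cnt;
};

/*
 * Walk states newest to oldest, and instructions last to first, the way a
 * backtracking pass would. subseq_idx holds the index of the instruction
 * that was verified right after the current one.
 *
 * carry_across_states == true  : subseq_idx survives state boundaries.
 * carry_across_states == false : subseq_idx is reset to -1 at each boundary
 *                                (an assumed failure mode, for illustration).
 *
 * Returns how many times "idx + 1 != subseq_idx" would have fired.
 */
int toy_backtrack_mismatches(const struct toy_state *states, size_t nr_states,
			     bool carry_across_states)
{
	int subseq_idx = -1;
	int mismatches = 0;
	bool first = true; /* the very first step has no subsequent insn */

	assert(states != NULL || nr_states == 0);

	for (size_t s = nr_states; s-- > 0;) {
		if (!carry_across_states)
			subseq_idx = -1;

		for (size_t i = states[s].cnt; i-- > 0;) {
			int idx = states[s].insn_idx[i];

			if (!first && idx + 1 != subseq_idx)
				mismatches++;
			first = false;
			subseq_idx = idx;
		}
	}
	return mismatches;
}

/* Three toy states covering instructions 0..6, oldest state first. */
int toy_demo(bool carry_across_states)
{
	static const int a[] = { 0, 1, 2 };
	static const int b[] = { 3, 4 };
	static const int c[] = { 5, 6 };
	const struct toy_state states[] = {
		{ a, 3 }, { b, 2 }, { c, 2 },
	};

	return toy_backtrack_mismatches(states, 3, carry_across_states);
}
```

With subseq_idx carried across boundaries, toy_demo(true) reports no mismatches. With the reset, toy_demo(false) reports one mismatch per boundary crossed (two here), which is the sort of spurious hit the warning above would turn into a splat.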