On Mon, Apr 20, 2020 at 10:11 PM Andrii Nakryiko <andriin@xxxxxx> wrote:
>
> To make the BPF verifier verbose log more relevant and easier to use to debug
> verification failures, "pop" parts of the log that were successfully verified.
> This has the effect of leaving only the verifier log that corresponds to code
> branches that lead to the verification failure, which in practice should
> result in much shorter and more relevant verifier log dumps. This is made the
> default behavior and can be overridden to do exhaustive logging by specifying
> the BPF_LOG_LEVEL2 log level.
>
> Using BPF_LOG_LEVEL2 to disable this behavior is not ideal, because in some
> cases it's good to have BPF_LOG_LEVEL2 per-instruction register dump
> verbosity, but still have only the relevant verifier branches logged. But for
> this patch, I didn't want to add any new flags. It might be worthwhile to
> rethink how BPF verifier logging is performed and requested and streamline it
> a bit. But this trimming of successfully verified branches seems to be useful
> and a good default behavior.
>
> To test this, I modified runqslower slightly to introduce a read of an
> uninitialized stack variable. The log (**truncated in the middle** to save
> many lines in this commit message) BEFORE this change:
>
> ; int handle__sched_switch(u64 *ctx)
> 0: (bf) r6 = r1
> ; struct task_struct *prev = (struct task_struct *)ctx[1];
> 1: (79) r1 = *(u64 *)(r6 +8)
> func 'sched_switch' arg1 has btf_id 151 type STRUCT 'task_struct'
> 2: (b7) r2 = 0
> ; struct event event = {};
> 3: (7b) *(u64 *)(r10 -24) = r2
> last_idx 3 first_idx 0
> regs=4 stack=0 before 2: (b7) r2 = 0
> 4: (7b) *(u64 *)(r10 -32) = r2
> 5: (7b) *(u64 *)(r10 -40) = r2
> 6: (7b) *(u64 *)(r10 -48) = r2
> ; if (prev->state == TASK_RUNNING)
>
> [ ... instructions from insn #7 through #50 are cut out ... ]
>
> 51: (b7) r2 = 16
> 52: (85) call bpf_get_current_comm#16
> last_idx 52 first_idx 42
> regs=4 stack=0 before 51: (b7) r2 = 16
> ; bpf_perf_event_output(ctx, &events, BPF_F_CURRENT_CPU,
> 53: (bf) r1 = r6
> 54: (18) r2 = 0xffff8881f3868800
> 56: (18) r3 = 0xffffffff
> 58: (bf) r4 = r7
> 59: (b7) r5 = 32
> 60: (85) call bpf_perf_event_output#25
> last_idx 60 first_idx 53
> regs=20 stack=0 before 59: (b7) r5 = 32
> 61: (bf) r2 = r10
> ; event.pid = pid;
> 62: (07) r2 += -16
> ; bpf_map_delete_elem(&start, &pid);
> 63: (18) r1 = 0xffff8881f3868000
> 65: (85) call bpf_map_delete_elem#3
> ; }
> 66: (b7) r0 = 0
> 67: (95) exit
>
> from 44 to 66: safe
>
> from 34 to 66: safe
>
> from 11 to 28: R1_w=inv0 R2_w=inv0 R6_w=ctx(id=0,off=0,imm=0) R10=fp0 fp-8=mmmm???? fp-24_w=00000000 fp-32_w=00000000 fp-40_w=00000000 fp-48_w=00000000
> ; bpf_map_update_elem(&start, &pid, &ts, 0);
> 28: (bf) r2 = r10
> ;
> 29: (07) r2 += -16
> ; tsp = bpf_map_lookup_elem(&start, &pid);
> 30: (18) r1 = 0xffff8881f3868000
> 32: (85) call bpf_map_lookup_elem#1
> invalid indirect read from stack off -16+0 size 4
> processed 65 insns (limit 1000000) max_states_per_insn 1 total_states 5 peak_states 5 mark_read 4
>
> Notice how there is a successful code path from instruction 0 through 67, a
> few successfully verified jumps (44->66, 34->66), and only after that the
> 11->28 jump plus the error on instruction #32.
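
A minimal sketch of the kind of bug that produces the "invalid indirect read
from stack" error above, for anyone reproducing this: a stack variable passed
by address as a map key without ever being written. This is not the actual
runqslower change; the map layout and section name here are made up for
illustration:

#include <linux/types.h>
#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>

struct {
        __uint(type, BPF_MAP_TYPE_HASH);
        __uint(max_entries, 10240);
        __type(key, __u32);
        __type(value, __u64);
} start SEC(".maps");

SEC("raw_tp/sched_wakeup")
int broken(void *ctx)
{
        __u32 pid;      /* never initialized: the stack slot backing it
                         * holds unknown bytes
                         */
        __u64 *tsp;

        /* &pid points at 4 uninitialized stack bytes, so the verifier
         * rejects the indirect read done by the helper
         */
        tsp = bpf_map_lookup_elem(&start, &pid);
        return tsp ? 0 : 1;
}

char LICENSE[] SEC("license") = "GPL";
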
>
> AFTER this change (full verifier log, **no truncation**):
>
> ; int handle__sched_switch(u64 *ctx)
> 0: (bf) r6 = r1
> ; struct task_struct *prev = (struct task_struct *)ctx[1];
> 1: (79) r1 = *(u64 *)(r6 +8)
> func 'sched_switch' arg1 has btf_id 151 type STRUCT 'task_struct'
> 2: (b7) r2 = 0
> ; struct event event = {};
> 3: (7b) *(u64 *)(r10 -24) = r2
> last_idx 3 first_idx 0
> regs=4 stack=0 before 2: (b7) r2 = 0
> 4: (7b) *(u64 *)(r10 -32) = r2
> 5: (7b) *(u64 *)(r10 -40) = r2
> 6: (7b) *(u64 *)(r10 -48) = r2
> ; if (prev->state == TASK_RUNNING)
> 7: (79) r2 = *(u64 *)(r1 +16)
> ; if (prev->state == TASK_RUNNING)
> 8: (55) if r2 != 0x0 goto pc+19
> R1_w=ptr_task_struct(id=0,off=0,imm=0) R2_w=inv0 R6_w=ctx(id=0,off=0,imm=0) R10=fp0 fp-24_w=00000000 fp-32_w=00000000 fp-40_w=00000000 fp-48_w=00000000
> ; trace_enqueue(prev->tgid, prev->pid);
> 9: (61) r1 = *(u32 *)(r1 +1184)
> 10: (63) *(u32 *)(r10 -4) = r1
> ; if (!pid || (targ_pid && targ_pid != pid))
> 11: (15) if r1 == 0x0 goto pc+16
>
> from 11 to 28: R1_w=inv0 R2_w=inv0 R6_w=ctx(id=0,off=0,imm=0) R10=fp0 fp-8=mmmm???? fp-24_w=00000000 fp-32_w=00000000 fp-40_w=00000000 fp-48_w=00000000
> ; bpf_map_update_elem(&start, &pid, &ts, 0);
> 28: (bf) r2 = r10
> ;
> 29: (07) r2 += -16
> ; tsp = bpf_map_lookup_elem(&start, &pid);
> 30: (18) r1 = 0xffff8881db3ce800
> 32: (85) call bpf_map_lookup_elem#1
> invalid indirect read from stack off -16+0 size 4
> processed 65 insns (limit 1000000) max_states_per_insn 1 total_states 5 peak_states 5 mark_read 4
>
> Notice how in this case instructions 0-11 are recorded, plus the 11->28 jump,
> plus instructions 28-32 with the error on insn #32.
>
> Signed-off-by: Andrii Nakryiko <andriin@xxxxxx>

This is a great idea! But two test_verifier tests failed:

#722/p precise: ST insn causing spi > allocated_stack FAIL
Unexpected verifier log in successful load!
EXP: 5: (2d) if r4 > r0 goto pc+0
RES:
0: (bf) r3 = r10
1: (55) if r3 != 0x7b goto pc+0

from 1 to 2: safe
processed 8 insns (limit 1000000) max_states_per_insn 0 total_states 4 peak_states 4 mark_read 1

Please fix them up.
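
For comparing the trimmed output against the exhaustive one while re-testing,
log verbosity is requested at BPF_PROG_LOAD time. Here is a minimal sketch
using the raw bpf() syscall; the attr field names are from
include/uapi/linux/bpf.h, log_level 2 is the value libbpf exposes as
BPF_LOG_LEVEL2, and the prog type is just a placeholder:

#include <linux/bpf.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

static char log_buf[1 << 20];

/* Load a program and ask the verifier for the exhaustive per-instruction
 * log (log_level 2) instead of the trimmed default.
 */
static int load_prog_verbose(const struct bpf_insn *insns, __u32 insn_cnt)
{
        union bpf_attr attr;

        memset(&attr, 0, sizeof(attr));
        attr.prog_type = BPF_PROG_TYPE_SOCKET_FILTER;
        attr.insns = (__u64)(unsigned long)insns;
        attr.insn_cnt = insn_cnt;
        attr.license = (__u64)(unsigned long)"GPL";
        attr.log_level = 2;     /* exhaustive verifier log */
        attr.log_buf = (__u64)(unsigned long)log_buf;
        attr.log_size = sizeof(log_buf);

        /* on success returns a prog fd; log_buf holds the verifier log */
        return syscall(__NR_bpf, BPF_PROG_LOAD, &attr, sizeof(attr));
}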