On Thu, Dec 16, 2021 at 7:48 AM Christy Lee <christyc.y.lee@xxxxxxxxx> wrote: Apologies, resending this because the previous email was not plain text. > > > On Wed, Dec 15, 2021 at 11:02 PM Andrii Nakryiko <andrii.nakryiko@xxxxxxxxx> wrote: >> >> On Wed, Dec 15, 2021 at 11:23 AM Christy Lee <christylee@xxxxxx> wrote: >> > [...] >> >> >> Looks good to me, but see a few nits below. They can be ignored or >> addressed as a follow up. >> >> >> Acked-by: Andrii Nakryiko <andrii@xxxxxxxxxx> >> >> > include/linux/bpf_verifier.h | 7 +++ >> > kernel/bpf/verifier.c | 62 ++++++++++++++++++- >> > .../testing/selftests/bpf/prog_tests/align.c | 30 ++++----- >> > 3 files changed, 82 insertions(+), 17 deletions(-) >> > >> > diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h >> > index 182b16a91084..c555222c97d6 100644 >> > --- a/include/linux/bpf_verifier.h >> > +++ b/include/linux/bpf_verifier.h >> > @@ -474,6 +474,13 @@ struct bpf_verifier_env { >> > /* longest register parentage chain walked for liveness marking */ >> > u32 longest_mark_read_walk; >> > bpfptr_t fd_array; >> > + >> > + /* bit mask to keep track of whether a register has been accessed >> > + * since the last time the function state was pritned >> >> typo: printed >> >> > + */ >> > + u32 scratched_regs; >> > + /* Same as scratched_regs but for stack slots */ >> > + u64 scratched_stack_slots; >> > }; >> > >> > __printf(2, 0) void bpf_verifier_vlog(struct bpf_verifier_log *log, >> >> [...] >> >> > + mark_stack_slot_scratched(env, spi); >> > if (reg && !(off % BPF_REG_SIZE) && register_is_bounded(reg) && >> > !register_is_null(reg) && env->bpf_capable) { >> > if (dst_reg != BPF_REG_FP) { >> > @@ -2957,6 +3004,7 @@ static int check_stack_write_var_off(struct bpf_verifier_env *env, >> > slot = -i - 1; >> > spi = slot / BPF_REG_SIZE; >> > stype = &state->stack[spi].slot_type[slot % BPF_REG_SIZE]; >> > + mark_stack_slot_scratched(env, spi); >> > >> > if (!env->allow_ptr_leaks >> > && *stype != NOT_INIT >> > @@ -6009,8 +6057,10 @@ static int __check_func_call(struct bpf_verifier_env *env, struct bpf_insn *insn >> > *insn_idx = env->subprog_info[subprog].start - 1; >> > >> > if (env->log.level & BPF_LOG_LEVEL) { >> > + mark_verifier_state_scratched(env); >> > verbose(env, "caller:\n"); >> > print_verifier_state(env, caller); >> >> In all but one cases you call mark_verifier_state_scratched(env) >> before calling print_verifier_state(). Instead of sort of artificially >> scratch entire state, I'd add a bool flag to print_verifier_state() to >> control whether we want to take into account scratch masks or not. It >> would also make sure that we will never forget to scratch it where >> necessary (you'll have to make a conscious decision on each >> print_verifier_state() call). > > Good idea! I'll do that. >> >> >> > + mark_verifier_state_scratched(env); >> > verbose(env, "callee:\n"); >> > print_verifier_state(env, callee); >> > } >> >> [...] >> >> > @@ -161,13 +161,13 @@ static struct bpf_align_test tests[] = { >> > }, >> > .prog_type = BPF_PROG_TYPE_SCHED_CLS, >> > .matches = { >> > - {7, "R0_w=pkt(id=0,off=8,r=8,imm=0)"}, >> > + {6, "R0_w=pkt(id=0,off=8,r=8,imm=0)"}, >> > {7, "R3_w=inv(id=0,umax_value=255,var_off=(0x0; 0xff))"}, >> > {8, "R3_w=inv(id=0,umax_value=510,var_off=(0x0; 0x1fe))"}, >> > {9, "R3_w=inv(id=0,umax_value=1020,var_off=(0x0; 0x3fc))"}, >> > {10, "R3_w=inv(id=0,umax_value=2040,var_off=(0x0; 0x7f8))"}, >> > {11, "R3_w=inv(id=0,umax_value=4080,var_off=(0x0; 0xff0))"}, >> > These would be very useful features, I'll address them as a follow up. >> >> consider this a feature request (unless people disagree), but these >> "_value" suffixes for umin/umax/s32_min/etc are just noise and don't >> contribute anything but extra visual noise. I'd remove them. map_value >> is ok, probably (because we should have map_key, I guess). >> >> > - {18, "R3=pkt_end(id=0,off=0,imm=0)"}, >> > + {13, "R3_w=pkt_end(id=0,off=0,imm=0)"}, >> > {18, "R4_w=inv(id=0,umax_value=255,var_off=(0x0; 0xff))"}, >> > {19, "R4_w=inv(id=0,umax_value=8160,var_off=(0x0; 0x1fe0))"}, >> > {20, "R4_w=inv(id=0,umax_value=4080,var_off=(0x0; 0xff0))"}, >> > @@ -234,10 +234,10 @@ static struct bpf_align_test tests[] = { >> > }, >> > .prog_type = BPF_PROG_TYPE_SCHED_CLS, >> > .matches = { >> > - {4, "R5_w=pkt(id=0,off=0,r=0,imm=0)"}, >> > + {3, "R5_w=pkt(id=0,off=0,r=0,imm=0)"}, >> >> another improvement suggestion: all these id=0 is also noise. I'd make >> sure that cases where we do care about ids always use id > 0 (I think >> that might be the case already) and just never output id=0 >> >> >> > {5, "R5_w=pkt(id=0,off=14,r=0,imm=0)"}, >> > {6, "R4_w=pkt(id=0,off=14,r=0,imm=0)"}, >> > - {10, "R2=pkt(id=0,off=0,r=18,imm=0)"}, >> > + {9, "R2=pkt(id=0,off=0,r=18,imm=0)"}, >> > {10, "R5=pkt(id=0,off=14,r=18,imm=0)"}, >> > {10, "R4_w=inv(id=0,umax_value=255,var_off=(0x0; 0xff))"}, >> > {14, "R4_w=inv(id=0,umax_value=65535,var_off=(0x0; 0xffff))"}, >> >> Another thing that always confuses me is the use of "inv" to denote >> scalar. I could never understand why it's not a "scalar" or just >> nothing. Especially for cases when we have known value. We'll see >> "R4=inv4". So confusing. Maybe let's just drop the inv, so we'll have: >> >> R4=4 >> >> or (taking into the account all the above suggestions) >> >> R4=(umax=65535,var_off=(0x0; 0xffff)) >> >> I don't think we lose anything by dropping "inv", but "scalar" would >> be still better (even if slightly longer). >> >> >> > @@ -296,7 +296,7 @@ static struct bpf_align_test tests[] = { >> > /* Calculated offset in R6 has unknown value, but known >> > * alignment of 4. >> > */ >> > - {8, "R2_w=pkt(id=0,off=0,r=8,imm=0)"}, >> > + {6, "R2_w=pkt(id=0,off=0,r=8,imm=0)"}, >> > {8, "R6_w=inv(id=0,umax_value=1020,var_off=(0x0; 0x3fc))"}, >> > /* Offset is added to packet pointer R5, resulting in >> > * known fixed offset, and variable offset from R6. >> >> [...]