On Sun, Apr 16, 2023 at 04:28:08PM -0700, Yonghong Song wrote: > In [1], I tried to remove bpf-specific codes to prevent certain > llvm optimizations, and add llvm TTI (target transform info) hooks > to prevent those optimizations. During this process, I found > if I enable llvm SimplifyCFG:shouldFoldTwoEntryPHINode > transformation, I will hit the following verification failure with selftests: > > ... > 8: (18) r1 = 0xffffc900001b2230 ; R1_w=map_value(off=560,ks=4,vs=564,imm=0) > 10: (61) r1 = *(u32 *)(r1 +0) ; R1_w=scalar(umax=4294967295,var_off=(0x0; 0xffffffff)) > ; if (skb->tstamp == EGRESS_ENDHOST_MAGIC) > 11: (79) r2 = *(u64 *)(r6 +152) ; R2_w=scalar() R6=ctx(off=0,imm=0) > ; if (skb->tstamp == EGRESS_ENDHOST_MAGIC) > 12: (55) if r2 != 0xb9fbeef goto pc+10 ; R2_w=195018479 > 13: (bc) w2 = w1 ; R1_w=scalar(umax=4294967295,var_off=(0x0; 0xffffffff)) R2_w=scalar(umax=4294967295,var_off=(0x0; 0xffffffff)) > ; if (test < __NR_TESTS) > 14: (a6) if w1 < 0x9 goto pc+1 16: R0=2 R1_w=scalar(umax=8,var_off=(0x0; 0xf)) R2_w=scalar(umax=4294967295,var_off=(0x0; 0xffffffff)) R6=ctx(off=0,imm=0) R10=fp0 > ; > 16: (27) r2 *= 28 ; R2_w=scalar(umax=120259084260,var_off=(0x0; 0x1ffffffffc),s32_max=2147483644,u32_max=-4) > 17: (18) r3 = 0xffffc900001b2118 ; R3_w=map_value(off=280,ks=4,vs=564,imm=0) > 19: (0f) r3 += r2 ; R2_w=scalar(umax=120259084260,var_off=(0x0; 0x1ffffffffc),s32_max=2147483644,u32_max=-4) R3_w=map_value(off=280,ks=4,vs=564,umax=120259084260,var_off=(0x0; 0x1ffffffffc),s32_max=2147483644,u32_max=-4) > 20: (61) r2 = *(u32 *)(r3 +0) > R3 unbounded memory access, make sure to bounds check any such access > processed 97 insns (limit 1000000) max_states_per_insn 1 total_states 10 peak_states 10 mark_read 6 > -- END PROG LOAD LOG -- > libbpf: prog 'ingress_fwdns_prio100': failed to load: -13 > libbpf: failed to load object 'test_tc_dtime' > libbpf: failed to load BPF skeleton 'test_tc_dtime': -13 > ... > > At insn 14, with condition 'w1 < 9', register r1 is changed from an arbitrary > u32 value to `scalar(umax=8,var_off=(0x0; 0xf))`. Register r2, however, remains > as an arbitrary u32 value. Current verifier won't claim r1/r2 equality if > the previous mov is alu32 ('w2 = w1'). > > If r1 upper 32bit value is not 0, we indeed cannot clamin r1/r2 equality > after 'w2 = w1'. But in this particular case, we know r1 upper 32bit value > is 0, so it is safe to claim r1/r2 equality. This patch exactly did this. > For a 32bit subreg mov, if the src register upper 32bit is 0, > it is okay to claim equality between src and dst registers. Perhaps mention in the above paragraph that this works because 32-bit ALU operations clear the upper bits? Some along the line of A special case where r1/r2 equality can be claimed after 'w2 = w1' is when r1 upper 32bit value is 0. This is because 32bit ALU operations always clear the upper 32 bits of the destination, so 'w2 = w1' in this case is the same as 'r2 = r1'... > With this patch, the above verification sequence becomes > > ... > 8: (18) r1 = 0xffffc9000048e230 ; R1_w=map_value(off=560,ks=4,vs=564,imm=0) > 10: (61) r1 = *(u32 *)(r1 +0) ; R1_w=scalar(umax=4294967295,var_off=(0x0; 0xffffffff)) > ; if (skb->tstamp == EGRESS_ENDHOST_MAGIC) > 11: (79) r2 = *(u64 *)(r6 +152) ; R2_w=scalar() R6=ctx(off=0,imm=0) > ; if (skb->tstamp == EGRESS_ENDHOST_MAGIC) > 12: (55) if r2 != 0xb9fbeef goto pc+10 ; R2_w=195018479 > 13: (bc) w2 = w1 ; R1_w=scalar(id=6,umax=4294967295,var_off=(0x0; 0xffffffff)) R2_w=scalar(id=6,umax=4294967295,var_off=(0x0; 0xffffffff)) > ; if (test < __NR_TESTS) > 14: (a6) if w1 < 0x9 goto pc+1 ; R1_w=scalar(id=6,umin=9,umax=4294967295,var_off=(0x0; 0xffffffff)) > ... > from 14 to 16: R0=2 R1_w=scalar(id=6,umax=8,var_off=(0x0; 0xf)) R2_w=scalar(id=6,umax=8,var_off=(0x0; 0xf)) R6=ctx(off=0,imm=0) R10=fp0 > 16: (27) r2 *= 28 ; R2_w=scalar(umax=224,var_off=(0x0; 0xfc)) > 17: (18) r3 = 0xffffc9000048e118 ; R3_w=map_value(off=280,ks=4,vs=564,imm=0) > 19: (0f) r3 += r2 > 20: (61) r2 = *(u32 *)(r3 +0) ; R2_w=scalar(umax=4294967295,var_off=(0x0; 0xffffffff)) R3_w=map_value(off=280,ks=4,vs=564,umax=224,var_off=(0x0; 0xfc),s32_max=252,u32_max=252) > ... > > and eventually the bpf program can be verified successfully. > > [1] https://reviews.llvm.org/D147968 > > Signed-off-by: Yonghong Song <yhs@xxxxxx> Acked-by: Shung-Hsi Yu <shung-hsi.yu@xxxxxxxx> > --- > kernel/bpf/verifier.c | 9 +++++++-- > 1 file changed, 7 insertions(+), 2 deletions(-) > > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c > index d6db6de3e9ea..468f002d3248 100644 > --- a/kernel/bpf/verifier.c > +++ b/kernel/bpf/verifier.c > @@ -12409,12 +12409,17 @@ static int check_alu_op(struct bpf_verifier_env *env, struct bpf_insn *insn) > insn->src_reg); > return -EACCES; > } else if (src_reg->type == SCALAR_VALUE) { > + bool is_src_reg_u32 = src_reg->umax_value <= U32_MAX; > + > + if (is_src_reg_u32 && !src_reg->id) > + src_reg->id = ++env->id_gen; > copy_register_state(dst_reg, src_reg); > - /* Make sure ID is cleared otherwise > + /* Make sure ID is cleared if src_reg is not in u32 range otherwise > * dst_reg min/max could be incorrectly > * propagated into src_reg by find_equal_scalars() > */ > - dst_reg->id = 0; > + if (!is_src_reg_u32) > + dst_reg->id = 0; > dst_reg->live |= REG_LIVE_WRITTEN; > dst_reg->subreg_def = env->insn_idx + 1; > } else { > -- > 2.34.1 >