On Wed, Apr 24, 2024 at 3:41 PM Cupertino Miranda <cupertino.miranda@xxxxxxxxxx> wrote: > > Split range computation checks in its own function, isolating pessimitic > range set for dst_reg and failing return to a single point. > > Signed-off-by: Cupertino Miranda <cupertino.miranda@xxxxxxxxxx> > Cc: Yonghong Song <yonghong.song@xxxxxxxxx> > Cc: Alexei Starovoitov <alexei.starovoitov@xxxxxxxxx> > Cc: David Faust <david.faust@xxxxxxxxxx> > Cc: Jose Marchesi <jose.marchesi@xxxxxxxxxx> > Cc: Elena Zannoni <elena.zannoni@xxxxxxxxxx> > --- > kernel/bpf/verifier.c | 141 +++++++++++++++++++++++------------------- > 1 file changed, 77 insertions(+), 64 deletions(-) > I know you are moving around pre-existing code, so a bunch of nits below are to pre-existing code, but let's use this as an opportunity to clean it up a bit. > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c > index 6fe641c8ae33..829a12d263a5 100644 > --- a/kernel/bpf/verifier.c > +++ b/kernel/bpf/verifier.c > @@ -13695,6 +13695,82 @@ static void scalar_min_max_arsh(struct bpf_reg_state *dst_reg, > __update_reg_bounds(dst_reg); > } > > +static bool is_const_reg_and_valid(struct bpf_reg_state reg, bool alu32, hm.. why passing reg_state by value? Use pointer? > + bool *valid) > +{ > + s64 smin_val = reg.smin_value; > + s64 smax_val = reg.smax_value; > + u64 umin_val = reg.umin_value; > + u64 umax_val = reg.umax_value; > + don't add empty line between variable declarations, all variables should be in a single continuous block > + s32 s32_min_val = reg.s32_min_value; > + s32 s32_max_val = reg.s32_max_value; > + u32 u32_min_val = reg.u32_min_value; > + u32 u32_max_val = reg.u32_max_value; > + but see below, I'm not sure we even need these local variables, they don't save all that much typing > + bool known = alu32 ? tnum_subreg_is_const(reg.var_off) : > + tnum_is_const(reg.var_off); "known" is a misnomer, imo. It's `is_const`. > + > + if (alu32) { > + if ((known && > + (s32_min_val != s32_max_val || u32_min_val != u32_max_val)) || > + s32_min_val > s32_max_val || u32_min_val > u32_max_val) > + *valid = false; > + } else { > + if ((known && > + (smin_val != smax_val || umin_val != umax_val)) || > + smin_val > smax_val || umin_val > umax_val) > + *valid = false; > + } > + > + return known; The above is really hard to follow, especially how known && !known cases are being handled is very easy to misinterpret. How about we rewrite the equivalent logic in a few steps: if (alu32) { if (tnum_subreg_is_const(reg.var_off)) { return reg->s32_min_value == reg->s32_max_value && reg->u32_min_value == reg->u32_max_value; } else { return reg->s32_min_value <= reg->s32_max_value && reg->u32_min_value <= reg->u32_max_value; } } else { /* same as above for 64-bit bounds */ } And you don't even need any local variables, while all the important conditions are a bit more easy to follow? Or is it just me? > +} > + > +static bool is_safe_to_compute_dst_reg_range(struct bpf_insn *insn, > + struct bpf_reg_state src_reg) > +{ > + bool src_known; > + u64 insn_bitness = (BPF_CLASS(insn->code) == BPF_ALU64) ? 64 : 32; whole u64 for this seems like an overkill, I'd just stick to `int` > + bool alu32 = (BPF_CLASS(insn->code) != BPF_ALU64); insn_bitness == 32 ? > + u8 opcode = BPF_OP(insn->code); > + nit: don't split variables block with empty line > + bool valid_known = true; need an empty line between variable declarations and the rest > + src_known = is_const_reg_and_valid(src_reg, alu32, &valid_known); > + > + /* Taint dst register if offset had invalid bounds > + * derived from e.g. dead branches. > + */ > + if (valid_known == false) nit: !valid_known > + return false; given this logic/handling, why not just return false from is_const_reg_and_valid() if it's a constant, but it's not valid/consistent? It's simpler and fits the logic and function's name, no? See my suggestion above > + > + switch (opcode) { inline opcode variable here, you use it just once > + case BPF_ADD: > + case BPF_SUB: > + case BPF_AND: > + return true; > + > + /* Compute range for the following only if the src_reg is known. > + */ > + case BPF_XOR: > + case BPF_OR: > + case BPF_MUL: > + return src_known; > + > + /* Shift operators range is only computable if shift dimension operand > + * is known. Also, shifts greater than 31 or 63 are undefined. This > + * includes shifts by a negative number. > + */ > + case BPF_LSH: > + case BPF_RSH: > + case BPF_ARSH: preserve original comment? > - /* Shifts greater than 31 or 63 are undefined. > - * This includes shifts by a negative number. > - */ > + return (src_known && src_reg.umax_value < insn_bitness); nit: unnecessary () > + default: > + break; return false here, and drop return below > + } > + > + return false; > +} > + > /* WARNING: This function does calculations on 64-bit values, but the actual > * execution may occur on 32-bit values. Therefore, things like bitshifts > * need extra checks in the 32-bit case. [...]