On Mon, Jun 13, 2022 at 1:50 PM Eduard Zingerman <eddyz87@xxxxxxxxx> wrote:
> +
> +static bool loop_flag_is_zero(struct bpf_verifier_env *env)
> +{
> +	struct bpf_reg_state *regs = cur_regs(env);
> +	struct bpf_reg_state *reg = &regs[BPF_REG_4];
> +
> +	return register_is_const(reg) && reg->var_off.value == 0;
> +}

Great catch here by Daniel.
It needs mark_chain_precision().

> +
> +static void update_loop_inline_state(struct bpf_verifier_env *env, u32 subprogno)
> +{
> +	struct bpf_loop_inline_state *state = &cur_aux(env)->loop_inline_state;
> +
> +	if (!state->initialized) {
> +		state->initialized = 1;
> +		state->fit_for_inline = loop_flag_is_zero(env);
> +		state->callback_subprogno = subprogno;
> +		return;
> +	}
> +
> +	if (!state->fit_for_inline)
> +		return;
> +
> +	state->fit_for_inline =
> +		loop_flag_is_zero(env) &&
> +		state->callback_subprogno == subprogno;

No need to heavy indent. Up to 100 char is fine.

> +static int optimize_bpf_loop(struct bpf_verifier_env *env)
> +{
> +	struct bpf_subprog_info *subprogs = env->subprog_info;
> +	int i, cur_subprog = 0, cnt, delta = 0;
> +	struct bpf_insn *insn = env->prog->insnsi;
> +	int insn_cnt = env->prog->len;
> +	u16 stack_depth = subprogs[cur_subprog].stack_depth;
> +	u16 stack_depth_extra = 0;
> +
> +	for (i = 0; i < insn_cnt; i++, insn++) {
> +		struct bpf_loop_inline_state *inline_state =
> +			&env->insn_aux_data[i + delta].loop_inline_state;
> +
> +		if (is_bpf_loop_call(insn) && inline_state->fit_for_inline) {
> +			struct bpf_prog *new_prog;
> +
> +			stack_depth_extra = BPF_REG_SIZE * 3;
> +			new_prog = inline_bpf_loop(env,
> +						   i + delta,
> +						   -(stack_depth + stack_depth_extra),

See the fix that just landed:
https://lore.kernel.org/bpf/20220616162037.535469-2-jakub@xxxxxxxxxxxxxx/

subprogs[cur_subprog].stack_depth may not be a multiple of 8.
But spill slots for r[678] have to be.
We need to round_up(,8) here and increase stack_depth_extra
accordingly.

The rest looks great. Thank you for working on it!
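
To make the round_up(,8) point concrete, here is a minimal user-space
sketch of the arithmetic only. It is not the actual verifier change.
round_up_8(), loop_stack_extra() and loop_spill_base_off() are
hypothetical names made up for this illustration; only BPF_REG_SIZE and
the stack_depth / stack_depth_extra quantities come from the patch.

```c
#include <assert.h>
#include <stdint.h>

#define BPF_REG_SIZE 8	/* same value as in the kernel */

/* Hypothetical stand-in for the kernel's round_up(x, 8). */
static uint16_t round_up_8(uint16_t x)
{
	return (uint16_t)((x + 7u) & ~7u);
}

/*
 * Hypothetical helper: extra stack to reserve for spilling r6, r7, r8.
 * Three 8-byte slots plus the padding needed to align the current depth.
 */
static uint16_t loop_stack_extra(uint16_t stack_depth)
{
	uint16_t stack_base = round_up_8(stack_depth);

	return (uint16_t)(BPF_REG_SIZE * 3 + (stack_base - stack_depth));
}

/* Hypothetical helper: frame-pointer offset of the lowest spill slot. */
static int loop_spill_base_off(uint16_t stack_depth)
{
	int total = stack_depth + loop_stack_extra(stack_depth);

	/* The spill area must start on an 8-byte boundary. */
	assert(total % BPF_REG_SIZE == 0);
	return -total;
}
```

With stack_depth == 4, the unrounded offset would be -(4 + 24) = -28,
which is not 8-byte aligned. With the rounding, stack_depth_extra
becomes 28 and the offset becomes -32.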