On Mon, Mar 4, 2024 at 3:55 AM Eduard Zingerman <eddyz87@xxxxxxxxx> wrote: > > On Fri, 2024-03-01 at 18:00 -0800, Alexei Starovoitov wrote: > > From: Alexei Starovoitov <ast@xxxxxxxxxx> > > > > Introduce may_goto instruction that acts on a hidden bpf_iter_num, so that > > bpf_iter_num_new(), bpf_iter_num_destroy() don't need to be called explicitly. > > It can be used in any normal "for" or "while" loop, like > > > > for (i = zero; i < cnt; cond_break, i++) { > > > > The verifier recognizes that may_goto is used in the program, > > reserves additional 8 bytes of stack, initializes them in subprog > > prologue, and replaces may_goto instruction with: > > aux_reg = *(u64 *)(fp - 40) > > if aux_reg == 0 goto pc+off > > aux_reg += 1 > > *(u64 *)(fp - 40) = aux_reg > > [...] > > Modulo instruction validation issue you pointed out offlist I don't > see any obvious issues with this patch. Two notes below. > > [...] > > > @@ -19406,7 +19466,10 @@ static int do_misc_fixups(struct bpf_verifier_env *env) > > struct bpf_insn insn_buf[16]; > > struct bpf_prog *new_prog; > > struct bpf_map *map_ptr; > > - int i, ret, cnt, delta = 0; > > + int i, ret, cnt, delta = 0, cur_subprog = 0; > > + struct bpf_subprog_info *subprogs = env->subprog_info; > > + u16 stack_depth = subprogs[cur_subprog].stack_depth; > > + u16 stack_depth_extra = 0; > > Note: optimize_bpf_loop() has very similar logic, > but there stack_depth is rounded up: > > u16 stack_depth = subprogs[cur_subprog].stack_depth; > u16 stack_depth_roundup = round_up(stack_depth, 8) - stack_depth; > u16 stack_depth_extra = 0; > ... > stack_depth_extra = BPF_REG_SIZE * 3 + stack_depth_roundup; > > And stack base for tmp variables is computed as "-(stack_depth + stack_depth_extra)". I did a sloppy review when I landed your commit 1ade23711971 ("bpf: Inline calls to bpf_loop when callback is known") round_up is unnecessary. The stack is always incremented in power of 8. See grow_stack_state(). > > > > if (env->seen_exception && !env->exception_callback_subprog) { > > struct bpf_insn patch[] = { > > @@ -19426,7 +19489,7 @@ static int do_misc_fixups(struct bpf_verifier_env *env) > > mark_subprog_exc_cb(env, env->exception_callback_subprog); > > } > > > > - for (i = 0; i < insn_cnt; i++, insn++) { > > + for (i = 0; i < insn_cnt;) { > > /* Make divide-by-zero exceptions impossible. */ > > if (insn->code == (BPF_ALU64 | BPF_MOD | BPF_X) || > > insn->code == (BPF_ALU64 | BPF_DIV | BPF_X) || > > [...] > > > @@ -19950,6 +20033,39 @@ static int do_misc_fixups(struct bpf_verifier_env *env) > > return -EFAULT; > > } > > insn->imm = fn->func - __bpf_call_base; > > +next_insn: > > + if (subprogs[cur_subprog + 1].start == i + delta + 1) { > > + subprogs[cur_subprog].stack_depth += stack_depth_extra; > > + subprogs[cur_subprog].stack_extra = stack_depth_extra; > > + cur_subprog++; > > + stack_depth = subprogs[cur_subprog].stack_depth; > > + stack_depth_extra = 0; > > + } > > + i++; insn++; > > + } > > + > > + env->prog->aux->stack_depth = subprogs[0].stack_depth; > > + for (i = 0; i < env->subprog_cnt; i++) { > > + int subprog_start = subprogs[i].start, j; > > + int stack_slots = subprogs[i].stack_extra / 8; > > + > > + if (stack_slots >= ARRAY_SIZE(insn_buf)) { > > + verbose(env, "verifier bug: stack_extra is too large\n"); > > + return -EFAULT; > > + } > > + > > + /* Add insns to subprog prologue to zero init extra stack */ > > + for (j = 0; j < stack_slots; j++) > > + insn_buf[j] = BPF_ST_MEM(BPF_DW, BPF_REG_FP, > > + -subprogs[i].stack_depth + j * 8, BPF_MAX_LOOPS); > > Nit: the comment says that stack is zero initialized, > while it is actually set to BPF_MAX_LOOPS. Good catch. Will fix.