On Tue, Apr 25, 2023 at 04:49:07PM -0700, Andrii Nakryiko wrote: > Fix propagate_precision() logic to perform propagation of all necessary > registers and stack slots across all active frames *in one batch step*. > > Doing this for each register/slot in each individual frame is wasteful, > but the main problem is that backtracking of instruction in any frame > except the deepest one just doesn't work. This is due to backtracking > logic relying on jump history, and available jump history always starts > (or ends, depending how you view it) in current frame. So, if > prog A (frame #0) called subprog B (frame #1) and we need to propagate > precision of, say, register R6 (callee-saved) within frame #0, we > actually don't even know where jump history that corresponds to prog > A even starts. We'd need to skip subprog part of jump history first to > be able to do this. > > Luckily, with struct backtrack_state and __mark_chain_precision() > handling bitmasks tracking/propagation across all active frames at the > same time (added in previous patch), propagate_precision() can be both > fixed and sped up by setting all the necessary bits across all frames > and then performing one __mark_chain_precision() pass. This makes it > unnecessary to skip subprog parts of jump history. > > We also improve logging along the way, to clearly specify which > registers' and slots' precision markings are propagated within which > frame. > > Fixes: 529409ea92d5 ("bpf: propagate precision across all frames, not just the last one") > Signed-off-by: Andrii Nakryiko <andrii@xxxxxxxxxx> > --- > kernel/bpf/verifier.c | 49 +++++++++++++++++++++++++++---------------- > 1 file changed, 31 insertions(+), 18 deletions(-) > > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c > index 0b19b3d9af65..66d64ac10fb1 100644 > --- a/kernel/bpf/verifier.c > +++ b/kernel/bpf/verifier.c > @@ -3885,14 +3885,12 @@ int mark_chain_precision(struct bpf_verifier_env *env, int regno) > return __mark_chain_precision(env, env->cur_state->curframe, regno, -1); > } > > -static int mark_chain_precision_frame(struct bpf_verifier_env *env, int frame, int regno) > +static int mark_chain_precision_batch(struct bpf_verifier_env *env, int frame) > { > - return __mark_chain_precision(env, frame, regno, -1); > -} > - > -static int mark_chain_precision_stack_frame(struct bpf_verifier_env *env, int frame, int spi) > -{ > - return __mark_chain_precision(env, frame, -1, spi); > + /* we assume env->bt was set outside with desired reg and stack masks > + * for all frames > + */ > + return __mark_chain_precision(env, frame, -1, -1); > } > > static bool is_spillable_regtype(enum bpf_reg_type type) > @@ -15308,20 +15306,25 @@ static int propagate_precision(struct bpf_verifier_env *env, > struct bpf_reg_state *state_reg; > struct bpf_func_state *state; > int i, err = 0, fr; > + bool first; > > for (fr = old->curframe; fr >= 0; fr--) { > state = old->frame[fr]; > state_reg = state->regs; > + first = true; > for (i = 0; i < BPF_REG_FP; i++, state_reg++) { > if (state_reg->type != SCALAR_VALUE || > !state_reg->precise || > !(state_reg->live & REG_LIVE_READ)) > continue; > - if (env->log.level & BPF_LOG_LEVEL2) > - verbose(env, "frame %d: propagating r%d\n", fr, i); > - err = mark_chain_precision_frame(env, fr, i); > - if (err < 0) > - return err; > + if (env->log.level & BPF_LOG_LEVEL2) { > + if (first) > + verbose(env, "frame %d: propagating r%d", fr, i); > + else > + verbose(env, ",r%d", i); > + } > + bt_set_frame_reg(&env->bt, fr, i); > + first = false; > } > > for (i = 0; i < state->allocated_stack / BPF_REG_SIZE; i++) { > @@ -15332,14 +15335,24 @@ static int propagate_precision(struct bpf_verifier_env *env, > !state_reg->precise || > !(state_reg->live & REG_LIVE_READ)) > continue; > - if (env->log.level & BPF_LOG_LEVEL2) > - verbose(env, "frame %d: propagating fp%d\n", > - fr, (-i - 1) * BPF_REG_SIZE); > - err = mark_chain_precision_stack_frame(env, fr, i); > - if (err < 0) > - return err; > + if (env->log.level & BPF_LOG_LEVEL2) { > + if (first) > + verbose(env, "frame %d: propagating fp%d", > + fr, (-i - 1) * BPF_REG_SIZE); > + else > + verbose(env, ",fp%d", (-i - 1) * BPF_REG_SIZE); > + } > + bt_set_frame_slot(&env->bt, fr, i); > + first = false; > } > + if (!first) > + verbose(env, "\n"); > } > + > + err = mark_chain_precision_batch(env, old->curframe); The optimization makes sense, sort-of, but 'first' flag is to separate frames on different lines ? The code in propagate_precision() before mark_chain_precision_batch() will only print '... propagating...' few times. regs and stack will be on one line anyway. Is it that ugly to print all frames on one line in practice? The 2nd and 3rd frames will be on one line too. Only 1st frame is on its own line? I'm not getting the idea of 'first'.