On Wed, May 3, 2023 at 8:14 PM Alexei Starovoitov <alexei.starovoitov@xxxxxxxxx> wrote: > > On Tue, Apr 25, 2023 at 04:49:07PM -0700, Andrii Nakryiko wrote: > > Fix propagate_precision() logic to perform propagation of all necessary > > registers and stack slots across all active frames *in one batch step*. > > > > Doing this for each register/slot in each individual frame is wasteful, > > but the main problem is that backtracking of instruction in any frame > > except the deepest one just doesn't work. This is due to backtracking > > logic relying on jump history, and available jump history always starts > > (or ends, depending how you view it) in current frame. So, if > > prog A (frame #0) called subprog B (frame #1) and we need to propagate > > precision of, say, register R6 (callee-saved) within frame #0, we > > actually don't even know where jump history that corresponds to prog > > A even starts. We'd need to skip subprog part of jump history first to > > be able to do this. > > > > Luckily, with struct backtrack_state and __mark_chain_precision() > > handling bitmasks tracking/propagation across all active frames at the > > same time (added in previous patch), propagate_precision() can be both > > fixed and sped up by setting all the necessary bits across all frames > > and then performing one __mark_chain_precision() pass. This makes it > > unnecessary to skip subprog parts of jump history. > > > > We also improve logging along the way, to clearly specify which > > registers' and slots' precision markings are propagated within which > > frame. > > > > Fixes: 529409ea92d5 ("bpf: propagate precision across all frames, not just the last one") > > Signed-off-by: Andrii Nakryiko <andrii@xxxxxxxxxx> > > --- > > kernel/bpf/verifier.c | 49 +++++++++++++++++++++++++++---------------- > > 1 file changed, 31 insertions(+), 18 deletions(-) > > > > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c > > index 0b19b3d9af65..66d64ac10fb1 100644 > > --- a/kernel/bpf/verifier.c > > +++ b/kernel/bpf/verifier.c > > @@ -3885,14 +3885,12 @@ int mark_chain_precision(struct bpf_verifier_env *env, int regno) > > return __mark_chain_precision(env, env->cur_state->curframe, regno, -1); > > } > > > > -static int mark_chain_precision_frame(struct bpf_verifier_env *env, int frame, int regno) > > +static int mark_chain_precision_batch(struct bpf_verifier_env *env, int frame) > > { > > - return __mark_chain_precision(env, frame, regno, -1); > > -} > > - > > -static int mark_chain_precision_stack_frame(struct bpf_verifier_env *env, int frame, int spi) > > -{ > > - return __mark_chain_precision(env, frame, -1, spi); > > + /* we assume env->bt was set outside with desired reg and stack masks > > + * for all frames > > + */ > > + return __mark_chain_precision(env, frame, -1, -1); > > } > > > > static bool is_spillable_regtype(enum bpf_reg_type type) > > @@ -15308,20 +15306,25 @@ static int propagate_precision(struct bpf_verifier_env *env, > > struct bpf_reg_state *state_reg; > > struct bpf_func_state *state; > > int i, err = 0, fr; > > + bool first; > > > > for (fr = old->curframe; fr >= 0; fr--) { > > state = old->frame[fr]; > > state_reg = state->regs; > > + first = true; > > for (i = 0; i < BPF_REG_FP; i++, state_reg++) { > > if (state_reg->type != SCALAR_VALUE || > > !state_reg->precise || > > !(state_reg->live & REG_LIVE_READ)) > > continue; > > - if (env->log.level & BPF_LOG_LEVEL2) > > - verbose(env, "frame %d: propagating r%d\n", fr, i); > > - err = mark_chain_precision_frame(env, fr, i); > > - if (err < 0) > > - return err; > > + if (env->log.level & BPF_LOG_LEVEL2) { > > + if (first) > > + verbose(env, "frame %d: propagating r%d", fr, i); > > + else > > + verbose(env, ",r%d", i); > > + } > > + bt_set_frame_reg(&env->bt, fr, i); > > + first = false; > > } > > > > for (i = 0; i < state->allocated_stack / BPF_REG_SIZE; i++) { > > @@ -15332,14 +15335,24 @@ static int propagate_precision(struct bpf_verifier_env *env, > > !state_reg->precise || > > !(state_reg->live & REG_LIVE_READ)) > > continue; > > - if (env->log.level & BPF_LOG_LEVEL2) > > - verbose(env, "frame %d: propagating fp%d\n", > > - fr, (-i - 1) * BPF_REG_SIZE); > > - err = mark_chain_precision_stack_frame(env, fr, i); > > - if (err < 0) > > - return err; > > + if (env->log.level & BPF_LOG_LEVEL2) { > > + if (first) > > + verbose(env, "frame %d: propagating fp%d", > > + fr, (-i - 1) * BPF_REG_SIZE); > > + else > > + verbose(env, ",fp%d", (-i - 1) * BPF_REG_SIZE); > > + } > > + bt_set_frame_slot(&env->bt, fr, i); > > + first = false; > > } > > + if (!first) > > + verbose(env, "\n"); > > } > > + > > + err = mark_chain_precision_batch(env, old->curframe); > > The optimization makes sense, sort-of, but 'first' flag is to separate frames on > different lines ? Yes, first is purely for formatting. Each frame will have a separate line, but all registers and stack frames for that frame will be on that single line, like so: frame 1: propagating r1,r2,r3,fp-8,fp-16 frame 0: propagating r3,r9,fp-120 > The code in propagate_precision() before mark_chain_precision_batch() > will only print '... propagating...' few times. > regs and stack will be on one line anyway. > Is it that ugly to print all frames on one line in practice? > The 2nd and 3rd frames will be on one line too. Only 1st frame is on its own line? > I'm not getting the idea of 'first'. So the above example shows how it looks now. Same situation without this formatting would look like: frame 1: propagating r1 frame 1: propagating r2 frame 1: propagating r3 frame 1: propagating fp-8 frame 1: propagating fp-16 frame 0: propagating r3 frame 0: propagating r9 frame 0: propagating fp-120 I wanted to have the same formatted set of registers and stack slots, so make it more succinct and printing entire sets of registers/stack slots, consistent with the backtrack log format.