Re: [PATCH bpf-next 06/10] bpf: fix propagate_precision() logic for inner frames

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, May 3, 2023 at 8:14 PM Alexei Starovoitov
<alexei.starovoitov@xxxxxxxxx> wrote:
>
> On Tue, Apr 25, 2023 at 04:49:07PM -0700, Andrii Nakryiko wrote:
> > Fix propagate_precision() logic to perform propagation of all necessary
> > registers and stack slots across all active frames *in one batch step*.
> >
> > Doing this for each register/slot in each individual frame is wasteful,
> > but the main problem is that backtracking of instruction in any frame
> > except the deepest one just doesn't work. This is due to backtracking
> > logic relying on jump history, and available jump history always starts
> > (or ends, depending how you view it) in current frame. So, if
> > prog A (frame #0) called subprog B (frame #1) and we need to propagate
> > precision of, say, register R6 (callee-saved) within frame #0, we
> > actually don't even know where jump history that corresponds to prog
> > A even starts. We'd need to skip subprog part of jump history first to
> > be able to do this.
> >
> > Luckily, with struct backtrack_state and __mark_chain_precision()
> > handling bitmasks tracking/propagation across all active frames at the
> > same time (added in previous patch), propagate_precision() can be both
> > fixed and sped up by setting all the necessary bits across all frames
> > and then performing one __mark_chain_precision() pass. This makes it
> > unnecessary to skip subprog parts of jump history.
> >
> > We also improve logging along the way, to clearly specify which
> > registers' and slots' precision markings are propagated within which
> > frame.
> >
> > Fixes: 529409ea92d5 ("bpf: propagate precision across all frames, not just the last one")
> > Signed-off-by: Andrii Nakryiko <andrii@xxxxxxxxxx>
> > ---
> >  kernel/bpf/verifier.c | 49 +++++++++++++++++++++++++++----------------
> >  1 file changed, 31 insertions(+), 18 deletions(-)
> >
> > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> > index 0b19b3d9af65..66d64ac10fb1 100644
> > --- a/kernel/bpf/verifier.c
> > +++ b/kernel/bpf/verifier.c
> > @@ -3885,14 +3885,12 @@ int mark_chain_precision(struct bpf_verifier_env *env, int regno)
> >       return __mark_chain_precision(env, env->cur_state->curframe, regno, -1);
> >  }
> >
> > -static int mark_chain_precision_frame(struct bpf_verifier_env *env, int frame, int regno)
> > +static int mark_chain_precision_batch(struct bpf_verifier_env *env, int frame)
> >  {
> > -     return __mark_chain_precision(env, frame, regno, -1);
> > -}
> > -
> > -static int mark_chain_precision_stack_frame(struct bpf_verifier_env *env, int frame, int spi)
> > -{
> > -     return __mark_chain_precision(env, frame, -1, spi);
> > +     /* we assume env->bt was set outside with desired reg and stack masks
> > +      * for all frames
> > +      */
> > +     return __mark_chain_precision(env, frame, -1, -1);
> >  }
> >
> >  static bool is_spillable_regtype(enum bpf_reg_type type)
> > @@ -15308,20 +15306,25 @@ static int propagate_precision(struct bpf_verifier_env *env,
> >       struct bpf_reg_state *state_reg;
> >       struct bpf_func_state *state;
> >       int i, err = 0, fr;
> > +     bool first;
> >
> >       for (fr = old->curframe; fr >= 0; fr--) {
> >               state = old->frame[fr];
> >               state_reg = state->regs;
> > +             first = true;
> >               for (i = 0; i < BPF_REG_FP; i++, state_reg++) {
> >                       if (state_reg->type != SCALAR_VALUE ||
> >                           !state_reg->precise ||
> >                           !(state_reg->live & REG_LIVE_READ))
> >                               continue;
> > -                     if (env->log.level & BPF_LOG_LEVEL2)
> > -                             verbose(env, "frame %d: propagating r%d\n", fr, i);
> > -                     err = mark_chain_precision_frame(env, fr, i);
> > -                     if (err < 0)
> > -                             return err;
> > +                     if (env->log.level & BPF_LOG_LEVEL2) {
> > +                             if (first)
> > +                                     verbose(env, "frame %d: propagating r%d", fr, i);
> > +                             else
> > +                                     verbose(env, ",r%d", i);
> > +                     }
> > +                     bt_set_frame_reg(&env->bt, fr, i);
> > +                     first = false;
> >               }
> >
> >               for (i = 0; i < state->allocated_stack / BPF_REG_SIZE; i++) {
> > @@ -15332,14 +15335,24 @@ static int propagate_precision(struct bpf_verifier_env *env,
> >                           !state_reg->precise ||
> >                           !(state_reg->live & REG_LIVE_READ))
> >                               continue;
> > -                     if (env->log.level & BPF_LOG_LEVEL2)
> > -                             verbose(env, "frame %d: propagating fp%d\n",
> > -                                     fr, (-i - 1) * BPF_REG_SIZE);
> > -                     err = mark_chain_precision_stack_frame(env, fr, i);
> > -                     if (err < 0)
> > -                             return err;
> > +                     if (env->log.level & BPF_LOG_LEVEL2) {
> > +                             if (first)
> > +                                     verbose(env, "frame %d: propagating fp%d",
> > +                                             fr, (-i - 1) * BPF_REG_SIZE);
> > +                             else
> > +                                     verbose(env, ",fp%d", (-i - 1) * BPF_REG_SIZE);
> > +                     }
> > +                     bt_set_frame_slot(&env->bt, fr, i);
> > +                     first = false;
> >               }
> > +             if (!first)
> > +                     verbose(env, "\n");
> >       }
> > +
> > +     err = mark_chain_precision_batch(env, old->curframe);
>
> The optimization makes sense, sort-of, but 'first' flag is to separate frames on
> different lines ?

Yes, first is purely for formatting. Each frame will have a separate
line, but all registers and stack frames for that frame will be on
that single line, like so:

frame 1: propagating r1,r2,r3,fp-8,fp-16
frame 0: propagating r3,r9,fp-120

> The code in propagate_precision() before mark_chain_precision_batch()
> will only print '... propagating...' few times.
> regs and stack will be on one line anyway.
> Is it that ugly to print all frames on one line in practice?
> The 2nd and 3rd frames will be on one line too. Only 1st frame is on its own line?
> I'm not getting the idea of 'first'.

So the above example shows how it looks now. Same situation without
this formatting would look like:

frame 1: propagating r1
frame 1: propagating r2
frame 1: propagating r3
frame 1: propagating fp-8
frame 1: propagating fp-16
frame 0: propagating r3
frame 0: propagating r9
frame 0: propagating fp-120

I wanted to have the same formatted set of registers and stack slots,
so make it more succinct and printing entire sets of registers/stack
slots, consistent with the backtrack log format.





[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux