Re: [PATCH bpf-next 06/10] bpf: fix propagate_precision() logic for inner frames

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, May 04, 2023 at 01:57:15PM -0700, Andrii Nakryiko wrote:
> On Wed, May 3, 2023 at 8:14 PM Alexei Starovoitov
> <alexei.starovoitov@xxxxxxxxx> wrote:
> >
> > On Tue, Apr 25, 2023 at 04:49:07PM -0700, Andrii Nakryiko wrote:
> > > Fix propagate_precision() logic to perform propagation of all necessary
> > > registers and stack slots across all active frames *in one batch step*.
> > >
> > > Doing this for each register/slot in each individual frame is wasteful,
> > > but the main problem is that backtracking of instruction in any frame
> > > except the deepest one just doesn't work. This is due to backtracking
> > > logic relying on jump history, and available jump history always starts
> > > (or ends, depending how you view it) in current frame. So, if
> > > prog A (frame #0) called subprog B (frame #1) and we need to propagate
> > > precision of, say, register R6 (callee-saved) within frame #0, we
> > > actually don't even know where jump history that corresponds to prog
> > > A even starts. We'd need to skip subprog part of jump history first to
> > > be able to do this.
> > >
> > > Luckily, with struct backtrack_state and __mark_chain_precision()
> > > handling bitmasks tracking/propagation across all active frames at the
> > > same time (added in previous patch), propagate_precision() can be both
> > > fixed and sped up by setting all the necessary bits across all frames
> > > and then performing one __mark_chain_precision() pass. This makes it
> > > unnecessary to skip subprog parts of jump history.
> > >
> > > We also improve logging along the way, to clearly specify which
> > > registers' and slots' precision markings are propagated within which
> > > frame.
> > >
> > > Fixes: 529409ea92d5 ("bpf: propagate precision across all frames, not just the last one")
> > > Signed-off-by: Andrii Nakryiko <andrii@xxxxxxxxxx>
> > > ---
> > >  kernel/bpf/verifier.c | 49 +++++++++++++++++++++++++++----------------
> > >  1 file changed, 31 insertions(+), 18 deletions(-)
> > >
> > > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> > > index 0b19b3d9af65..66d64ac10fb1 100644
> > > --- a/kernel/bpf/verifier.c
> > > +++ b/kernel/bpf/verifier.c
> > > @@ -3885,14 +3885,12 @@ int mark_chain_precision(struct bpf_verifier_env *env, int regno)
> > >       return __mark_chain_precision(env, env->cur_state->curframe, regno, -1);
> > >  }
> > >
> > > -static int mark_chain_precision_frame(struct bpf_verifier_env *env, int frame, int regno)
> > > +static int mark_chain_precision_batch(struct bpf_verifier_env *env, int frame)
> > >  {
> > > -     return __mark_chain_precision(env, frame, regno, -1);
> > > -}
> > > -
> > > -static int mark_chain_precision_stack_frame(struct bpf_verifier_env *env, int frame, int spi)
> > > -{
> > > -     return __mark_chain_precision(env, frame, -1, spi);
> > > +     /* we assume env->bt was set outside with desired reg and stack masks
> > > +      * for all frames
> > > +      */
> > > +     return __mark_chain_precision(env, frame, -1, -1);
> > >  }
> > >
> > >  static bool is_spillable_regtype(enum bpf_reg_type type)
> > > @@ -15308,20 +15306,25 @@ static int propagate_precision(struct bpf_verifier_env *env,
> > >       struct bpf_reg_state *state_reg;
> > >       struct bpf_func_state *state;
> > >       int i, err = 0, fr;
> > > +     bool first;
> > >
> > >       for (fr = old->curframe; fr >= 0; fr--) {
> > >               state = old->frame[fr];
> > >               state_reg = state->regs;
> > > +             first = true;
> > >               for (i = 0; i < BPF_REG_FP; i++, state_reg++) {
> > >                       if (state_reg->type != SCALAR_VALUE ||
> > >                           !state_reg->precise ||
> > >                           !(state_reg->live & REG_LIVE_READ))
> > >                               continue;
> > > -                     if (env->log.level & BPF_LOG_LEVEL2)
> > > -                             verbose(env, "frame %d: propagating r%d\n", fr, i);
> > > -                     err = mark_chain_precision_frame(env, fr, i);
> > > -                     if (err < 0)
> > > -                             return err;
> > > +                     if (env->log.level & BPF_LOG_LEVEL2) {
> > > +                             if (first)
> > > +                                     verbose(env, "frame %d: propagating r%d", fr, i);
> > > +                             else
> > > +                                     verbose(env, ",r%d", i);
> > > +                     }
> > > +                     bt_set_frame_reg(&env->bt, fr, i);
> > > +                     first = false;
> > >               }
> > >
> > >               for (i = 0; i < state->allocated_stack / BPF_REG_SIZE; i++) {
> > > @@ -15332,14 +15335,24 @@ static int propagate_precision(struct bpf_verifier_env *env,
> > >                           !state_reg->precise ||
> > >                           !(state_reg->live & REG_LIVE_READ))
> > >                               continue;
> > > -                     if (env->log.level & BPF_LOG_LEVEL2)
> > > -                             verbose(env, "frame %d: propagating fp%d\n",
> > > -                                     fr, (-i - 1) * BPF_REG_SIZE);
> > > -                     err = mark_chain_precision_stack_frame(env, fr, i);
> > > -                     if (err < 0)
> > > -                             return err;
> > > +                     if (env->log.level & BPF_LOG_LEVEL2) {
> > > +                             if (first)
> > > +                                     verbose(env, "frame %d: propagating fp%d",
> > > +                                             fr, (-i - 1) * BPF_REG_SIZE);
> > > +                             else
> > > +                                     verbose(env, ",fp%d", (-i - 1) * BPF_REG_SIZE);
> > > +                     }
> > > +                     bt_set_frame_slot(&env->bt, fr, i);
> > > +                     first = false;
> > >               }
> > > +             if (!first)
> > > +                     verbose(env, "\n");
> > >       }
> > > +
> > > +     err = mark_chain_precision_batch(env, old->curframe);
> >
> > The optimization makes sense, sort-of, but 'first' flag is to separate frames on
> > different lines ?
> 
> Yes, first is purely for formatting. Each frame will have a separate
> line, but all registers and stack frames for that frame will be on
> that single line, like so:
> 
> frame 1: propagating r1,r2,r3,fp-8,fp-16
> frame 0: propagating r3,r9,fp-120

I see. The example of the output helped a lot.
Now the code makes sense. Pls include it in a commit or a comment.




[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux