Re: [PATCH 1/5] bpf: Clear callee saved regs after updating REG0

Joanne Koong <joannelkoong@xxxxxxxxx> · Mon, 8 Aug 2022 16:32:39 -0700

On Mon, Aug 8, 2022 at 11:50 AM David Vernet <void@xxxxxxxxxxxxx> wrote:
>
> On Mon, Aug 08, 2022 at 11:14:48AM -0700, Joanne Koong wrote:
> > On Mon, Aug 8, 2022 at 8:53 AM David Vernet <void@xxxxxxxxxxxxx> wrote:
> > >
> > > In the verifier, we currently reset all of the registers containing caller
> > > saved args before updating the callee's return register (REG0). In a
> > > follow-on patch, we will need to be able to be able to inspect the caller
> > > saved registers when updating REG0 to determine if a dynptr that's passed
> > > to a helper function was allocated by a helper, or allocated by a program.
> > >
> > > This patch therefore updates check_helper_call() to clear the caller saved
> > > regs after updating REG0.
> > >
> > Overall lgtm
>
> Thanks for the quick review!
>
> > There's a patch [0] that finds + stores the ref obj id before the
> > caller saved regs get reset, which would make this patch not needed.
>
> Interesting. Indeed, that would solve this issue, and I'm fine with that
> approach as well, if not preferential to it.
>
> > That hasn't been merged in yet though and I think there's pros for
> > either approach.
> >
> > In the one where we find + store the ref obj id before any caller
> > saved regs get reset, the pro is that getting the dynptr metadata (eg
> > ref obj id and in the near future, the dynptr type as well) earlier
> > will be useful (eg when we add skb/xdp dynptrs [1], we'll need to know
> > the type of the dynptr in order to determine whether to set the return
> > reg as PTR_TO_PACKET). In this patch, the pro is that the logic is a
> > lot more obvious to readers that the ref obj id for the dynptr gets
> > found and set in order to store it in the return reg's ref obj id.
> >
> > I personally lean more towards the approach in [0] because I think
> > that ends up being cleaner for future extensibility, but I don't feel
> > strongly about it and would be happy going with this approach as well
>
> So, I think regardless of whether this gets merged, [0] is probably worth
> merging as I agree that it simplifies the current logic for setting the ref
> obj id and is a purely positive change on its own.
>
> When I was originally typing my response to your email, I was wondering
> whether it would be useful to keep the state in the caller saved registers
> for the logic in 7360 - 7489 in general for the future even if [0] is
> applied. It's probably simpler, however, to keep what we have now and just
> reset all of the registers. So if [0] gets merged, I can just remove this
> patch from the set.

sounds great!

>
> > [0] https://lore.kernel.org/bpf/20220722175807.4038317-1-joannelkoong@xxxxxxxxx/#t
> >
> > [1] https://lore.kernel.org/bpf/20220726184706.954822-1-joannelkoong@xxxxxxxxx/T/#t
> >
> > > Signed-off-by: David Vernet <void@xxxxxxxxxxxxx>
> > > ---
> > >  kernel/bpf/verifier.c | 15 ++++++++++-----
> > >  1 file changed, 10 insertions(+), 5 deletions(-)
> > >
> > > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> > > index 096fdac70165..938ba1536249 100644
> > > --- a/kernel/bpf/verifier.c
> > > +++ b/kernel/bpf/verifier.c
> > > @@ -7348,11 +7348,9 @@ static int check_helper_call(struct bpf_verifier_env *env, struct bpf_insn *insn
> > >         if (err)
> > >                 return err;
> > >
> > > -       /* reset caller saved regs */
> > > -       for (i = 0; i < CALLER_SAVED_REGS; i++) {
> > > -               mark_reg_not_init(env, regs, caller_saved[i]);
> > > -               check_reg_arg(env, caller_saved[i], DST_OP_NO_MARK);
> > > -       }
> > > +       /* reset return reg */
> > > +       mark_reg_not_init(env, regs, BPF_REG_0);
> > > +       check_reg_arg(env, BPF_REG_0, DST_OP_NO_MARK);
> > >
> > >         /* helper call returns 64-bit value. */
> > >         regs[BPF_REG_0].subreg_def = DEF_NOT_SUBREG;
> > > @@ -7488,6 +7486,13 @@ static int check_helper_call(struct bpf_verifier_env *env, struct bpf_insn *insn
> > >                 regs[BPF_REG_0].ref_obj_id = dynptr_id;
> > >         }
> > >
> > > +       /* reset remaining caller saved regs */
> > > +       BUILD_BUG_ON(caller_saved[0] != BPF_REG_0);
> >
> > nit: caller_saved is a read-only const, so I don't think this line is needed
>
> It being a read-only const is was why I made this a BUILD_BUG_ON. My
> intention here was to ensure that we're not accidentally skipping the
> resetting of caller_saved[0]. The original code iterated from
> caller_saved[0] -> caller_saved[CALLER_SAVED_REGS - 1]. Now that we're
> starting from caller_saved[1], this compile-time assertion verifies that
> we're not accidentally skipping caller_saved[0] by checking that it's the
> same as BPF_REG_0, which is reset above. Does that make sense?

I think it's an invariant that r0 - r5 are the caller saved args and
that caller_saved[0] will always be BPF_REG_0. I'm having a hard time
seeing a case where this would change in the future, but then again, I
am also not a fortune teller so maybe I am wrong here :) I don't think
it's a big deal though so I don't feel strongly about this

>
> > > +       for (i = 1; i < CALLER_SAVED_REGS; i++) {
> >
> > nit: maybe "for i = BPF_REG_1" ?
>
> Good suggestion, will apply in the v2 (if there is one and we don't decide
> to just go with [0] :-))
>
> Thanks,
> David