Re: [PATCH bpf V2 1/1] bpf: fix verification of indirect var-off stack access

Andrii Nakryiko <andrii.nakryiko@xxxxxxxxx> · Mon, 4 Dec 2023 16:43:44 -0800

On Mon, Dec 4, 2023 at 4:38 PM Andrei Matei <andreimatei1@xxxxxxxxx> wrote:
>
> On Mon, Dec 4, 2023 at 6:59 PM Andrii Nakryiko
> <andrii.nakryiko@xxxxxxxxx> wrote:
> >
> > On Mon, Dec 4, 2023 at 3:28 PM Andrei Matei <andreimatei1@xxxxxxxxx> wrote:
> > >
> > > On Mon, Dec 4, 2023 at 5:05 PM Andrii Nakryiko
> > > <andrii.nakryiko@xxxxxxxxx> wrote:
> > > >
> > > > On Mon, Dec 4, 2023 at 11:52 AM Andrei Matei <andreimatei1@xxxxxxxxx> wrote:
> > > > >
> > > > > [...]
> > > > >
> > > > > > >
> > > > > > > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> > > > > > > index af2819d5c8ee..b646bdde09cd 100644
> > > > > > > --- a/kernel/bpf/verifier.c
> > > > > > > +++ b/kernel/bpf/verifier.c
> > > > > > > @@ -6816,10 +6816,9 @@ static int check_stack_access_within_bounds(
> > > > > > >                         return -EACCES;
> > > > > > >                 }
> > > > > > >                 min_off = reg->smin_value + off;
> > > > > > > +               max_off = reg->smax_value + off;
> > > > > > >                 if (access_size > 0)
> > > > > > > -                       max_off = reg->smax_value + off + access_size - 1;
> > > > > > > -               else
> > > > > > > -                       max_off = min_off;
> > > > > > > +                       max_off += access_size - 1;
> > > > > >
> > > > > > this special casing of access_size == 0 feels wrong (and I mean before
> > > > > > your patch as well).
> > > > > >
> > > > > > Looking at the code, we only really calculate max_off to check that we
> > > > > > don't go to a non-negative stack offset, e.g., r10+0 or r10+1 (and
> > > > > > beyond).
> > > > > >
> > > > > > So given that, I propose to calculate max_off as an exclusive bound,
> > > > > > and instead of doing a mostly useless check_stack_slot_within_bounds()
> > > > > > call for it, just check that max_off is <= 0.
> > > > > >
> > > > > > Something like this:
> > > > > >
> > > > > > min_off = reg->smin_value + off;
> > > > > > max_off = reg->smax_value + off + access_size;
> > > > > > err = check_stack_slot_within_bounds(min_off, state, type);
> > > > > > if (!err && max_off > 0)
> > > > > >     err = -EINVAL; /* out of stack access into non-negative offsets */
> > > > >
> > > > > > Dealing with access_size == 0 indeed feels dubious to me, but I'm not
> > > > > > entirely sure that your suggested code is better. min_off being inclusive
> > > > > > and max_off being exclusive seems surprising. I'll do it if you want; I
> > > > > > don't care too much. We could keep max_off inclusive, and still not call
> > > > > > check_stack_slot_within_bounds() for it:
> > > > >
> > > > >  min_off = reg->smin_value + off;
> > > > >  max_off = reg->smax_value + off + access_size - 1;
> > > > >  err = check_stack_slot_within_bounds(min_off, state, type);
> > > > >  if (!err && max_off >= 0)
> > > > >      err = -EINVAL; /* out of stack access into non-negative offsets */
> > > > >
> > > >
> > > > > Yeah, we can do that. The reason I go for max_off being exclusive is
> > > > > that half-open ranges are very convenient: [start, end) (end
> > > > > exclusive) is much more uniform and natural to handle than the
> > > > > closed [start, end] (end inclusive) in all sorts of checks, including
> > > > > handling empty ranges. The math just works out better and more
> > > > > naturally. And it's not like this will be the first time we have
> > > > > half-open ranges in BPF.
> > >
> > > Yeah, after hitting send, I was also thinking that half-open is the more common
> > > interval representation; it just wasn't how this code right here was written.
> > > Will do.
> > >
> > > >
> > > > > But now max_off can be below min_off, which again seems confusing.
> > > >
> > > > That's ok, the point here is to validate that we don't access stack
> > > > out of bounds.
> > > >
> > > > >
> > > > > What I'd really like to know is whether this whole zero access_size business
> > > > > deserves to exist. Do you know what the point of verifying a zero-sized access
> > > > > is exactly / could we turn 0-byte access into 1-byte accesses and
> > > > > verify that instead?
> > > > > Because then there'd be no more special case to consider.
> > > > >
> > > >
> > > >
> > > > I think zero is a natural case that can come up, at least with
> > > > helpers. As we have ARG_CONST_SIZE_OR_ZERO. So yeah, I wouldn't treat
> > > > zero-sized access as 1-byte access, that seems to be more confusing
> > > > and potentially broken.
> > >
> > > Ack. Still, if you don't mind entertaining me further, two more questions:
> > >
> > > 1. What do you make of the code in check_mem_size_reg() [1] where we do
> > >
> > > if (reg->umin_value == 0) {
> > >   err = check_helper_mem_access(env, regno - 1, 0,
> > >         zero_size_allowed,
> > >         meta);
> > >
> > > followed by
> > >
> > > err = check_helper_mem_access(env, regno - 1,
> > >       reg->umax_value,
> > >       zero_size_allowed, meta);
> > >
> > > [1] https://github.com/torvalds/linux/blob/bee0e7762ad2c6025b9f5245c040fcc36ef2bde8/kernel/bpf/verifier.c#L7486-L7489
> > >
> > > > What's the point of the first check_helper_mem_access() call, the
> > > > zero-sized one (given that we also have the second, broader, check)?
> > > > Could it be simply replaced by a
> > > >
> > > > if (reg->umin_value == 0 && !zero_size_allowed)
> > > >     err = no_bueno;
> > >
> >
> > Maybe Kumar (cc'ed) can chime in as well, but I suspect that's exactly
> > this, and kind of similar to the min_off/max_off discussion we had. So
> > yes, I suspect the above simple and straightforward check would be
> > much more meaningful and targeted.
> >
> > I gotta say that the reg->smin_value < 0 check is confusing, though,
> > I'm not sure why we are mixing smin and umin/umax in this change...
> >
> > > ?
> > >
> > > > 2. I believe you're saying that, if we were to verify zero-sized
> > > > accesses as 1-byte-sized accesses, we might refuse some accesses that
> > > > we permit today, and that wouldn't be good. But what about permitting
> > > > zero-sized accesses with no further checks - i.e. considering *any*
> > > > pointer value to be ok when the access_size == 0? Would that be bad?
> > > > The question is, morally, what checks are important (if any) when the
> > > > size of access is zero?
> > > > Or to phrase it another way - when a helper is called with a zero
> > > > access size, do we expect the helper to do anything with that pointer,
> > > > or do we expect the helper to be a no-op?
> >
> > Helper itself might not be a no-op, but it should not write back to
> > that pointer for sure. But I'd hate to have more special casing for
> > zero-size read/write than necessary. So if we can structure the logic
> > in a way that zero is a natural extension, I'd do that.
>
> Well but the thing is, the way I see it, we *currently* have a lot of
> special casing for zero access_size - we carry this zero_size_allowed
> argument to a bunch of places.
> So I was thinking that maybe we could get rid of all that by terminating
> the verification of zero-sized accesses in check_helper_mem_access():
> if access_size == 0, either return an error if !zero_size_allowed, or
> otherwise return success with no further verification.
>

Maybe, but let's do it one step at a time. Let's fix the current issue
first; supporting max_off with a zero access_size seems easy, so let's do
that for now. We can have a separate patch/patch set to simplify
zero-size arguments.
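
For concreteness, here is a minimal standalone sketch of the bounds check
discussed above: inclusive min_off, exclusive max_off, and 64-bit math. It
is not the actual verifier code. The sketch_* names, the plain integer
arguments and the fixed 512-byte lower limit (standing in for
MAX_BPF_STACK) are assumptions made for illustration, and the real
check_stack_slot_within_bounds() takes more context than an offset.

```c
#include <errno.h>
#include <stdint.h>

/* Assumed lower stack limit for this sketch; stands in for MAX_BPF_STACK. */
#define SKETCH_MAX_STACK 512

/* Hypothetical stand-in for check_stack_slot_within_bounds(): the lowest
 * byte touched must lie in [-SKETCH_MAX_STACK, -1].
 */
static int sketch_slot_within_bounds(int64_t off)
{
	if (off < -SKETCH_MAX_STACK || off > -1)
		return -EACCES;
	return 0;
}

/* Validate a variable-offset stack access whose register value lies in
 * [smin, smax] and whose instruction offset is off.
 * min_off is inclusive and max_off is exclusive, so access_size == 0
 * needs no special case. The sums are done in 64 bits so that
 * smax + off + access_size cannot overflow even if off is a full s32.
 */
static int sketch_check_stack_access(int64_t smin, int64_t smax,
				     int32_t off, int32_t access_size)
{
	int64_t min_off = smin + off;
	int64_t max_off = smax + off + access_size;
	int err;

	err = sketch_slot_within_bounds(min_off);
	if (!err && max_off > 0)
		err = -EINVAL; /* access reaches non-negative offsets */
	return err;
}
```

With smin == smax == -8 and access_size == 0, min_off and max_off are both
-8 and the check passes without any else branch for the zero-size case.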

> >
> > >
> > > Thank you!
> > >
> > >
> > > >
> > > > > >
> > > > > >
> > > > > > Now, one more issue that jumped out at me is that we calculate min/max
> > > > > > off as a sum of smin/smax values (which are checked to be within
> > > > > > +/-1<<29, all good so far) *and* insn->off, which can be a full s32,
> > > > > > it seems. So we are running into overflow/underflow territory with
> > > > > > using int for min_off/max_off.
> > > > > >
> > > > > > While you are at it, can you please use s64 for all these calculations? Thanks!
> > > > > >
> > > > > >
> > > > > > >         }
> > > > > > >
> > > > > > >         err = check_stack_slot_within_bounds(min_off, state, type);
> > > > >
> > > > > Will do.