Re: [PATCH bpf V2 1/1] bpf: fix verification of indirect var-off stack access

Andrii Nakryiko <andrii.nakryiko@xxxxxxxxx> · Mon, 11 Dec 2023 11:21:22 -0800

On Sun, Dec 10, 2023 at 5:24 PM Andrei Matei <andreimatei1@xxxxxxxxx> wrote:
>
> On Sun, Dec 10, 2023 at 6:13 PM Kumar Kartikeya Dwivedi
> <memxor@xxxxxxxxx> wrote:
> >
> > On Sun, 10 Dec 2023 at 23:46, Andrei Matei <andreimatei1@xxxxxxxxx> wrote:
> > >
> > > On Fri, Dec 8, 2023 at 9:46 PM Kumar Kartikeya Dwivedi <memxor@xxxxxxxxx> wrote:
> > > >
> > > > On Fri, 8 Dec 2023 at 23:15, Andrii Nakryiko <andrii.nakryiko@xxxxxxxxx> wrote:
> > > > >
> > > > > On Thu, Dec 7, 2023 at 7:23 PM Andrei Matei <andreimatei1@xxxxxxxxx> wrote:
> > > > > >
> > > > > > [...]
> > > > > >
> > > > > > > >
> > > > > > > > Ack. Still, if you don't mind entertaining me further, two more questions:
> > > > > > > >
> > > > > > > > 1. What do you make of the code in check_mem_size_reg() [1] where we do
> > > > > > > >
> > > > > > > > if (reg->umin_value == 0) {
> > > > > > > >   err = check_helper_mem_access(env, regno - 1, 0,
> > > > > > > >         zero_size_allowed,
> > > > > > > >         meta);
> > > > > > > >
> > > > > > > > followed by
> > > > > > > >
> > > > > > > > err = check_helper_mem_access(env, regno - 1,
> > > > > > > >       reg->umax_value,
> > > > > > > >       zero_size_allowed, meta);
> > > > > > > >
> > > > > > > > [1] https://github.com/torvalds/linux/blob/bee0e7762ad2c6025b9f5245c040fcc36ef2bde8/kernel/bpf/verifier.c#L7486-L7489
> > > > > > > >
> > > > > > > > What's the point of the first check_helper_mem_access() call - the
> > > > > > > > zero-sized one
> > > > > > > > (given that we also have the second, broader, check)? Could it be
> > > > > > > > simply replaced by a
> > > > > > > >
> > > > > > > > if (reg->umin_value == 0 && !zero_sized_allowed)
> > > > > > > >     err = no_bueno;
> > > > > > > >
> > > > > > >
> > > > > > > Maybe Kumar (cc'ed) can chime in as well, but I suspect that's exactly
> > > > > > > this, and kind of similar to the min_off/max_off discussion we had. So
> > > > > > > yes, I suspect the above simple and straightforward check would be
> > > > > > > much more meaningful and targeted.
> > > > > >
> > > > > > I plan on trying this in a bit; sounds like you're encouraging it.
> > > > > >
> > > > > > >
> > > > > > > I gotta say that the reg->smin_value < 0 check is confusing, though,
> > > > > > > I'm not sure why we are mixing smin and umin/umax in this change...
> > > > > >
> > > > > > When you say "in this change", you mean in the existing code, yeah?  I'm not
> > > > >
> > > > > Yeah, sorry, words are hard. It's clearly a question about pre-existing code.
> > > > >
> > > > > > familiar enough with the smin/umin tracking to tell if `reg->smin_value >= 0`
> > > > > > (the condition that the function tests first) implies that
> > > > > > `reg->smin_value == reg->umin_value` (i.e. the fact that we're currently mixing
> > > > >
> > > > > this is probably true most of the time, but knowing how tricky this
> > > > > range tracking is, there is non-zero chance that this is not always
> > > > > true. Which is why I'm a bit confused why we are freely intermixing
> > > > > signed/unsigned range in this code.
> > > > >
> > > > > > smin/umin in check_mem_size_reg() is confusing, but benign).  Is that true? If
> > > > > > so, are you saying that check_mem_size_reg() should exclusively use smin/smax?
> > > > > >
> > > > >
> > > > > I'd like to hear from Kumar on what was the intent before suggesting
> > > > > changing anything.
> > > >
> > > > So this was not originally from me, I just happened to move it around
> > > > when adding support for this to kfuncs into a shared helper (if you
> > > > look at the git blame), it's hard for me to comment on the original
> > > > intent, I would know as much as anyone else.
> > > >
> > > > But to helpful, I digged around a bit and found the original patch
> > > > adding this snippet:
> > > >
> > > > 06c1c049721a ("bpf: allow helpers access to variable memory")
> > > >
> > > > It seems the main reason to add that < 0 check on min value was to
> > > > tell the user in the specific case where a spilled value is reloaded
> > > > from stack that they need to mask it again using bitwise operations,
> > > > because back then a spilled constant when reloaded would become
> > > > unknown, and when passed as a parameter to a helper the program would
> > > > be rejected with a weird error trying to access size greater than the
> > > > user specified in C.
> > > >
> > > > Now this change predates the signed/unsigned distinction, that came in:
> > > >
> > > > b03c9f9fdc37 ("bpf/verifier: track signed and unsigned min/max values")
> > > >
> > > > That changes reg->min_value to reg->smin_value, the < 0 comparison
> > > > only makes sense for that.
> > > > Since then that part of the code has stayed the same.
> > > >
> > > > So I think it would probably be better to just use smin/smax as you
> > > > discussed above upthread, also since the BPF_MAX_VAR_SIZE is INT_MAX,
> > > > so it shouldn't be a problem.
> > >
> > > Thank you for spelunking, Kumar!
> > > There were two discussions upthread:
> > > 1) whether the 0-size check_helper_mem_access() call in
> > > check_mem_size_reg() can be drastically simplified
> > > 2) whether the mixed use of smin with umin/umax makes sense.
> > >
> > > It seems that we've come up empty-handed on a good reasoning
> > > for 1). I have a patch that simplifies it AND also improves
> > > error messages as a result. I'm inclined to send it for your
> > > consideration, Andrii, if that's cool, as you also didn't
> > > seem to like the current code.
> > >
> >
> > While that's true, I think it should probably go into
> > check_helper_mem_access instead of being duplicated for handlers of
> > each switch statement.
> > It seems one of them (for PTR_TO_MAP_KEY) hardcodes it regardless of
> > what's passed in, and there's a special case of register_is_null which
> > is permitted.
> >
> > So it might be better to unify the handling in check_helper_mem_access
> > instead of its callers. Just my $0.02.
>
> My initial focus is getting check_mem_size_reg() to not call
> check_helper_mem_access() twice. That's what patch [1] does.
>
> Then, I tend to agree with you that
> check_helper_mem_access() forwarding zero_size_allowed() to
> a bunch of switch arms seems unnecessary; it also bothered
> me. I did try to do something about it for a bit - terminate
> the handling of zero_sized_allowed somewhere - but the thing
> is that the utilities used in that switch (e.g.
> check_mem_region_access()) are also called in other places,
> with both true and false for zero_sized_allowed. So I didn't
> immediately come up with something better and gave up for
> now. But if you have throughts, let's take it to the new
> patch I'd say.
>
> [1] https://lore.kernel.org/bpf/20231210225536.70322-1-andreimatei1@xxxxxxxxx/
>
>
> >
> > > About 2) -- the current mixing of smin/umin/umax actually
> > > makes sense to me.  I'd rationalize the (smin < 0) check as
> > > "does this value *look* like a negative value? If so,
> > > opportunistically give the user a nice error message". Even
> > > if the value did not actually come from a signed variable,
> > > but instead came for a very large unsigned, the program
> > > shouldn't verify anyway, so it's no big deal if the negative
> > > interpretation is erroneous. After that check, the value is
> > > exclusively treated as unsigned, since the size argument for
> > > which it is intended is unsigned.
> > > So, I think you can argue either way for the combinations of
> > > signed/unsigned checks that could be done, but I personally
> > > am not inclined to change the current code.
> > >
> >
> > I think based on the thread it would be better to atleast comments
> > explaining all this, even if in the end we don't decide to touch it.
>
> Yeah... But comment what exactly? I could put a comment on
> the (smin_value < 0) check saying something "if the value
> looks negative, assume that it came from a signed variable
> and give a helpful error message", and imply that the value
> should be treated as unsigned from that moment on. But then
> it gets confusing when, a few lines down,
> check_helper_mem_access() takes the size as `int` instead of
> `u32`. So the truth I'm not entirely sure what to say, plus
> Andrii might have other ideas about how the bike shed should
> be colored. If we build more consensus though, I'm all about
> adding comments.

I'm a bit too lazy to go figure this out right now, but my general
thinking regarding smin/umin checks is that I wouldn't mix them, it's
super confusing. Then the question of smin/smax vs umin/umax comes
down to how operation/instruction is interpreting the value of the
register. If it's treated as signed value, then consistently use
smin/smax for all the checks (umin/umax are irrelevant), if it's
treated as unsigned, then do just umin/umax.

I don't know what this specific logic uses, signed or unsigned, but we
should check and make it consistent.

If we want to be "helpful" in detecting potential situations with
under/overflow, then we can express that within the same numeric
domain with appropriate range checks.

Final rant. If we can avoid assuming relationships between umin/umax
and smin/smax ranges, we should. It's a tricky part of verifier code
(though which one isn't, right?), so the fewer assumptions - the
better.

>
>
> >
> > >
> > > [...]