Re: [PATCH bpf v1 0/5] More fixes for crashes due to bad PTR_TO_BTF_ID reg->off

Alexei Starovoitov <alexei.starovoitov@xxxxxxxxx> · Mon, 21 Feb 2022 12:45:37 -0800

On Sat, Feb 19, 2022 at 6:59 PM Kumar Kartikeya Dwivedi
<memxor@xxxxxxxxx> wrote:
>
> On Sun, Feb 20, 2022 at 07:48:08AM IST, Alexei Starovoitov wrote:
> > On Sat, Feb 19, 2022 at 05:40:35PM +0530, Kumar Kartikeya Dwivedi wrote:
> > > On Sat, Feb 19, 2022 at 05:07:39PM IST, Kumar Kartikeya Dwivedi wrote:
> > > > A few more fixes for bad PTR_TO_BTF_ID reg->off being accepted in places, that
> > > > can lead to the kernel crashing. Noticed while making sure my own series for BTF
> > > > ID pointer in map won't allow stores for pointers with incorrect offsets.
> > > >
> > > > I include one example where d_path can crash even if you NULL check
> > > > PTR_TO_BTF_ID, and one example of how missing NULL check in helper taking
> > > > PTR_TO_BTF_ID (like bpf_sock_from_file) is a real problem, see the selftest
> > > > patch.
> > > >
> > > > The &f->f_path becomes NULL + offset in case f is NULL, circumventing NULL
> > > > checks in existing helpers. The only thing needed to trigger this finding an
> > > > object that embeds the object of interest, and then somehow obtaining a NULL
> > > > PTR_TO_BTF_ID to it (not hard, esp. due to exception handler for PROBE_MEM loads
> > > > writing 0 to destination register).
> > > >
> > > > However, for the case of patch 2, it is allowed in my series since the next load
> > > > of the bad pointer stored using:
> > > >   struct file *f = ...; // some pointer walking returning NULL pointer
> > > >   map_val->ptr = &f->f_path; // ptr being struct path *
> > > > ... would be marked as PTR_UNTRUSTED, so it won't be allowed to be passed into
> > > > the kernel, and hence can be permitted. In referenced case, the PTR_TO_BTF_ID
> > > > should not be NULL anyway. kptr_get style helper takes PTR_TO_MAP_VALUE in
> > > > referenced ptr case only, so the load either yields NULL or RCU protected
> > > > pointer.
> > > >
> > > > Tests for patch 1 depend on fixup_kfunc_btf_id in test_verifier, hence will be
> > > > sent after merge window opens, some other changes after bpf tree merges into
> > > > bpf-next, but all pending ones can be seen here [0]. Tests for patch 2 are
> > > > included, and try to trigger crash without the fix, but it's not 100% reliable.
> > > > We may need special testing helpers or kfuncs to make it thorough, but wanted to
> > > > wait before getting feedback.
> > > >
> > > > Issue fixed by patch 2 is a bit more broader in scope, and would require proper
> > > > discussion (before being applied) on the correct way forward, as it is
> > > > technically backwards incompatible change, but hopefully never breaks real
> > > > programs, only malicious or already incorrect ones.
> > > >
> > > > Also, please suggest the right "Fixes" tag for patch 2.
> > > >
> > > > As for patch 3 (selftest), please suggest a better way to get a certain type of
> > > > PTR_TO_BTF_ID which can be NULL or NULL+offset. Can we add kfuncs for testing
> > > > that return such pointers and make them available to e.g. TC progs, if the fix
> > > > in patch 2 is acceptable?
> > > >
> > > >   [0]: https://github.com/kkdwivedi/linux/commits/fixes-bpf-next
> > > >
> > >
> > > Looking at BPF CI [1], it seems it surfaces another problem I was seeing locally
> > > but couldn't craft a reliable test case for, that it forms a non-NULL but
> > > invalid pointer using pointer walking, in some cases RCU read lock provides
> > > protection for those cases, but not all (esp. if kernel doesn't clear the old
> > > pointer that was in use before, and has it sitting in some location). RDI (arg1)
> > > seems to be pointing somewhere behind the faulting address, which means the
> > > &f->f_path is bad.
> > >
> > > But this requires a larger discussion.
> > >
> > > In that case, PAGE_SIZE thing won't help. We may have to introduce a PTR_BPF_REF
> > > flag (e.g. set for ctx args of tracing progs, set based on BTF tagging in other
> > > places) which tells the verifier that the pointer for the duration of program
> > > will be valid, so it can be passed into helpers.
> > >
> > > So for cases like &f->f_path it will work, since file + off will still have
> > > PTR_BPF_REF set in reg state. In case of pointer walking, where dst_reg state
> > > is updated on walk, we may have to explicitly tag members where PTR_BPF_REF can
> > > be inherited if parent object has PTR_BPF_REF (i.e. ref to parent implies ref to
> > > child cases).
> > >
> > > Something like:
> > >
> > > struct foo {
> > >     ...
> > >     struct bar __bpf_ref *p;
> > >     struct baz *q;
> > >     ...
> > > }
> > >
> > > ... then if getting:
> > >
> > >     struct foo *f = ...; // PTR_TO_BTF_ID | PTR_BPF_REF
> > >     struct bar *p = f->p; // Inherits PTR_BPF_REF
> > >     struct baz *q = f->q; // Does not inherit PTR_BPF_REF
> > >
> > > Thoughts?
> > >
> > >   [1]: https://github.com/kernel-patches/bpf/runs/5258413028?check_suite_focus=true
> >
> > fd_array wasn't zero initialized at alloc time, so it contains garbage.
> > fd_array[63] read that garbage.
> > So patches 2 and 3 don't help.
> > The 'fixes' list for patch 3 is ridiculous. No need.
> > Pls drop patch 2 and in instead of
> > +#define bpf_ptr_is_invalid(p) (unlikely((unsigned long)(p) < PAGE_SIZE))
> > do static inline function that checks
> > if ((unsigned long)p < user_addr_max())
>
> This prevents this specific case, but what happens when PTR_TO_BTF_ID can be
> formed to an already freed object (which seems even more likely to me in
> sleepable progs, because typical RCU grace period won't wait for us)? Or even
> just reading from structures which don't clear pointers they have freed, but are
> themselves not freed (so exception handling zeroing dst reg won't kick in).

Right. The above check would prevent a class of issues, but not all of them.
If kernel code doesn't clear the pointer inside a struct after freeing it
it would lead to uaf.
Thankfully most of the code is rcu and typical rcu protected pointer
usage would replace the pointer before doing call_rcu() to clean it.
There are corner cases and we need to gradually close such holes.
My point is that it's not a bpf tree material. No need to
create panic and do a patch with gadzillion 'fixes' tags.
With bpf progs calling kfuncs there is a possibility of a crash.
We need to eliminate such possibilities when we can.
But we're not going to backport hundred patches to old kernels.
We will close one hole today and a year later another hole could be found.
We'll close it too, but we are not going to backport a year worth of
verifier patches.

> > and bails out.
> > bpf-next is fine.
> > Not sure whether patch 1 is strictly necessary after above change.
>
> It is still needed to prevent the var_off case.

Right. var_off should be prevented.