On Thu, Feb 17, 2022 at 01:19:56AM IST, Alexei Starovoitov wrote:
> On Wed, Feb 16, 2022 at 11:59:54PM +0530, Kumar Kartikeya Dwivedi wrote:
> > > > > Just use reg2btf_ids[base_type(reg->type)] instead?
> > > >
> > > > That would be incorrect I think, then we'd allow e.g. PTR_TO_TCP_SOCK_OR_NULL
> > > > and treat it as PTR_TO_TCP_SOCK, while current code only wants to permit
> > > > non-NULL variants for these three.
> > > add && !type_flag(reg->type) ?
> >
> > Ok, we can do that. WRT Hao's suggestion, we do allow NULL for ptr_to_mem_ok
> > case, so doing it for all of check_func_arg_match won't work.
>
> Right.
>
> > > But, first, please describe how you found it.
> > > Tried to pass PTR_TO_BTF_ID_OR_NULL into kfunc?
> > > Do you have some other changes in the verifier?
> >
> > Yes, was working on [0], tried to come up with a test case where verifier
> > printed bad register type being passed (one marked with a new flag), but noticed
> > that it would fall out of __BPF_REG_TYPE_MAX limit during kfunc check. Also, it
> > seems on non-KASAN build it actually doesn't crash sometimes, depends on what
> > the value at that offset is.
> >
> > [0]: https://github.com/kkdwivedi/linux/commits/btf-ptr-in-map
>
> Nice. Thanks a ton. I did a quick look. Seems to be ready to post on the list?
> Can you add .c tests and post?

Thanks for the quick review! Yes, I should be able to post by this weekend if I
don't run into any other problems/bugs in my code.

> Only one suggestion so far:
> Could you get rid of callback in btf_find_struct_field ?
> The callbacks obscure the control flow and make the code harder to follow.
> Do refactor btf_find_struct_field, but call btf_find_ptr_to_btf_id directly from it.
> Then you wouldn't need this ugly part:
> /* While callback cleans up after itself, later iterations can still
>  * return error without calling cb, so call free function again.
>  */
> if (ret < 0)
>         bpf_map_free_ptr_off_tab(map);

Ok, makes sense, will change.
> I've been thinking about the api for refcnted PTR_TO_BTF_ID in the map too.
> Regarding your issue with cmpxchg. I've come up with two helpers:
>
> BPF_CALL_3(bpf_kptr_try_set, void **, kptr, void *, ptr, int, refcnt_off)
> {
>         /* ptr is ptr_to_btf_id returned from bpf_*_lookup() with ptr->refcnt >= 1
>          * refcount_inc() has to be done before cmpxchg() because
>          * another cpu might do bpf_kptr_xchg+release.
>          */
>         refcount_inc((refcount_t *)(ptr + refcnt_off));
>         if (cmpxchg(kptr, NULL, ptr)) {
>                 /* refcnt >= 2 here */
>                 refcount_dec((refcount_t *)(ptr + refcnt_off));
>                 return -EBUSY;
>         }
>         return 0;
> }
>
> and
>
> BPF_CALL_2(bpf_kptr_get, void **, kptr, int, refcnt_off)
> {
>         void *ptr = READ_ONCE(kptr);
>
>         if (!ptr)
>                 return 0;
>         /* ptr->refcnt could be == 0 if another cpu did
>          * ptr2 = bpf_kptr_xchg();
>          * bpf_*_release(ptr2);
>          */
>         if (!refcount_inc_not_zero((refcount_t *)(ptr + refcnt_off)))
>                 return 0;
>         return (long) ptr;
> }

Good point. I did think about this problem. Right now, when you load a
referenced pointer from the BPF map, we mark it as PTR_UNTRUSTED, so you can
dereference it but not pass it into a helper like this. My thinking was that
the BPF user will already have an additional check (e.g. the pointer being
reachable from a hash table key) before accessing it, and that the xchg can
happen after the other program has removed the value's visibility globally (by
deleting it from the table). For RCU-protected objects, dereferencing
PTR_UNTRUSTED still works and reads valid data, and they can also skip reading
by observing ptr->refcnt == 0 for the dying case. So on the fast path, as long
as you don't need to pass the BTF ID pointer into helpers, you don't have to
increment the refcount.

As you note, this is not enough: e.g. we might want to pass a referenced
pointer into helpers without removing it from the map.
In that case such a bpf_kptr_get helper makes sense, but I think:

1) It needs to be inlined, because refcnt_off will only be easily available at
   verification time (from BTF).
2) We cannot guarantee that it keeps working for the same object across kernel
   versions; that ties the kernel into a stability guarantee regarding internal
   changes.
3) We can only support it for a subset of objects (those that have RCU
   protection), which we can ideally detect from BTF itself, by marking them as
   such, or using a flag during destructor registration (which is done for the
   ref'd pointer case).

>
> and instead of recognizing xchg insn directly in the verifier
> I went with the helper:
> BPF_CALL_2(bpf_kptr_xchg, void **, kptr, void *, ptr)
> {
>         /* ptr is ptr_to_btf_id returned from bpf_*_lookup() with ptr->refcnt >= 1
>          * or ptr == NULL.
>          * returns ptr_to_btf_id with refcnt >= 1 or NULL
>          */
>         return (long) xchg(kptr, ptr);
> }
> It was easier to deal with.
> I don't have a strong preference for helper vs insn though.
> Let's brainstorm after you post patches.
>
> xchg or bpf_kptr_xchg only operation seems to be too limiting.

Right, one other use case Toke and I were exploring is using this support to
enable XDP queueing. We can have a normal PIFO map and allow embedding
xdp_frame in it. Then xchg is used during enqueue and dequeue from the map, and
the user can have additional stuff in the map value (like embedding both
bpf_timer and xdp_frame, or extra scalars as data), which should allow more
composition/flexibility.

To also reduce the overhead of this xchg on the fast path, we can consider
relaxing the atomic restriction for per-CPU maps instead, then add a
bpf_move_ptr helper that is non-atomic, but ensures the invariant that the
value in the per-CPU map is either refcounted or NULL, and only in the program
or the map at any point in time. Then the overhead of moving object ownership
is also eliminated, and the helper itself can be inlined to a load from the map
and a store to the map (of the new pointer or NULL).