Re: [PATCH v13 bpf-next 09/10] bpf: Add bpf_dynptr_slice and bpf_dynptr_slice_rdwr

Alexei Starovoitov <alexei.starovoitov@xxxxxxxxx> · Mon, 6 Mar 2023 18:23:25 -0800

On Sun, Mar 5, 2023 at 11:10 PM Kumar Kartikeya Dwivedi
<memxor@xxxxxxxxx> wrote:
>
> On Wed, Mar 01, 2023 at 04:49:52PM CET, Joanne Koong wrote:
> > Two new kfuncs are added, bpf_dynptr_slice and bpf_dynptr_slice_rdwr.
> > The user must pass in a buffer to store the contents of the data slice
> > if a direct pointer to the data cannot be obtained.
> >
> > For skb and xdp type dynptrs, these two APIs are the only way to obtain
> > a data slice. However, for other types of dynptrs, there is no
> > difference between bpf_dynptr_slice(_rdwr) and bpf_dynptr_data.
> >
> > For skb type dynptrs, the data is copied into the user provided buffer
> > if any of the data is not in the linear portion of the skb. For xdp type
> > dynptrs, the data is copied into the user provided buffer if the data is
> > between xdp frags.
> >
> > If the skb is cloned and a call to bpf_dynptr_data_rdwr is made, then
> > the skb will be uncloned (see bpf_unclone_prologue()).
> >
> > Please note that any bpf_dynptr_write() automatically invalidates any prior
> > data slices of the skb dynptr. This is because the skb may be cloned or
> > may need to pull its paged buffer into the head. As such, any
> > bpf_dynptr_write() will automatically have its prior data slices
> > invalidated, even if the write is to data in the skb head of an uncloned
> > skb. Please note as well that any other helper calls that change the
> > underlying packet buffer (eg bpf_skb_pull_data()) invalidates any data
> > slices of the skb dynptr as well, for the same reasons.
> >
> > Signed-off-by: Joanne Koong <joannelkoong@xxxxxxxxx>
> > ---
>
> Sorry for chiming in late.
>
> I see one potential hole in bpf_dynptr_slice_rdwr. If the returned pointer is
> actually pointing to the stack (but verified as a PTR_TO_MEM in verifier state),
> we won't reflect changes to the stack state in the verifier for writes happening
> through it.
>
> For the worst case scenario, this will basically allow overwriting values of
> spilled pointers and doing arbitrary kernel memory reads/writes. This is only an
> issue when bpf_dynptr_slice_rdwr at runtime returns a pointer to the supplied
> buffer residing on program stack. To verify, by forcing the memcpy to buffer for
> skb_header_pointer I was able to make it dereference a garbage value for
> l4lb_all selftest.
>
> --- a/kernel/bpf/helpers.c
> +++ b/kernel/bpf/helpers.c
> @@ -2253,7 +2253,13 @@ __bpf_kfunc void *bpf_dynptr_slice(const struct bpf_dynptr_kern *ptr, u32 offset
>         case BPF_DYNPTR_TYPE_RINGBUF:
>                 return ptr->data + ptr->offset + offset;
>         case BPF_DYNPTR_TYPE_SKB:
> -               return skb_header_pointer(ptr->data, ptr->offset + offset, len, buffer);
> +       {
> +               void *p = skb_header_pointer(ptr->data, ptr->offset + offset, len, buffer);
> +               if (p == buffer)
> +                       return p;
> +               memcpy(buffer, p, len);
> +               return buffer;
> +       }
>
> --- a/tools/testing/selftests/bpf/progs/test_l4lb_noinline_dynptr.c
> +++ b/tools/testing/selftests/bpf/progs/test_l4lb_noinline_dynptr.c
> @@ -470,7 +470,10 @@ int balancer_ingress(struct __sk_buff *ctx)
>         eth = bpf_dynptr_slice_rdwr(&ptr, 0, buffer, sizeof(buffer));
>         if (!eth)
>                 return TC_ACT_SHOT;
> -       eth_proto = eth->eth_proto;
> +       *(void **)buffer = ctx;

Great catch.
To fix the issue I think we should simply disallow such
stack abuse. The compiler won't be spilling registers
into C array on the stack.
This manual spill/fill is exploiting verifier logic.
After bpf_dynptr_slice_rdwr() we can mark all slots of the
buffer as STACK_POISON or some better name and
reject spill into such slots.

> +       *(void **)eth = (void *)0xdeadbeef;
> +       ctx = *(void **)buffer;
> +       eth_proto = eth->eth_proto + ctx->len;
>         if (eth_proto == bpf_htons(ETH_P_IP))
>                 err = process_packet(&ptr, eth, nh_off, false, ctx);
>
> I think the proper fix is to treat it as a separate return type distinct from
> PTR_TO_MEM like PTR_TO_MEM_OR_PKT (or handle PTR_TO_MEM | DYNPTR_* specially),
> fork verifier state whenever there is a write, so that one path verifies it as
> PTR_TO_PACKET, while another as PTR_TO_STACK (if buffer was a stack ptr). I
> think for the rest it's not a problem, but there are allow_ptr_leak checks
> applied to PTR_TO_STACK and PTR_TO_MAP_VALUE, so that needs to be rechecked.
> Then we ensure that program is safe in either path.
>
> Also we need to fix regsafe to not consider other PTR_TO_MEMs equivalent to such
> a pointer. We could also fork verifier states on return, to verify either path
> separately right from the point following the call instruction.

This is too complex imo.