Re: [PATCH bpf-next v1 7/9] bpf: Make per_cpu_ptr return rdonly PTR_TO_MEM.

Hao Luo <haoluo@xxxxxxxxxx> · Fri, 10 Dec 2021 10:36:46 -0800

On Fri, Dec 10, 2021 at 9:42 AM Andrii Nakryiko
<andrii.nakryiko@xxxxxxxxx> wrote:
>
> On Tue, Dec 7, 2021 at 7:54 PM Hao Luo <haoluo@xxxxxxxxxx> wrote:
> >
> > On Mon, Dec 6, 2021 at 10:18 PM Andrii Nakryiko
> > <andrii.nakryiko@xxxxxxxxx> wrote:
> > >
> > > On Mon, Dec 6, 2021 at 3:22 PM Hao Luo <haoluo@xxxxxxxxxx> wrote:
> > > >
> > > > Tag the return type of {per, this}_cpu_ptr with MEM_RDONLY. The
> > > > returned value of this pair of helpers is a kernel object, which
> > > > cannot be updated by bpf programs. Previously these two helpers
> > > > returned PTR_TO_MEM for kernel objects of scalar type, which allowed
> > > > one to directly modify the memory. Now with MEM_RDONLY tagging,
> > > > the verifier will reject programs that write into MEM_RDONLY memory.
> > > >
> > > > Fixes: 63d9b80dcf2c ("bpf: Introduce bpf_this_cpu_ptr()")
>
> BTW, our tooling complained about this one because the subject of the
> referenced commit actually has a typo: "bpf: Introducte bpf_this_cpu_ptr()".
> Please fix this as well (that is, re-introduce the typo :) )
>

Ah, yes, thanks for the notice :). I did see that typo after sending
out this version. I have already fixed it in my local repo.

> > > > Fixes: eaa6bcb71ef6 ("bpf: Introduce bpf_per_cpu_ptr()")
> > > > Fixes: 4976b718c355 ("bpf: Introduce pseudo_btf_id")
> > > > Signed-off-by: Hao Luo <haoluo@xxxxxxxxxx>
> > > > ---
> > > >  kernel/bpf/helpers.c  |  4 ++--
> > > >  kernel/bpf/verifier.c | 33 ++++++++++++++++++++++++++++-----
> > > >  2 files changed, 30 insertions(+), 7 deletions(-)
> > > >
> > > > diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
> > > > index 293d9314ec7f..a5e349c9d3e3 100644
> > > > --- a/kernel/bpf/helpers.c
> > > > +++ b/kernel/bpf/helpers.c
> > > > @@ -667,7 +667,7 @@ BPF_CALL_2(bpf_per_cpu_ptr, const void *, ptr, u32, cpu)
> > > >  const struct bpf_func_proto bpf_per_cpu_ptr_proto = {
> > > >         .func           = bpf_per_cpu_ptr,
> > > >         .gpl_only       = false,
> > > > -       .ret_type       = RET_PTR_TO_MEM_OR_BTF_ID | PTR_MAYBE_NULL,
> > > > +       .ret_type       = RET_PTR_TO_MEM_OR_BTF_ID | PTR_MAYBE_NULL | MEM_RDONLY,
> > > >         .arg1_type      = ARG_PTR_TO_PERCPU_BTF_ID,
> > > >         .arg2_type      = ARG_ANYTHING,
> > > >  };
> > > > @@ -680,7 +680,7 @@ BPF_CALL_1(bpf_this_cpu_ptr, const void *, percpu_ptr)
> > > >  const struct bpf_func_proto bpf_this_cpu_ptr_proto = {
> > > >         .func           = bpf_this_cpu_ptr,
> > > >         .gpl_only       = false,
> > > > -       .ret_type       = RET_PTR_TO_MEM_OR_BTF_ID,
> > > > +       .ret_type       = RET_PTR_TO_MEM_OR_BTF_ID | MEM_RDONLY,
> > > >         .arg1_type      = ARG_PTR_TO_PERCPU_BTF_ID,
> > > >  };
> > > >
> > > > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> > > > index f8b804918c35..44af65f07a82 100644
> > > > --- a/kernel/bpf/verifier.c
> > > > +++ b/kernel/bpf/verifier.c
> > > > @@ -4296,16 +4296,32 @@ static int check_mem_access(struct bpf_verifier_env *env, int insn_idx, u32 regn
> > > >                                 mark_reg_unknown(env, regs, value_regno);
> > > >                         }
> > > >                 }
> > > > -       } else if (reg->type == PTR_TO_MEM) {
> > > > +       } else if (base_type(reg->type) == PTR_TO_MEM) {
> > > > +               bool rdonly_mem = type_is_rdonly_mem(reg->type);
> > > > +
> > > > +               if (type_may_be_null(reg->type)) {
> > > > +                       verbose(env, "R%d invalid mem access '%s'\n", regno,
> > > > +                               reg_type_str(reg->type));
> > >
> > > See, here you'll get "invalid mem access 'ptr_to_mem'" while it's
> > > actually ptr_to_mem_or_null. As if verifier logs weren't hard enough
> > > to follow, now they will also be misleading.
> > >
> >
> > I think formatting the string inside reg_type_str() can solve this
> > problem while preserving the previous behavior. I'll try that in v2.
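
To make the reg_type_str() idea concrete, here is a rough standalone
sketch of flag-aware formatting. It is not the verifier code: the enum
values, the bit layout, the "rdonly_" prefix and the caller-provided
buffer are all assumptions made only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-ins; the real verifier definitions differ. */
#define BASE_TYPE_BITS	8
#define BASE_TYPE_MASK	((1u << BASE_TYPE_BITS) - 1)

enum { SCALAR_VALUE = 1, PTR_TO_MEM = 2 };	/* base types */
enum {						/* type flags */
	PTR_MAYBE_NULL	= 1 << (0 + BASE_TYPE_BITS),
	MEM_RDONLY	= 1 << (1 + BASE_TYPE_BITS),
};

static unsigned int base_type(unsigned int type)
{
	return type & BASE_TYPE_MASK;
}

/* Format "<prefix><base name><suffix>" into a caller-provided buffer. */
static const char *reg_type_str(char *buf, size_t len, unsigned int type)
{
	static const char * const names[] = {
		[SCALAR_VALUE]	= "scalar",
		[PTR_TO_MEM]	= "ptr_to_mem",
	};
	unsigned int base = base_type(type);
	const char *name = "unknown";

	assert(buf && len > 0);
	if (base < sizeof(names) / sizeof(names[0]) && names[base])
		name = names[base];

	/* Flags decorate the base name instead of being dropped. */
	snprintf(buf, len, "%s%s%s",
		 (type & MEM_RDONLY) ? "rdonly_" : "",
		 name,
		 (type & PTR_MAYBE_NULL) ? "_or_null" : "");
	return buf;
}
```

With something along these lines, a PTR_TO_MEM | PTR_MAYBE_NULL register
would print as "ptr_to_mem_or_null", so the log message keeps its
previous wording.
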
> >
> > > > +                       return -EACCES;
> > > > +               }
> > > > +
> > > > +               if (t == BPF_WRITE && rdonly_mem) {
> > > > +                       verbose(env, "R%d cannot write into rdonly %s\n",
> > > > +                               regno, reg_type_str(reg->type));
> > > > +                       return -EACCES;
> > > > +               }
> > > > +
> > > >                 if (t == BPF_WRITE && value_regno >= 0 &&
> > > >                     is_pointer_value(env, value_regno)) {
> > > >                         verbose(env, "R%d leaks addr into mem\n", value_regno);
> > > >                         return -EACCES;
> > > >                 }
> > > > +
> > > >                 err = check_mem_region_access(env, regno, off, size,
> > > >                                               reg->mem_size, false);
> > > > -               if (!err && t == BPF_READ && value_regno >= 0)
> > > > -                       mark_reg_unknown(env, regs, value_regno);
> > > > +               if (!err && value_regno >= 0)
> > > > +                       if (t == BPF_READ || rdonly_mem)
> > >
> > > why two nested ifs for one condition?
> > >
> >
> > No particular reason. I think it helped me understand the logic
> > better. But I'm fine with combining them into one 'if'.
>
> Personally, I find two nested ifs way harder to follow, as they imply
> that there is some other sub-condition, while in reality it's one
> longer condition.
>
>
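
For reference, the single-condition form of that check could look
roughly like the sketch below. It is a standalone illustration, not the
verifier code: the two predicate functions are hypothetical helpers that
only model when mark_reg_unknown() would be reached, and the enum is a
local stand-in.

```c
#include <stdbool.h>

/* Local stand-in for the verifier's access type. */
enum bpf_access_type { BPF_READ = 1, BPF_WRITE = 2 };

/* Nested form, as in this version of the patch. */
static bool should_mark_unknown_nested(int err, int value_regno,
				       enum bpf_access_type t,
				       bool rdonly_mem)
{
	if (!err && value_regno >= 0)
		if (t == BPF_READ || rdonly_mem)
			return true;
	return false;
}

/* Single condition, equivalent to the nested form above. */
static bool should_mark_unknown(int err, int value_regno,
				enum bpf_access_type t, bool rdonly_mem)
{
	return !err && value_regno >= 0 &&
	       (t == BPF_READ || rdonly_mem);
}
```

Both forms agree for every combination of inputs, so merging the two
'if's does not change behavior.
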
> >
> > > > +                               mark_reg_unknown(env, regs, value_regno);
> > > >         } else if (reg->type == PTR_TO_CTX) {
> > > >                 enum bpf_reg_type reg_type = SCALAR_VALUE;
> > > >                 struct btf *btf = NULL;
> > >
> > > [...]