Re: [PATCH bpf-next v1 3/7] bpf: Add bpf_dynptr_from_mem, bpf_malloc, bpf_free

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, Apr 8, 2022 at 4:37 PM Joanne Koong <joannelkoong@xxxxxxxxx> wrote:
>
> On Fri, Apr 8, 2022 at 3:46 PM Andrii Nakryiko
> <andrii.nakryiko@xxxxxxxxx> wrote:
> >
> > On Fri, Apr 8, 2022 at 3:04 PM Joanne Koong <joannelkoong@xxxxxxxxx> wrote:
> > >
> > > On Wed, Apr 6, 2022 at 3:23 PM Andrii Nakryiko
> > > <andrii.nakryiko@xxxxxxxxx> wrote:
> > > >
> > > > On Fri, Apr 1, 2022 at 7:00 PM Joanne Koong <joannekoong@xxxxxx> wrote:
> > > > >
> > > > > From: Joanne Koong <joannelkoong@xxxxxxxxx>
> > > > >
> > > > > This patch adds 3 new APIs and the bulk of the verifier work for
> > > > > supporting dynamic pointers in bpf.
> > > > >
> > > > > There are different types of dynptrs. This patch starts with the most
> > > > > basic ones, ones that reference a program's local memory
> > > > > (eg a stack variable) and ones that reference memory that is dynamically
> > > > > allocated on behalf of the program. If the memory is dynamically
> > > > > allocated by the program, the program *must* free it before the program
> > > > > exits. This is enforced by the verifier.
> > > > >
> > > > > The added APIs are:
> > > > >
> > > > > long bpf_dynptr_from_mem(void *data, u32 size, struct bpf_dynptr *ptr);
> > > > > long bpf_malloc(u32 size, struct bpf_dynptr *ptr);
> > > > > void bpf_free(struct bpf_dynptr *ptr);
> > > > >
> > > > > This patch sets up the verifier to support dynptrs. Dynptrs will always
> > > > > reside on the program's stack frame. As such, their state is tracked
> > > > > in their corresponding stack slot, which includes the type of dynptr
> > > > > (DYNPTR_LOCAL vs. DYNPTR_MALLOC).
> > > > >
> > > > > When the program passes in an uninitialized dynptr (ARG_PTR_TO_DYNPTR |
> > > > > MEM_UNINIT), the stack slots corresponding to the frame pointer
> > > > > where the dynptr resides at is marked as STACK_DYNPTR. For helper functions
> > > > > that take in iniitalized dynptrs (such as the next patch in this series
> > > > > which supports dynptr reads/writes), the verifier enforces that the
> > > > > dynptr has been initialized by checking that their corresponding stack
> > > > > slots have been marked as STACK_DYNPTR. Dynptr release functions
> > > > > (eg bpf_free) will clear the stack slots. The verifier enforces at program
> > > > > exit that there are no dynptr stack slots that need to be released.
> > > > >
> > > > > There are other constraints that are enforced by the verifier as
> > > > > well, such as that the dynptr cannot be written to directly by the bpf
> > > > > program or by non-dynptr helper functions. The last patch in this series
> > > > > contains tests that trigger different cases that the verifier needs to
> > > > > successfully reject.
> > > > >
> > > > > Signed-off-by: Joanne Koong <joannelkoong@xxxxxxxxx>
> > > > > ---
> > > > >  include/linux/bpf.h            |  74 ++++++++-
> > > > >  include/linux/bpf_verifier.h   |  18 +++
> > > > >  include/uapi/linux/bpf.h       |  40 +++++
> > > > >  kernel/bpf/helpers.c           |  88 +++++++++++
> > > > >  kernel/bpf/verifier.c          | 266 ++++++++++++++++++++++++++++++++-
> > > > >  scripts/bpf_doc.py             |   2 +
> > > > >  tools/include/uapi/linux/bpf.h |  40 +++++
> > > > >  7 files changed, 521 insertions(+), 7 deletions(-)
> > > > >

[...]

> > > > > + *     Description
> > > > > + *             Dynamically allocate memory of *size* bytes.
> > > > > + *
> > > > > + *             Every call to bpf_malloc must have a corresponding
> > > > > + *             bpf_free, regardless of whether the bpf_malloc
> > > > > + *             succeeded.
> > > > > + *
> > > > > + *             The maximum *size* supported is DYNPTR_MAX_SIZE.
> > > > > + *     Return
> > > > > + *             0 on success, -ENOMEM if there is not enough memory for the
> > > > > + *             allocation, -EINVAL if the size is 0 or exceeds DYNPTR_MAX_SIZE.
> > > > > + *
> > > > > + * void bpf_free(struct bpf_dynptr *ptr)
> > > >
> > > > thinking about the next patch set that will add storing this malloc
> > > > dynptr into the map, bpf_free() will be a lie, right? As it will only
> > > > decrement a refcnt, not necessarily free it, right? So maybe just
> > > > generic bpf_dynptr_put() or bpf_malloc_put() or something like that is
> > > > a bit more "truthful"?
> > > I like the simplicity of bpf_free(), but I can see how that might be
> > > confusing. What are your thoughts on "bpf_dynptr_free()"? Since when
> > > we get into dynptrs that are stored in maps vs. dynptrs stored
> > > locally, calling bpf_dynptr_free() frees (invalidates) your local
> > > dynptr even if it doesn't free the underlying memory if it still has
> > > valid refcounts on it? To me, "malloc" and "_free" go more intuitively
> > > together as a pair.
> >
> > Sounds good to me (though let's use _dynptr() as a suffix
> > consistently). I also just realized that maybe we should call
> > bpf_malloc() a bpf_malloc_dynptr() instead. I can see how we might
> > want to enable plain bpf_malloc() with statically known size (similar
> > to statically known bpf_ringbuf_reserve()) for temporary local malloc
> > with direct memory access? So bpf_malloc_dynptr() would be a
> > dynptr-enabled counterpart to fixed-sized bpf_malloc()? And then
> > bpf_free() will work with direct pointer returned from bpf_malloc(),
> > while bpf_free_dynptr() will work with dynptr returned from
> > bpf_malloc_dynptr().
> I see! What is the advantage of a plain bpf_malloc()? Is it that it's
> a more ergonomic API (you get back a direct pointer to the data
> instead of getting back a dynptr and then having to call
> bpf_dynptr_data to get direct access) and you don't have to allocate
> extra bytes for refcounting?
>

That, but also I was thinking it would be good to have a simple way
for temporary (active for the duration of a single BPF program run)
buffer, instead of having to rely on per-CPU array, that would work
even in the presence of CPU migrations (per-cpu array won't) for
sleepable BPF programs. But then I recalled that we disable migrations
for sleepable, so there are not many real advantages of such form of
malloc/free. So please disregard.

> I will rename this to bpf_malloc_dynptr() and bpf_free_dynptr().
> >
> > > >
> > > > > + *     Description
> > > > > + *             Free memory allocated by bpf_malloc.
> > > > > + *
> > > > > + *             After this operation, *ptr* will be an invalidated dynptr.
> > > > > + *     Return
> > > > > + *             Void.
> > > > >   */

[...]

> > > > > @@ -5572,6 +5758,40 @@ static int check_func_arg(struct bpf_verifier_env *env, u32 arg,
> > > > >                 bool zero_size_allowed = (arg_type == ARG_CONST_SIZE_OR_ZERO);
> > > > >
> > > > >                 err = check_mem_size_reg(env, reg, regno, zero_size_allowed, meta);
> > > > > +       } else if (arg_type_is_dynptr(arg_type)) {
> > > > > +               bool initialized = check_dynptr_init(env, reg, arg_type);
> > > > > +
> > > > > +               if (type_is_uninit_mem(arg_type)) {
> > > > > +                       if (initialized) {
> > > > > +                               verbose(env, "Arg #%d dynptr cannot be an initialized dynptr\n",
> > > > > +                                       arg + 1);
> > > > > +                               return -EINVAL;
> > > > > +                       }
> > > > > +                       meta->raw_mode = true;
> > > > > +                       err = check_helper_mem_access(env, regno, BPF_DYNPTR_SIZE, false, meta);
> > > > > +                       /* For now, we do not allow dynptrs to point to existing
> > > > > +                        * refcounted memory
> > > > > +                        */
> > > > > +                       if (reg_type_may_be_refcounted_or_null(regs[BPF_REG_1].type)) {
> > > >
> > > > hard-coded BPF_REG_1?
> > >
> > > I'm viewing this as a temporary line because one of the patches in a
> > > later dynptr patchset will enable support for local dynptrs to point
> > > to existing refcounted memory. The alternative is to add a new
> > > bpf_type_flag like NO_REFCOUNT and then remove that flag later. What
> > > are your thoughts?
> >
> > my concern and confusion was that it's a hard-coded BPF_REG_1 instead
> > of using arg to derive register itself. Why making unnecessary
> > assumptions about this always being a first argument?
> I think otherwise we need to add a temporary bpf_type_flag that
> specifies that an arg cannot be refcounted, and then when we allow
> local dynptrs to point to refcounted memory later on, we'll need to
> remove this flag and the verifier checks associated with it. But
> overall, I agree with you that we should just add this bpf_type_flag
> to this patchset rather than using BPF_REG_1 as a temporary solution -
> I will make this change for v2!

Ok, I'll wait for v2 as I still can't understand this bit, tbh.

> >
> > > >
> > > > > +                               verbose(env, "Arg #%d dynptr memory cannot be potentially refcounted\n",
> > > > > +                                       arg + 1);
> > > > > +                               return -EINVAL;
> > > > > +                       }
> > > > > +               } else {
> > > > > +                       if (!initialized) {
> > > > > +                               char *err_extra = "";
> > > >

[...]



[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux