Re: [PATCH v3 bpf-next 00/15] bpf: BPF specific memory allocator, UAPI in particular

Alexei Starovoitov <alexei.starovoitov@xxxxxxxxx> · Mon, 29 Aug 2022 20:34:51 -0700

On Mon, Aug 29, 2022 at 6:40 PM Delyan Kratunov <delyank@xxxxxx> wrote:
>
> Yes, if we populate used_allocators on load and copy them to inner maps, this might
> work. It requires the most conservative approach where loading and unloading a
> program in a loop would add its allocators to the visible pinned maps, accumulating
> allocators we can't release until the map is gone.
>
> However, I thought you wanted to walk the values instead to prevent this abuse. At
> least that's the understanding I was operating under.
>
> If a program has the max number of possible allocators and we just load/unload it in
> a loop, with a visible pinned map, the used_allocators list of that map can easily
> skyrocket.

Right. That will be an issue if we don't trim the list
at prog unload.

> > Sorry I confused myself and others with release_uref.
> > I meant map_poke_untrack-like call.
> > When we drop refs from used maps in __bpf_free_used_maps
> > we walk all elements.
> > Similar idea here.
> > When prog is unloaded it cleans up all objects it allocated
> > and stored into maps before dropping refcnt-s
> > in prog->used_allocators.
>
> For an allocator that's visible from multiple programs (say, it's in a map-of-maps),
> how would we know which values were allocated by which program? Do we forbid shared
> allocators outright?

Hopefully we don't need to forbid shared allocators and
allow map-in-map to contain kptr local.

> Separately, I think I just won't allow allocators as inner maps, that's for another
> day too (though it may work just fine).
>
> Perfect, enemy of the good, something-something.

Right, but if we can allow that with something simple
it would be nice.

After a lot of head scratching realized that
walk-all-elems-and-kptr_xchg approach doesn't work,
because prog_A and prog_B might share a map and when
prog_A is unloaded and trying to do kptr_xchg on all elems
the prog_B might kptr_xchg as well and walk_all loop
will miss kptrs.
prog->used_allocators[] approach is broken too.
Since the prog_B (in the example above) might see
objects that were allocated from prog_A's allocators.
prog->used_allocators at load time is incorrect.

To prevent all of these issues how about we
restrict kptr local to contain a pointer only
from one allocator.
When parsing map's BTF if there is only one kptr local
in the map value the equivalent of map->used_allocators[]
will guarantee to contain only one allocator.
Two kptr locals in the map value -> potentially two allocators.

So here is new proposal:

At load time the verifier walks all kptr_xchg(map_value, obj)
and adds obj's allocator to
map->used_allocators <- {kptr_offset, allocator};
If kptr_offset already exists -> failure to load.
Allocator can probably be a part of struct bpf_map_value_off_desc.

In other words the pairs of {kptr_offset, allocator}
say 'there could be an object from that allocator in
that kptr in some map values'.

Do nothing at prog unload.

At map free time walk all elements and free kptrs.
Finally drop allocator refcnts.

This approach allows sharing of allocators.
kptr local in map-in-map also should be fine.
If not we have a problem with bpf_map_value_off_desc
and map-in-map then.

The prog doesn't need to have a special used_allocator list,
since if bpf prog doesn't do kptr_xchg all allocated
objects will be freed during prog execution.
Instead since allocator is a different type of map it
should go into existing used_maps[] to make sure
we don't free allocator when prog is executing.

Maybe with this approach we won't even need to
hide the allocator pointer into the first 8 bytes.
For all pointers returned from kptr_xchg the verifier
will know which allocator is supposed to be used for freeing.

Thoughts?