Re: [PATCH v3 bpf-next 00/15] bpf: BPF specific memory allocator, UAPI in particular

Kumar Kartikeya Dwivedi <memxor@xxxxxxxxx> · Tue, 30 Aug 2022 07:02:52 +0200

On Tue, 30 Aug 2022 at 05:35, Alexei Starovoitov
<alexei.starovoitov@xxxxxxxxx> wrote:
>
> On Mon, Aug 29, 2022 at 6:40 PM Delyan Kratunov <delyank@xxxxxx> wrote:
> >
> > Yes, if we populate used_allocators on load and copy them to inner maps, this might
> > work. It requires the most conservative approach where loading and unloading a
> > program in a loop would add its allocators to the visible pinned maps, accumulating
> > allocators we can't release until the map is gone.
> >
> > However, I thought you wanted to walk the values instead to prevent this abuse. At
> > least that's the understanding I was operating under.
> >
> > If a program has the max number of possible allocators and we just load/unload it in
> > a loop, with a visible pinned map, the used_allocators list of that map can easily
> > skyrocket.
>
> Right. That will be an issue if we don't trim the list
> at prog unload.
>
> > > Sorry I confused myself and others with release_uref.
> > > I meant map_poke_untrack-like call.
> > > When we drop refs from used maps in __bpf_free_used_maps
> > > we walk all elements.
> > > Similar idea here.
> > > When prog is unloaded it cleans up all objects it allocated
> > > and stored into maps before dropping refcnt-s
> > > in prog->used_allocators.
> >
> > For an allocator that's visible from multiple programs (say, it's in a map-of-maps),
> > how would we know which values were allocated by which program? Do we forbid shared
> > allocators outright?
>
> Hopefully we don't need to forbid shared allocators and
> allow map-in-map to contain kptr local.
>
> > Separately, I think I just won't allow allocators as inner maps, that's for another
> > day too (though it may work just fine).
> >
> > Perfect, enemy of the good, something-something.
>
> Right, but if we can allow that with something simple
> it would be nice.
>
> After a lot of head scratching realized that
> walk-all-elems-and-kptr_xchg approach doesn't work,
> because prog_A and prog_B might share a map and when
> prog_A is unloaded and trying to do kptr_xchg on all elems
> the prog_B might kptr_xchg as well and walk_all loop
> will miss kptrs.

Agreed, I can't see it working either.

> prog->used_allocators[] approach is broken too.
> Since the prog_B (in the example above) might see
> objects that were allocated from prog_A's allocators.
> prog->used_allocators at load time is incorrect.
>
> To prevent all of these issues how about we
> restrict kptr local to contain a pointer only
> from one allocator.
> When parsing map's BTF if there is only one kptr local
> in the map value the equivalent of map->used_allocators[]
> will guarantee to contain only one allocator.
> Two kptr locals in the map value -> potentially two allocators.
>
> So here is new proposal:
>

Thanks for the proposal, Alexei. I think we're getting close to a
solution, but still some comments below.

> At load time the verifier walks all kptr_xchg(map_value, obj)
> and adds obj's allocator to
> map->used_allocators <- {kptr_offset, allocator};
> If kptr_offset already exists -> failure to load.
> Allocator can probably be a part of struct bpf_map_value_off_desc.
>
> In other words the pairs of {kptr_offset, allocator}
> say 'there could be an object from that allocator in
> that kptr in some map values'.
>
> Do nothing at prog unload.
>
> At map free time walk all elements and free kptrs.
> Finally drop allocator refcnts.
>

Yes, this should be possible.
It's quite easy to capture the map_ptr for the allocated local kptr.
Limiting each local kptr to one allocator is probably fine, at least for a v1.

One problem I see is how it works when the allocator map is an inner map.
Then, it is not possible to find the backing allocator instance at
verification time, hence not possible to take the reference to it in
map->used_allocators.
But let's just assume that is disallowed for now.

The other problem I see is that when the program just does
kptr_xchg(map_value, NULL), we may not have allocator info from
kptr_offset at that moment. Allocating prog which fills
used_allocators may be verified later. We _can_ reject this, but it
makes everything fragile (dependent on which order you load programs
in), which won't be great. You can then use this lost info to make
kptr disjoint from allocator lifetime.

Let me explain through an example.

Consider this order to set up the programs:
One allocator map A.
Two hashmaps M1, M2.
Three programs P1, P2, P3.

P1 uses M1, M2.
P2 uses A, M1.
P3 uses M2.

Sequence:
map_create A, M1, M2.

Load P1, uses M1, M2. What this P1 does is:
p = kptr_xchg(M1.value, NULL);
kptr_xchg(M2.value, p);

So it moves the kptr in M1 into M2. The problem is at this point
kptr_offset is not populated, so we cannot fill used_allocators of M2
as we cannot track which allocator is used to fill M1.value. We saw
nothing filling it yet.

Next, load P3. It does:
p = kptr_xchg(M2.value, NULL);
unit_free(p); // let's assume p has bpf_mem_alloc ptr behind itself so
this is ok if allocator is alive.

Again, M2.used_allocators is empty. Nothing is filled into it.

Next, load P2.
p = alloc(&A, ...);
kptr_xchg(M1.value, p);

Now, M1.used_allocators is filled with allocator ref and kptr_offset.
But M2.used_allocators will remain unfilled.

Now, run programs in sequence of P2, then P1. This will allocate from
A, and move the ref to M1, then to M2. But only P1 and P2 have
references to M1 so it keeps the allocator alive. However, now unload
both P1 and P2.
P1, P2, A, allocator of A, M1 all can be freed after RCU gp wait. M2
is still held by loaded P3.

Now, M2.used_allocators is empty. P3 is using it, and it is holding
allocation from allocator A. Both M1 and A are freed.
When P3 runs now, it can kptr_xchg and try to free it, and the same
uaf happens again.
If not that, uaf when M2 is freed and it does unit_free on the alive local kptr.

--

Will this case be covered by your approach? Did I miss something?

The main issue is that this allocator info can be lost depending on
how you verify a set of programs. It would not be lost if we verified
in order P2, P1, P3 instead of the current P1, P3, P2.

So we might have to teach the verifier to identify kptr_xchg edges
between maps, and propagate any used_allocators to the other map? But
it's becoming too complicated.

You _can_ reject loads of programs when you don't find kptr_offset
populated on seeing kptr_xchg(..., NULL), but I don't think this is
practical either. It makes the things sensitive to program
verification order, which would be confusing for users.