Re: [PATCH bpf-next v1 2/4] bpf: Prepare prog_test_struct kfuncs for runtime tests

On Wed, May 11, 2022 at 11:23:59PM IST, Alexei Starovoitov wrote:
> On Tue, May 10, 2022 at 11:01 PM Kumar Kartikeya Dwivedi
> <memxor@xxxxxxxxx> wrote:
> >
> > On Wed, May 11, 2022 at 10:07:35AM IST, Alexei Starovoitov wrote:
> > > On Tue, May 10, 2022 at 2:17 PM Kumar Kartikeya Dwivedi
> > > <memxor@xxxxxxxxx> wrote:
> > > >
> > > > In an effort to actually test the refcounting logic at runtime, add a
> > > > refcount_t member to prog_test_ref_kfunc and use it in selftests to
> > > > verify and test the whole logic more exhaustively.
> > > >
> > > > To ensure reading the count to verify it remains stable, make
> > > > prog_test_ref_kfunc a per-CPU variable, so that inside a BPF program the
> > > > count can be read reliably based on number of acquisitions made. Then,
> > > > pairing them with releases and reading from the global per-CPU variable
> > > > will allow verifying whether release operations put the refcount.
> > >
> > > The patches look good, but the per-cpu part is a puzzle.
> > > The test is not parallel. Everything looks sequential
> > > and there are no races.
> > > It seems to me if it was
> > > static struct prog_test_ref_kfunc prog_test_struct = {..};
> > > and none of [bpf_]this_cpu_ptr()
> > > the test would work the same way.
> > > What am I missing?
> >
> > You are not missing anything. It would work the same. I just made it per-CPU for
> > the off chance that someone runs ./test_progs -t map_kptr in parallel on the
> > same machine. Then one or both might fail, since count won't just be inc/dec by
> > us, and reading it would produce something other than what we expect.
>
> I see. You should have mentioned that in the commit log.
> But how per-cpu helps in this case?
> prog_run is executed with cpu=0, so both test_progs -t map_kptr
> will collide on the same cpu.

Right, I was thinking bpf_prog_run disabled preemption, which would have
prevented collisions, but it seems my knowledge is outdated: only migration is
disabled now. Also, just realised that we rely on observing a specific count
across test_run invocations, which isn't protected against parallel runs
anyway.

> At the end it's the same. one or both might fail?
>
> In general all serial_ tests in test_progs will fail in
> parallel run.
> Even non-serial tests might fail.
> The non-serial tests are ok for test_progs -j.
> They're parallel between themselves, but there are no guarantees
> that every individual test can be run parallel with itself.
> Majority will probably be fine, but not all.
>

I'll drop it and go with a global struct.
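Something like this, for instance (just a sketch; fields other than the new
cnt member are omitted, and REFCOUNT_INIT(1) is my assumption for the initial
count):

static struct prog_test_ref_kfunc prog_test_struct = {
	.cnt = REFCOUNT_INIT(1),
};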

> > One other benefit is getting non-ref PTR_TO_BTF_ID to prog_test_struct to
> > inspect cnt after releasing acquired pointer (using bpf_this_cpu_ptr), but that
> > can also be done by non-ref kfunc returning a pointer to it.
>
> Not following. non-ref == ptr_untrusted. That doesn't preclude

By non-ref PTR_TO_BTF_ID I meant normal (not untrusted) PTR_TO_BTF_ID with
ref_obj_id = 0.

bpf_this_cpu_ptr returns a normal PTR_TO_BTF_ID, not an untrusted one.
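So with the per-CPU variant, a program can do something like this (sketch;
assumes prog_test_struct is declared as a per-CPU ksym on the BPF side, as in
the existing ksym selftests):

extern struct prog_test_ref_kfunc prog_test_struct __ksym;

/* normal PTR_TO_BTF_ID with ref_obj_id=0, so loads are allowed */
p_cpu = bpf_this_cpu_ptr(&prog_test_struct);
cnt = p_cpu->cnt.refs.counter;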

> bpf prog from reading refcnt directly, but disallows passing
> into helpers.
> So with non-percpu the following hack
>  bpf_kfunc_call_test_release(p);
>  if (p_cpu->cnt.refs.counter ...)
> wouldn't be necessary.
> The prog could release(p) and read p->cnt.refs.counter right after.

release(p) will kill p, so that won't work. I have a better idea: since
p->next points to the struct itself, loading it before the release gives me a
pointer I can still read after release(p).
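Roughly like this (sketch; uses the acquire/release kfuncs from the selftests
and assumes prog_test_struct.next is initialized to point back to itself):

unsigned long sp = 0;
int cnt;
struct prog_test_ref_kfunc *p, *pn;

p = bpf_kfunc_call_test_acquire(&sp);
if (!p)
	return 0;
pn = p->next;	/* normal PTR_TO_BTF_ID, ref_obj_id=0 */
bpf_kfunc_call_test_release(p);
/* p is killed here, but pn remains readable */
cnt = pn->cnt.refs.counter;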

As an aside, do you think we should change the behaviour of killing released
registers, and instead skip it for refcounted PTR_TO_BTF_ID (perhaps marking
the register as an untrusted pointer, with ref_obj_id reset to zero)? Then
loads from it would still be allowed, but passing it into the kernel wouldn't
be. Wdyt?

p = acq();	  // p.type = PTR_TO_BTF_ID, ref_obj_id=X
foo(p);		  // works
bar(p->a + p->b); // works
rel(p);		  // p.type = PTR_TO_BTF_ID | PTR_UNTRUSTED, ref_obj_id=0
		  // Instead of mark_reg_unknown(p)

There is still the case where you can do:
p2 = p->next;
rel(p);
p3 = p->next;

Now p2 is trusted PTR_TO_BTF_ID, while p3 is untrusted, but this is a separate
problem which requires a more general fix, and needs more discussion.

A bit of a digression, but I would like to know what you and other BPF
developers think.

So far, my thinking (which I hope to eventually turn into an RFC) is this:

For a refcounted PTR_TO_BTF_ID, it is marked as trusted.

When loading from it, by default all loads yield untrusted pointers, except
for members specifically marked with an annotation ("bpf_ptr_trust")
indicating that the parent holds a reference to the member pointer. Loosely,
this means that for the lifetime of the trusted parent pointer, the member
pointer may also be trusted. If that lifetime can end (due to a release),
trusted member pointers become untrusted along with it; if it cannot (e.g. for
function arguments), they remain valid.

This will use BTF tags.
Known cases in the kernel which are useful and safe can be whitelisted.
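Concretely, the annotation could be built on the compiler's btf_type_tag
attribute (sketch; "bpf_ptr_trust" is just the placeholder name used above):

#define __ref __attribute__((btf_type_tag("bpf_ptr_trust")))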

Such loads yield trusted pointers linked to the refcounted PTR_TO_BTF_ID;
linked means the source refcounted PTR_TO_BTF_ID owns them.

When releasing the PTR_TO_BTF_ID, all registers with the same ref_obj_id and
all linked PTR_TO_BTF_ID are marked as untrusted.

As an example:

struct foo {
	struct bar __ref *br;
	struct baz *bz;
};

struct foo *f = acq(); // f.type = PTR_TO_BTF_ID, ref_obj_id=X
br = f->br;	       // br.type = PTR_TO_BTF_ID, linked_to=X
bz = f->bz;	       // bz.type = PTR_TO_BTF_ID | PTR_UNTRUSTED
rel(f);		       // f.type = PTR_TO_BTF_ID | PTR_UNTRUSTED
		       // and since br.linked_to == f.ref_obj_id,
		       // br.type = PTR_TO_BTF_ID | PTR_UNTRUSTED

For trusted loads from br, linked_to will be the same as X, so they will also
be marked untrusted, and so on.

For tp_btf/LSM programs, pointer arguments will be non-refcounted trusted
PTR_TO_BTF_ID. All the rules above apply, but since these arguments can never
be released, trusted pointers obtained from them remain valid until BPF_EXIT.
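For example (sketch; task_newtask is just one tracepoint where this applies):

SEC("tp_btf/task_newtask")
int BPF_PROG(on_newtask, struct task_struct *task, u64 clone_flags)
{
	/* task is a trusted, non-refcounted PTR_TO_BTF_ID; it can never
	 * be released, so trusted pointers loaded from it would stay
	 * valid until BPF_EXIT under the rules above */
	return 0;
}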

I have no idea how much backwards compat this will break, or how much of it can
be tolerated.

> While with per-cpu approach you had to do
> p_cpu = bpf_this_cpu_ptr(&prog_test_struct);
> hack and rely on intimate knowledge of the kernel side.
>
> > If you feel it's not worth it, I will drop it in the next version.
>
> So far I see the downsides.
> Also check the CI. test_progs-no_alu32 fails:
> test_map_kptr_fail_prog:PASS:map_kptr__load must fail 0 nsec
> test_map_kptr_fail_prog:FAIL:expected error message unexpected error: -22
> Expected: Unreleased reference id=5 alloc_insn=18
> Verifier: 0: R1=ctx(off=0,imm=0) R10=fp0

Yes, I noticed that. It is because alloc_insn= differs for no_alu32. I'll
drop the matching on the specific insn index.
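i.e., keep only the stable part of the expected message (sketch; the
surrounding test-table layout and the .err_msg field name are assumptions):

/* was: .err_msg = "Unreleased reference id=5 alloc_insn=18" */
.err_msg = "Unreleased reference id=5",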

Thanks for your feedback.

--
Kartikeya


