On Mon, Feb 12, 2024 at 11:43:42PM +0100, Kumar Kartikeya Dwivedi wrote: > On Mon, 12 Feb 2024 at 21:53, David Vernet <void@xxxxxxxxxxxxx> wrote: > > > > On Thu, Feb 01, 2024 at 04:21:09AM +0000, Kumar Kartikeya Dwivedi wrote: > > > Add tests for the runtime cleanup support for exceptions, ensuring that > > > resources are correctly identified and released when an exception is > > > thrown. Also, we add negative tests to exercise corner cases the > > > verifier should reject. > > > > > > Signed-off-by: Kumar Kartikeya Dwivedi <memxor@xxxxxxxxx> > > > --- > > > tools/testing/selftests/bpf/DENYLIST.aarch64 | 1 + > > > tools/testing/selftests/bpf/DENYLIST.s390x | 1 + > > > .../bpf/prog_tests/exceptions_cleanup.c | 65 +++ > > > .../selftests/bpf/progs/exceptions_cleanup.c | 468 ++++++++++++++++++ > > > .../bpf/progs/exceptions_cleanup_fail.c | 154 ++++++ > > > .../selftests/bpf/progs/exceptions_fail.c | 13 - > > > 6 files changed, 689 insertions(+), 13 deletions(-) > > > create mode 100644 tools/testing/selftests/bpf/prog_tests/exceptions_cleanup.c > > > create mode 100644 tools/testing/selftests/bpf/progs/exceptions_cleanup.c > > > create mode 100644 tools/testing/selftests/bpf/progs/exceptions_cleanup_fail.c > > > > > > diff --git a/tools/testing/selftests/bpf/DENYLIST.aarch64 b/tools/testing/selftests/bpf/DENYLIST.aarch64 > > > index 5c2cc7e8c5d0..6fc79727cd14 100644 > > > --- a/tools/testing/selftests/bpf/DENYLIST.aarch64 > > > +++ b/tools/testing/selftests/bpf/DENYLIST.aarch64 > > > @@ -1,6 +1,7 @@ > > > bpf_cookie/multi_kprobe_attach_api # kprobe_multi_link_api_subtest:FAIL:fentry_raw_skel_load unexpected error: -3 > > > bpf_cookie/multi_kprobe_link_api # kprobe_multi_link_api_subtest:FAIL:fentry_raw_skel_load unexpected error: -3 > > > exceptions # JIT does not support calling kfunc bpf_throw: -524 > > > +exceptions_unwind # JIT does not support calling kfunc bpf_throw: -524 > > > fexit_sleep # The test never returns. The remaining tests cannot start. > > > kprobe_multi_bench_attach # needs CONFIG_FPROBE > > > kprobe_multi_test # needs CONFIG_FPROBE > > > diff --git a/tools/testing/selftests/bpf/DENYLIST.s390x b/tools/testing/selftests/bpf/DENYLIST.s390x > > > index 1a63996c0304..f09a73dee72c 100644 > > > --- a/tools/testing/selftests/bpf/DENYLIST.s390x > > > +++ b/tools/testing/selftests/bpf/DENYLIST.s390x > > > @@ -1,5 +1,6 @@ > > > # TEMPORARY > > > # Alphabetical order > > > exceptions # JIT does not support calling kfunc bpf_throw (exceptions) > > > +exceptions_unwind # JIT does not support calling kfunc bpf_throw (exceptions) > > > get_stack_raw_tp # user_stack corrupted user stack (no backchain userspace) > > > stacktrace_build_id # compare_map_keys stackid_hmap vs. stackmap err -2 errno 2 (?) > > > diff --git a/tools/testing/selftests/bpf/prog_tests/exceptions_cleanup.c b/tools/testing/selftests/bpf/prog_tests/exceptions_cleanup.c > > > new file mode 100644 > > > index 000000000000..78df037b60ea > > > --- /dev/null > > > +++ b/tools/testing/selftests/bpf/prog_tests/exceptions_cleanup.c > > > @@ -0,0 +1,65 @@ > > > +#include "bpf/bpf.h" > > > +#include "exceptions.skel.h" > > > +#include <test_progs.h> > > > +#include <network_helpers.h> > > > + > > > +#include "exceptions_cleanup.skel.h" > > > +#include "exceptions_cleanup_fail.skel.h" > > > + > > > +static void test_exceptions_cleanup_fail(void) > > > +{ > > > + RUN_TESTS(exceptions_cleanup_fail); > > > +} > > > + > > > +void test_exceptions_cleanup(void) > > > +{ > > > + LIBBPF_OPTS(bpf_test_run_opts, ropts, > > > + .data_in = &pkt_v4, > > > + .data_size_in = sizeof(pkt_v4), > > > + .repeat = 1, > > > + ); > > > + struct exceptions_cleanup *skel; > > > + int ret; > > > + > > > + if (test__start_subtest("exceptions_cleanup_fail")) > > > + test_exceptions_cleanup_fail(); > > > > RUN_TESTS takes care of doing test__start_subtest(), etc. You should be > > able to just call RUN_TESTS(exceptions_cleanup_fail) directly here. > > > > Ack, will fix. > > > > + > > > + skel = exceptions_cleanup__open_and_load(); > > > + if (!ASSERT_OK_PTR(skel, "exceptions_cleanup__open_and_load")) > > > + return; > > > + > > > + ret = exceptions_cleanup__attach(skel); > > > + if (!ASSERT_OK(ret, "exceptions_cleanup__attach")) > > > + return; > > > + > > > +#define RUN_EXC_CLEANUP_TEST(name) \ > > > > Should we add a call to if (test__start_subtest(#name)) to this macro? > > > > Makes sense, will change this. > > > > [...] > > > + > > > +SEC("tc") > > > +int exceptions_cleanup_percpu_obj(struct __sk_buff *ctx) > > > +{ > > > + struct { int i; } *p; > > > + > > > + p = bpf_percpu_obj_new(typeof(*p)); > > > + MARK_RESOURCE(&p, RES_SPILL); > > > + bpf_throw(VAL); > > > > It would be neat if we could have the bpf_throw() kfunc signature be > > marked as __attribute__((noreturn)) and have things work correctly; > > meaning you wouldn't have to even return a value here. The verifier > > should know that bpf_throw() is terminal, so it should be able to prune > > any subsequent instructions as unreachable anyways. > > > > Originally, I was tagging the kfunc as noreturn, but Alexei advised > against it in > https://lore.kernel.org/bpf/CAADnVQJtUD6+gYinr+6ensj58qt2LeBj4dvT7Cyu-aBCafsP5g@xxxxxxxxxxxxxx > ... so I have dropped it since. I see. Ok, we can ignore this for now, though I think we should consider revisiting this at some point once we've clarified the rules behind the implicit prologue/epilogue. Being able to actually specify noreturn really can make a difference in performance in some cases. > Right now, the verifier will do dead code elimination ofcourse, but > sometimes the compiler does generate code that is tricky or unexpected > (like putting the bpf_throw instruction as the final one instead of > exit or jmp if somehow it can prove that bpf_throw will be taken by > all paths) for the verifier if the bpf_throw is noreturn. Even though Got it. As long as the verifier does dead-code elimination on that path, that's really the most important thing. > this would have the same effect at runtime (if the analysis of the > compiler is not wrong), there were some places we would have to modify > so that the compiler does not get confused. > > Overall I'm not opposed to this, but I think we need more consensus > before flipping the flag. Since this can be changed later and the > necessary changes can be made in the verifier (just a couple of places > which expect exit or jmp to final insns), I decided to move ahead > without noreturn. Understood, thanks for explaining. Leaving off noreturn for now is fine with me. > > > + return !p; > > > +} > > > + > > > +SEC("tc") > > > +int exceptions_cleanup_ringbuf(struct __sk_buff *ctx) > > > +{ > > > + void *p; > > > + > > > + p = bpf_ringbuf_reserve(&ringbuf, 8, 0); > > > + MARK_RESOURCE(&p, RES_SPILL); > > > + bpf_throw(VAL); > > > + return 0; > > > +} > > > + > > > +SEC("tc") > > > +int exceptions_cleanup_reg(struct __sk_buff *ctx) > > > +{ > > > + void *p; > > > + > > > + p = bpf_ringbuf_reserve(&ringbuf, 8, 0); > > > + MARK_RESOURCE(p, RES_REG); > > > + bpf_throw(VAL); > > > + if (p) > > > + bpf_ringbuf_discard(p, 0); > > > > Does the prog fail to load if you don't have this bpf_ringbuf_discard() > > check? I assume not given that in > > exceptions_cleanup_null_or_ptr_do_ptr() and elsewhere we do a reserve > > without discarding. Is there some subtle stack state difference here or > > something? > > > > So I will add comments explaining this, since I realized this confused > you in a couple of places, but basically if I didn't do a discard > here, the compiler wouldn't save the value of p across the bpf_throw > call. So it may end up in some caller-saved register (R1-R5) and since > bpf_throw needs things to be either saved in the stack or in > callee-saved regs (R6-R9) to be able to do the stack unwinding, we > would not be able to test the case where the resource is held in > R6-R9. > > In a correctly written program, in the path where bpf_throw is not > done, you will always have some cleanup code (otherwise your program > wouldn't pass), so the value should always end up being preserved > across a bpf_throw call (this is kind of why Alexei was sort of > worried about noreturn, because in that case the compiler may decide > to not preserve it for the bpf_throw path). > You cannot just leak a resource acquired before bpf_throw in the path > where exception is not thrown. Ok, that makes sense. I suppose another way to frame this would be to consider it in a typical scheduling scenario: struct task_ctx *lookup_task_ctx(struct task_struct *p) { struct task_ctx *taskc; s32 pid = p->pid; taskc = bpf_map_lookup_elem(&task_data, &pid); if (!taskc) bpf_throw(-ENOENT); // Verifier return taskc; } void BPF_STRUCT_OPS(sched_stopping, struct task_struct *p, bool runnable) { struct task_ctx *taskc; taskc = lookup_task_ctx(p) /* scale the execution time by the inverse of the weight and charge */ p->scx.dsq_vtime += (bpf_ktime_get_ns() - taskc->running_at) * 100 / p->scx.weight; } We're not dropping a reference here, but taskc is preserved across the bpf_throw() path, so the same idea applies. > Also, I think the test is a bit fragile, I should probably rewrite it > in inline assembly, because while the compiler chooses to hold it in a > register here, it is not bound to do so in this case. To that point, I wonder if it would be useful or possible to come up with some kind of a macro that allows us to specify a list of variables that must be preserved after a bpf_throw() call? Not sure how or if that would work exactly. > > > [...] > > > > > > -SEC("?tc") > > > -__failure __msg("Unreleased reference") > > > -int reject_with_reference(void *ctx) > > > -{ > > > - struct foo *f; > > > - > > > - f = bpf_obj_new(typeof(*f)); > > > - if (!f) > > > - return 0; > > > - bpf_throw(0); > > > > Hmm, so why is this a memory leak exactly? Apologies if this is already > > explained clearly elsewhere in the stack. > > > > I will add comments around some of these to better explain this in the > non-RFC v1. > Basically, this program is sort of unrealistic (since it's always > throwing, and not really cleaning up the object since there is no > other path except the one with bpf_throw). So the compiler ends up > putting 'f' in a caller-saved register, during release_reference we > don't find it after bpf_throw has been processed (since caller-saved > regs have been cleared due to kfunc processing, and we generate frame > descriptors after check_kfunc_call, basically simulating the state > where only preserved state after the call is observed at runtime), but > the reference state still lingers around for 'f', so you get this > "Unreleased reference" error later when check_reference_leak is hit. > > It's just trying to exercise the case where the pointer tied to a > reference state has been lost in verifier state, and that we return an > error in such a case and don't succeed in verifying the program > accidently (because there is no way we can recover the value to free > at runtime). Makes total sense, thanks a lot for explaining! This looks great, I'm really excited to use it.
Attachment:
signature.asc
Description: PGP signature