Re: [PATCHv3 bpf 2/2] selftests/bpf: Add test for early update in prog_array_map_poke_run

Jiri Olsa <olsajiri@xxxxxxxxx> · Tue, 5 Dec 2023 09:43:04 +0100



On Mon, Dec 04, 2023 at 09:16:52PM -0800, Yonghong Song wrote:
> 
> On 12/3/23 3:48 PM, Jiri Olsa wrote:
> > Adding test that tries to trigger the BUG_ON during early map update
> > in prog_array_map_poke_run function.
> > 
> > The idea is to share prog array map between thread that constantly
> > updates it and another one loading a program that uses that prog
> > array.
> > 
> > Eventually we will hit a place where the program is ok to be updated
> > (poke->tailcall_target_stable check) but the address is still not
> > registered in kallsyms, so the bpf_arch_text_poke returns -EINVAL
> > and cause imbalance for the next tail call update check, which will
> > fail with -EBUSY in bpf_arch_text_poke as described in previous fix.
> > 
> > Acked-by: Ilya Leoshkevich <iii@xxxxxxxxxxxxx>
> > Signed-off-by: Jiri Olsa <jolsa@xxxxxxxxxx>
> > ---
> >   .../selftests/bpf/prog_tests/tailcall_poke.c  | 74 +++++++++++++++++++
> >   .../selftests/bpf/progs/tailcall_poke.c       | 32 ++++++++
> >   2 files changed, 106 insertions(+)
> >   create mode 100644 tools/testing/selftests/bpf/prog_tests/tailcall_poke.c
> >   create mode 100644 tools/testing/selftests/bpf/progs/tailcall_poke.c
> > 
> > diff --git a/tools/testing/selftests/bpf/prog_tests/tailcall_poke.c b/tools/testing/selftests/bpf/prog_tests/tailcall_poke.c
> > new file mode 100644
> > index 000000000000..f7e2c09fd772
> > --- /dev/null
> > +++ b/tools/testing/selftests/bpf/prog_tests/tailcall_poke.c
> > @@ -0,0 +1,74 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +#include <unistd.h>
> > +#include <test_progs.h>
> > +#include "tailcall_poke.skel.h"
> > +
> > +#define JMP_TABLE "/sys/fs/bpf/jmp_table"
> > +
> > +static int thread_exit;
> > +
> > +static void *update(void *arg)
> > +{
> > +	__u32 zero = 0, prog1_fd, prog2_fd, map_fd;
> > +	struct tailcall_poke *call = arg;
> > +
> > +	map_fd = bpf_map__fd(call->maps.jmp_table);
> > +	prog1_fd = bpf_program__fd(call->progs.call1);
> > +	prog2_fd = bpf_program__fd(call->progs.call2);
> > +
> > +	while (!thread_exit) {
> > +		bpf_map_update_elem(map_fd, &zero, &prog1_fd, BPF_ANY);
> > +		bpf_map_update_elem(map_fd, &zero, &prog2_fd, BPF_ANY);
> > +	}
> > +
> > +	return NULL;
> > +}
> > +
> > +void test_tailcall_poke(void)
> > +{
> > +	struct tailcall_poke *call, *test;
> > +	int err, cnt = 10;
> > +	pthread_t thread;
> > +
> > +	unlink(JMP_TABLE);
> > +
> > +	call = tailcall_poke__open_and_load();
> > +	if (!ASSERT_OK_PTR(call, "tailcall_poke__open"))
> > +		return;
> > +
> > +	err = bpf_map__pin(call->maps.jmp_table, JMP_TABLE);
> > +	if (!ASSERT_OK(err, "bpf_map__pin"))
> > +		goto out;
> 
> Just curious. What is the reason having bpf_map__pin() here
> and below? I tried and it looks like removing bpf_map__pin()
> and below bpf_map__set_pin_path() will make reproducing
> the failure hard/impossible.

yes, it's there to share the jmp_table map between the two
skeleton instances, so the update thread changes the same
jmp_table map that's used in the skeleton we load in the
while loop below

I'll add some comments to the test

jirka

> 
> > +
> > +	err = pthread_create(&thread, NULL, update, call);
> > +	if (!ASSERT_OK(err, "new toggler"))
> > +		goto out;
> > +
> > +	while (cnt--) {
> > +		test = tailcall_poke__open();
> > +		if (!ASSERT_OK_PTR(test, "tailcall_poke__open"))
> > +			break;
> > +
> > +		err = bpf_map__set_pin_path(test->maps.jmp_table, JMP_TABLE);
> > +		if (!ASSERT_OK(err, "bpf_map__pin")) {
> > +			tailcall_poke__destroy(test);
> > +			break;
> > +		}
> > +
> > +		bpf_program__set_autoload(test->progs.test, true);
> > +		bpf_program__set_autoload(test->progs.call1, false);
> > +		bpf_program__set_autoload(test->progs.call2, false);
> > +
> > +		err = tailcall_poke__load(test);
> > +		tailcall_poke__destroy(test);
> > +		if (!ASSERT_OK(err, "tailcall_poke__load"))
> > +			break;
> > +	}
> > +
> > +	thread_exit = 1;
> > +	ASSERT_OK(pthread_join(thread, NULL), "pthread_join");
> > +
> > +out:
> > +	bpf_map__unpin(call->maps.jmp_table, JMP_TABLE);
> > +	tailcall_poke__destroy(call);
> > +}

SNIP