On Wed, Nov 23, 2022 at 3:31 PM Alexei Starovoitov
<alexei.starovoitov@xxxxxxxxx> wrote:
>
> On Wed, Nov 23, 2022 at 2:39 PM Stanislav Fomichev <sdf@xxxxxxxxxx> wrote:
> >
> > On Wed, Nov 23, 2022 at 12:39 PM Alexei Starovoitov
> > <alexei.starovoitov@xxxxxxxxx> wrote:
> > >
> > > On Wed, Nov 23, 2022 at 12:08 PM Stanislav Fomichev <sdf@xxxxxxxxxx> wrote:
> > > >
> > > > Jiri reports broken test_progs after recent commit 68f8e3d4b916
> > > > ("selftests/bpf: Make sure zero-len skbs aren't redirectable").
> > > > Apparently we don't remount debugfs when we switch back networking namespace.
> > > > Let's explicitly mount /sys/kernel/debug.
> > > >
> > > > 0: https://lore.kernel.org/bpf/63b85917-a2ea-8e35-620c-808560910819@xxxxxxxx/T/#ma66ca9c92e99eee0a25e40f422489b26ee0171c1
> > > >
> > > > Fixes: a30338840fa5 ("selftests/bpf: Move open_netns() and close_netns() into network_helpers.c")
> > > > Reported-by: Jiri Olsa <olsajiri@xxxxxxxxx>
> > > > Signed-off-by: Stanislav Fomichev <sdf@xxxxxxxxxx>
> > > > ---
> > > >  tools/testing/selftests/bpf/network_helpers.c | 4 ++++
> > > >  1 file changed, 4 insertions(+)
> > > >
> > > > diff --git a/tools/testing/selftests/bpf/network_helpers.c b/tools/testing/selftests/bpf/network_helpers.c
> > > > index bec15558fd93..1f37adff7632 100644
> > > > --- a/tools/testing/selftests/bpf/network_helpers.c
> > > > +++ b/tools/testing/selftests/bpf/network_helpers.c
> > > > @@ -426,6 +426,10 @@ static int setns_by_fd(int nsfd)
> > > >         if (!ASSERT_OK(err, "mount /sys/fs/bpf"))
> > > >                 return err;
> > > >
> > > > +       err = mount("debugfs", "/sys/kernel/debug", "debugfs", 0, NULL);
> > > > +       if (!ASSERT_OK(err, "mount /sys/kernel/debug"))
> > > > +               return err;
> > > > +
> > > >         return 0;
> > > >  }
> > >
> > > Thanks.
> > > It fixes part of it but it's still racy.
> > > I see:
> > > do_read:FAIL:open open /sys/fs/bpf/bpf_iter_test1 failed: No such file
> > > or directory
> > >
> > > I suspect it happens when iter tests are running while test_empty_skb
> > > is cleaning the netns.
> > >
> > > So I've added:
> > > -void test_empty_skb(void)
> > > +void serial_test_empty_skb(void)
> > > -void test_xdp_do_redirect(void)
> > > +void serial_test_xdp_do_redirect(void)
> > > -void test_xdp_synproxy(void)
> > > +void serial_test_xdp_synproxy(void)
> > >
> > > to stop the bleeding and applied.
> >
> > Not sure I understand where the race is coming from and no luck
> > reproducing locally :-(
> > Looks like we run the tests in the forked workers, so that
> > unshare(mountns) shouldn't theoretically affect the rest, but
> > obviously I'm missing something..
>
> I'm equally confused.

[..]

> If what you're describing was the case then the bug of
> not mounting debugfs wouldn't have caused issues in parallel run.

The workers are reused and aren't respawned for every test. If one of
them runs the empty_skb test, its debugfs is no longer mounted, and any
other test that requires debugfs (and lands on that worker) will break.

> That close_netns is somehow messing with other forked processes.

Yeah, agreed, it looks like /sys/fs/bpf disappears while the bpf_iter
test is running. I can try sticking open_netns()/close_netns() calls in
a bunch of places in test_progs to see if I can trigger the issue (or
maybe run a test in parallel that does
while (1) { openns(); closens(); }).
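
To make the worker-reuse argument concrete, here is a toy model in plain C.
It is not test_progs code: struct worker, struct test, run_test(), run_all()
and the round-robin dispatch are all made up for illustration, and the
"mount state" is just a boolean. The only thing it tries to show is that
per-worker state survives from one test to the next when workers are not
respawned.

```c
#include <stdbool.h>
#include <stdio.h>

#define NR_WORKERS 2

/* Hypothetical per-worker state: is debugfs visible to this worker? */
struct worker {
	bool debugfs_mounted;
};

/* Hypothetical test description, not the real test_progs structures. */
struct test {
	const char *name;
	bool switches_netns;	/* test goes through a netns switch */
	bool needs_debugfs;	/* test reads something under debugfs */
};

/* Sample schedule: the first and third tests land on the same worker. */
static const struct test sample_tests[] = {
	{ "empty_skb",     true,  false },
	{ "unrelated",     false, false },
	{ "needs_debugfs", false, true  },
};

/* Run one test on a worker; returns false if debugfs is missing. */
static bool run_test(struct worker *w, const struct test *t,
		     bool remount_debugfs)
{
	if (t->needs_debugfs && !w->debugfs_mounted)
		return false;

	/* After a netns switch, debugfs is only visible again if it is
	 * mounted explicitly (modelled by remount_debugfs).
	 */
	if (t->switches_netns)
		w->debugfs_mounted = remount_debugfs;

	return true;
}

/* Dispatch tests round-robin (a simplification) to long-lived workers
 * and return the number of failed tests.
 */
static int run_all(const struct test *tests, int nr_tests,
		   bool remount_debugfs)
{
	struct worker workers[NR_WORKERS];
	int i, failures = 0;

	for (i = 0; i < NR_WORKERS; i++)
		workers[i].debugfs_mounted = true;

	for (i = 0; i < nr_tests; i++) {
		struct worker *w = &workers[i % NR_WORKERS];

		if (!run_test(w, &tests[i], remount_debugfs)) {
			printf("%s: FAIL (no debugfs on worker %d)\n",
			       tests[i].name, i % NR_WORKERS);
			failures++;
		}
	}

	return failures;
}
```

With this schedule, run_all(sample_tests, 3, false) reports one failure
(needs_debugfs, on the worker that ran empty_skb earlier), and
run_all(sample_tests, 3, true) reports none. It says nothing about the
/sys/fs/bpf race, which this model doesn't cover.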