On Wed, 2023-03-15 at 15:54 +0100, Ilya Leoshkevich wrote:
> On Wed, 2023-03-15 at 11:54 +0100, Alexander Lobakin wrote:
> > From: Alexander Lobakin <aleksander.lobakin@xxxxxxxxx>
> > Date: Wed, 15 Mar 2023 10:56:25 +0100
> >
> > > From: Alexei Starovoitov <alexei.starovoitov@xxxxxxxxx>
> > > Date: Tue, 14 Mar 2023 16:54:25 -0700
> > >
> > > > On Tue, Mar 14, 2023 at 11:52 AM Alexei Starovoitov
> > > > <alexei.starovoitov@xxxxxxxxx> wrote:
> > >
> > > [...]
> > >
> > > > test_xdp_do_redirect:PASS:prog_run 0 nsec
> > > > test_xdp_do_redirect:PASS:pkt_count_xdp 0 nsec
> > > > test_xdp_do_redirect:PASS:pkt_count_zero 0 nsec
> > > > test_xdp_do_redirect:FAIL:pkt_count_tc unexpected pkt_count_tc: actual 220 != expected 9998
> > > > test_max_pkt_size:PASS:prog_run_max_size 0 nsec
> > > > test_max_pkt_size:PASS:prog_run_too_big 0 nsec
> > > > close_netns:PASS:setns 0 nsec
> > > > #289 xdp_do_redirect:FAIL
> > > > Summary: 270/1674 PASSED, 30 SKIPPED, 1 FAILED
> > > >
> > > > Alex,
> > > > could you please take a look at why it's happening?
> > > >
> > > > I suspect it's an endianness issue in:
> > > >         if (*metadata != 0x42)
> > > >                 return XDP_ABORTED;
> > > > but your patch didn't change that,
> > > > so I'm not sure why it worked before.
> > >
> > > Sure, lemme fix it real quick.
> >
> > Hi Ilya,
> >
> > Do you have s390 testing setups? Maybe you could take a look, since I
> > don't have one and can't debug it? Doesn't seem to be Endianness issue.
> > I mean, I have this (the below patch), but not sure it will fix
> > anything -- IIRC eBPF arch always matches the host arch ._.
> > I can't figure out from the code what does happen wrongly :s And it
> > happens only on s390.
> >
> > Thanks,
> > Olek
> > ---
> > diff --git a/tools/testing/selftests/bpf/prog_tests/xdp_do_redirect.c b/tools/testing/selftests/bpf/prog_tests/xdp_do_redirect.c
> > index 662b6c6c5ed7..b21371668447 100644
> > --- a/tools/testing/selftests/bpf/prog_tests/xdp_do_redirect.c
> > +++ b/tools/testing/selftests/bpf/prog_tests/xdp_do_redirect.c
> > @@ -107,7 +107,7 @@ void test_xdp_do_redirect(void)
> >                              .attach_point = BPF_TC_INGRESS);
> >
> >          memcpy(&data[sizeof(__u32)], &pkt_udp, sizeof(pkt_udp));
> > -        *((__u32 *)data) = 0x42; /* metadata test value */
> > +        *((__u32 *)data) = htonl(0x42); /* metadata test value */
> >
> >          skel = test_xdp_do_redirect__open();
> >          if (!ASSERT_OK_PTR(skel, "skel"))
> > diff --git a/tools/testing/selftests/bpf/progs/test_xdp_do_redirect.c b/tools/testing/selftests/bpf/progs/test_xdp_do_redirect.c
> > index cd2d4e3258b8..2475bc30ced2 100644
> > --- a/tools/testing/selftests/bpf/progs/test_xdp_do_redirect.c
> > +++ b/tools/testing/selftests/bpf/progs/test_xdp_do_redirect.c
> > @@ -1,5 +1,6 @@
> >  // SPDX-License-Identifier: GPL-2.0
> >  #include <vmlinux.h>
> > +#include <bpf/bpf_endian.h>
> >  #include <bpf/bpf_helpers.h>
> >
> >  #define ETH_ALEN 6
> > @@ -28,7 +29,7 @@ volatile int retcode = XDP_REDIRECT;
> >  SEC("xdp")
> >  int xdp_redirect(struct xdp_md *xdp)
> >  {
> > -        __u32 *metadata = (void *)(long)xdp->data_meta;
> > +        __be32 *metadata = (void *)(long)xdp->data_meta;
> >          void *data_end = (void *)(long)xdp->data_end;
> >          void *data = (void *)(long)xdp->data;
> >
> > @@ -44,7 +45,7 @@ int xdp_redirect(struct xdp_md *xdp)
> >          if (metadata + 1 > data)
> >                  return XDP_ABORTED;
> >
> > -        if (*metadata != 0x42)
> > +        if (*metadata != __bpf_htonl(0x42))
> >                  return XDP_ABORTED;
> >
> >          if (*payload == MARK_XMIT)
>
> Okay, I'll take a look. Two quick observations for now:
>
> - Unfortunately the above patch does not help.
>
> - In dmesg I see:
>
>     Driver unsupported XDP return value 0 on prog xdp_redirect (id 23)
>     dev N/A, expect packet loss!

I haven't identified the issue yet, but I have found a couple more
things that might be helpful:

- In problematic cases metadata contains 0, so this is not an
  endianness issue. data is still reasonable though. I'm trying to
  understand what is causing this.

- Applying the following diff:

--- a/tools/testing/selftests/bpf/progs/test_xdp_do_redirect.c
+++ b/tools/testing/selftests/bpf/progs/test_xdp_do_redirect.c
@@ -52,7 +52,7 @@ int xdp_redirect(struct xdp_md *xdp)

         *payload = MARK_IN;

-        if (bpf_xdp_adjust_meta(xdp, 4))
+        if (false && bpf_xdp_adjust_meta(xdp, 4))
                 return XDP_ABORTED;

         if (retcode > XDP_PASS)

  causes a kernel panic even on x86_64:

    BUG: kernel NULL pointer dereference, address: 0000000000000d28
    ...
    Call Trace:
    <TASK>
    build_skb_around+0x22/0xb0
    __xdp_build_skb_from_frame+0x4e/0x130
    bpf_test_run_xdp_live+0x65f/0x7c0
    ? __pfx_xdp_test_run_init_page+0x10/0x10
    bpf_prog_test_run_xdp+0x2ba/0x480
    bpf_prog_test_run+0xeb/0x110
    __sys_bpf+0x2b9/0x570
    __x64_sys_bpf+0x1c/0x30
    do_syscall_64+0x48/0xa0
    entry_SYSCALL_64_after_hwframe+0x72/0xdc

I haven't looked into this at all, but I believe this needs to be
fixed - BPF should never cause kernel panics.
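
For reference, here is a small userspace sketch of why a metadata value
of 0 rules out endianness. It is not part of the selftest: all helper
names below are hypothetical, and it only assumes that the writer and
the reader run on the same host, which is the case for the test runner
and the BPF program. Under that assumption both the host-order check
and the htonl() variant from the patch above accept the marker on any
architecture, while a zeroed metadata area fails both.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: store the marker in host byte order, like the
 * original "*((__u32 *)data) = 0x42" in the test. */
static inline void put_meta_host(unsigned char *buf, uint32_t val)
{
        memcpy(buf, &val, sizeof(val));
}

/* Hypothetical helper: store the marker in network byte order, like the
 * htonl(0x42) variant from the proposed patch. */
static inline void put_meta_be(unsigned char *buf, uint32_t val)
{
        uint32_t be = htonl(val);

        memcpy(buf, &be, sizeof(be));
}

/* Load four bytes the way the reader sees them, without conversion. */
static inline uint32_t get_meta_raw(const unsigned char *buf)
{
        uint32_t val;

        memcpy(&val, buf, sizeof(val));
        return val;
}

/* Reader-side check corresponding to "*metadata != 0x42". */
static inline int meta_ok_host(const unsigned char *buf)
{
        return get_meta_raw(buf) == 0x42;
}

/* Reader-side check corresponding to "*metadata != __bpf_htonl(0x42)". */
static inline int meta_ok_be(const unsigned char *buf)
{
        return get_meta_raw(buf) == htonl(0x42);
}

/* Both conventions round-trip on one host; all-zero metadata fails both. */
static inline void meta_self_check(void)
{
        unsigned char buf[4];

        put_meta_host(buf, 0x42);
        assert(meta_ok_host(buf));

        put_meta_be(buf, 0x42);
        assert(meta_ok_be(buf));
        /* Network order puts the low byte last, on any host. */
        assert(buf[0] == 0 && buf[3] == 0x42);

        memset(buf, 0, sizeof(buf));
        assert(!meta_ok_host(buf));
        assert(!meta_ok_be(buf));
}
```

Calling meta_self_check() from any test harness should pass on both
little-endian and big-endian hosts. That is consistent with the htonl()
patch not helping: a byte-order mismatch would show up as a byte-swapped
marker, not as zero.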