Re: [PATCH bpf-next v1 3/3] selftests/bpf: tests for using dynptrs to parse skb and xdp buffers

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, Aug 1, 2022 at 10:58 AM Andrii Nakryiko
<andrii.nakryiko@xxxxxxxxx> wrote:
>
> On Tue, Jul 26, 2022 at 11:48 AM Joanne Koong <joannelkoong@xxxxxxxxx> wrote:
> >
> > Test skb and xdp dynptr functionality in the following ways:
> >
> > 1) progs/test_xdp.c
> >    * Change existing test to use dynptrs to parse xdp data
> >
> >      There were no noticeable diferences in user + system time between
> >      the original version vs. using dynptrs. Averaging the time for 10
> >      runs (run using "time ./test_progs -t xdp_bpf2bpf"):
> >          original version: 0.0449 sec
> >          with dynptrs: 0.0429 sec
> >
> > 2) progs/test_l4lb_noinline.c
> >    * Change existing test to use dynptrs to parse skb data
> >
> >      There were no noticeable diferences in user + system time between
> >      the original version vs. using dynptrs. Averaging the time for 10
> >      runs (run using "time ./test_progs -t l4lb_all/l4lb_noinline"):
> >          original version: 0.0502 sec
> >          with dynptrs: 0.055 sec
> >
> >      For number of processed verifier instructions:
> >          original version: 6284 insns
> >          with dynptrs: 2538 insns
> >
> > 3) progs/test_dynptr_xdp.c
> >    * Add sample code for parsing tcp hdr opt lookup using dynptrs.
> >      This logic is lifted from a real-world use case of packet parsing in
> >      katran [0], a layer 4 load balancer
> >
> > 4) progs/dynptr_success.c
> >    * Add test case "test_skb_readonly" for testing attempts at writes /
> >      data slices on a prog type with read-only skb ctx.
> >
> > 5) progs/dynptr_fail.c
> >    * Add test cases "skb_invalid_data_slice" and
> >      "xdp_invalid_data_slice" for testing that helpers that modify the
> >      underlying packet buffer automatically invalidate the associated
> >      data slice.
> >    * Add test cases "skb_invalid_ctx" and "xdp_invalid_ctx" for testing
> >      that prog types that do not support bpf_dynptr_from_skb/xdp don't
> >      have access to the API.
> >
> > [0] https://github.com/facebookincubator/katran/blob/main/katran/lib/bpf/pckt_parsing.h
> >
> > Signed-off-by: Joanne Koong <joannelkoong@xxxxxxxxx>
> > ---
> >  .../testing/selftests/bpf/prog_tests/dynptr.c |  85 ++++++++++---
> >  .../selftests/bpf/prog_tests/dynptr_xdp.c     |  49 ++++++++
> >  .../testing/selftests/bpf/progs/dynptr_fail.c |  76 ++++++++++++
> >  .../selftests/bpf/progs/dynptr_success.c      |  32 +++++
> >  .../selftests/bpf/progs/test_dynptr_xdp.c     | 115 ++++++++++++++++++
> >  .../selftests/bpf/progs/test_l4lb_noinline.c  |  71 +++++------
> >  tools/testing/selftests/bpf/progs/test_xdp.c  |  95 +++++++--------
> >  .../selftests/bpf/test_tcp_hdr_options.h      |   1 +
> >  8 files changed, 416 insertions(+), 108 deletions(-)
> >  create mode 100644 tools/testing/selftests/bpf/prog_tests/dynptr_xdp.c
> >  create mode 100644 tools/testing/selftests/bpf/progs/test_dynptr_xdp.c
> >
> > diff --git a/tools/testing/selftests/bpf/prog_tests/dynptr.c b/tools/testing/selftests/bpf/prog_tests/dynptr.c
> > index bcf80b9f7c27..c40631f33c7b 100644
> > --- a/tools/testing/selftests/bpf/prog_tests/dynptr.c
> > +++ b/tools/testing/selftests/bpf/prog_tests/dynptr.c
> > @@ -2,6 +2,7 @@
> >  /* Copyright (c) 2022 Facebook */
> >
> >  #include <test_progs.h>
> > +#include <network_helpers.h>
> >  #include "dynptr_fail.skel.h"
> >  #include "dynptr_success.skel.h"
> >
> > @@ -11,8 +12,8 @@ static char obj_log_buf[1048576];
> >  static struct {
> >         const char *prog_name;
> >         const char *expected_err_msg;
> > -} dynptr_tests[] = {
> > -       /* failure cases */
> > +} verifier_error_tests[] = {
> > +       /* these cases should trigger a verifier error */
> >         {"ringbuf_missing_release1", "Unreleased reference id=1"},
> >         {"ringbuf_missing_release2", "Unreleased reference id=2"},
> >         {"ringbuf_missing_release_callback", "Unreleased reference id"},
> > @@ -42,11 +43,25 @@ static struct {
> >         {"release_twice_callback", "arg 1 is an unacquired reference"},
> >         {"dynptr_from_mem_invalid_api",
> >                 "Unsupported reg type fp for bpf_dynptr_from_mem data"},
> > +       {"skb_invalid_data_slice", "invalid mem access 'scalar'"},
> > +       {"xdp_invalid_data_slice", "invalid mem access 'scalar'"},
> > +       {"skb_invalid_ctx", "unknown func bpf_dynptr_from_skb"},
> > +       {"xdp_invalid_ctx", "unknown func bpf_dynptr_from_xdp"},
> > +};
> > +
> > +enum test_setup_type {
> > +       SETUP_SYSCALL_SLEEP,
> > +       SETUP_SKB_PROG,
> > +};
> >
> > -       /* success cases */
> > -       {"test_read_write", NULL},
> > -       {"test_data_slice", NULL},
> > -       {"test_ringbuf", NULL},
> > +static struct {
> > +       const char *prog_name;
> > +       enum test_setup_type type;
> > +} runtime_tests[] = {
> > +       {"test_read_write", SETUP_SYSCALL_SLEEP},
> > +       {"test_data_slice", SETUP_SYSCALL_SLEEP},
> > +       {"test_ringbuf", SETUP_SYSCALL_SLEEP},
> > +       {"test_skb_readonly", SETUP_SKB_PROG},
>
> nit: wouldn't it be better to add test_setup_type to dynptr_tests (and
> keep fail and success cases together)? It's conceivable that you might
> want different setups to test different error conditions, right?

Yeah! I originally separated it out because the success tests don't
have an error message while the fail ones do, and fail ones don't have
a setup (I don't think we'll need any custom userspace setup for those
since we're checking for verifier failures at prog load time) and the
success ones do. But I can combine them into 1 so that it's simpler. I
will do this in v2.
>
> >  };
> >
> >  static void verify_fail(const char *prog_name, const char *expected_err_msg)
> > @@ -85,7 +100,7 @@ static void verify_fail(const char *prog_name, const char *expected_err_msg)
> >         dynptr_fail__destroy(skel);
> >  }
> >
> > -static void verify_success(const char *prog_name)
> > +static void run_tests(const char *prog_name, enum test_setup_type setup_type)
> >  {
> >         struct dynptr_success *skel;
> >         struct bpf_program *prog;
> > @@ -107,15 +122,42 @@ static void verify_success(const char *prog_name)
> >         if (!ASSERT_OK_PTR(prog, "bpf_object__find_program_by_name"))
> >                 goto cleanup;
> >
> > -       link = bpf_program__attach(prog);
> > -       if (!ASSERT_OK_PTR(link, "bpf_program__attach"))
> > -               goto cleanup;
> > +       switch (setup_type) {
> > +       case SETUP_SYSCALL_SLEEP:
> > +               link = bpf_program__attach(prog);
> > +               if (!ASSERT_OK_PTR(link, "bpf_program__attach"))
> > +                       goto cleanup;
> >
> > -       usleep(1);
> > +               usleep(1);
> >
> > -       ASSERT_EQ(skel->bss->err, 0, "err");
> > +               bpf_link__destroy(link);
> > +               break;
> > +       case SETUP_SKB_PROG:
> > +       {
> > +               int prog_fd, err;
> > +               char buf[64];
> > +
> > +               prog_fd = bpf_program__fd(prog);
> > +               if (CHECK_FAIL(prog_fd < 0))
>
> please don't use CHECK and especially CHECK_FAIL
>
> > +                       goto cleanup;
> > +
> > +               LIBBPF_OPTS(bpf_test_run_opts, topts,
> > +                           .data_in = &pkt_v4,
> > +                           .data_size_in = sizeof(pkt_v4),
> > +                           .data_out = buf,
> > +                           .data_size_out = sizeof(buf),
> > +                           .repeat = 1,
> > +               );
>
> nit: LIBBPF_OPTS declares variable, so should be part of variable
> declaration block
>
> >
> > -       bpf_link__destroy(link);
> > +               err = bpf_prog_test_run_opts(prog_fd, &topts);
> > +
> > +               if (!ASSERT_OK(err, "test_run"))
> > +                       goto cleanup;
> > +
> > +               break;
> > +       }
> > +       }
> > +       ASSERT_EQ(skel->bss->err, 0, "err");
> >
> >  cleanup:
> >         dynptr_success__destroy(skel);
> > @@ -125,14 +167,17 @@ void test_dynptr(void)
> >  {
> >         int i;
> >
> > -       for (i = 0; i < ARRAY_SIZE(dynptr_tests); i++) {
> > -               if (!test__start_subtest(dynptr_tests[i].prog_name))
> > +       for (i = 0; i < ARRAY_SIZE(verifier_error_tests); i++) {
> > +               if (!test__start_subtest(verifier_error_tests[i].prog_name))
> > +                       continue;
> > +
> > +               verify_fail(verifier_error_tests[i].prog_name,
> > +                           verifier_error_tests[i].expected_err_msg);
> > +       }
> > +       for (i = 0; i < ARRAY_SIZE(runtime_tests); i++) {
> > +               if (!test__start_subtest(runtime_tests[i].prog_name))
> >                         continue;
> >
> > -               if (dynptr_tests[i].expected_err_msg)
> > -                       verify_fail(dynptr_tests[i].prog_name,
> > -                                   dynptr_tests[i].expected_err_msg);
> > -               else
> > -                       verify_success(dynptr_tests[i].prog_name);
> > +               run_tests(runtime_tests[i].prog_name, runtime_tests[i].type);
> >         }
> >  }
> > diff --git a/tools/testing/selftests/bpf/prog_tests/dynptr_xdp.c b/tools/testing/selftests/bpf/prog_tests/dynptr_xdp.c
> > new file mode 100644
> > index 000000000000..ca775d126b60
> > --- /dev/null
> > +++ b/tools/testing/selftests/bpf/prog_tests/dynptr_xdp.c
> > @@ -0,0 +1,49 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +#include <test_progs.h>
> > +#include <network_helpers.h>
> > +#include "test_dynptr_xdp.skel.h"
> > +#include "test_tcp_hdr_options.h"
> > +
> > +struct test_pkt {
> > +       struct ipv6_packet pk6_v6;
> > +       u8 options[16];
> > +} __packed;
> > +
> > +void test_dynptr_xdp(void)
> > +{
> > +       struct test_dynptr_xdp *skel;
> > +       char buf[128];
> > +       int err;
> > +
> > +       skel = test_dynptr_xdp__open_and_load();
> > +       if (!ASSERT_OK_PTR(skel, "skel_open_and_load"))
> > +               return;
> > +
> > +       struct test_pkt pkt = {
> > +               .pk6_v6.eth.h_proto = __bpf_constant_htons(ETH_P_IPV6),
> > +               .pk6_v6.iph.nexthdr = IPPROTO_TCP,
> > +               .pk6_v6.iph.payload_len = __bpf_constant_htons(MAGIC_BYTES),
> > +               .pk6_v6.tcp.urg_ptr = 123,
> > +               .pk6_v6.tcp.doff = 9, /* 16 bytes of options */
> > +
> > +               .options = {
> > +                       TCPOPT_MSS, 4, 0x05, 0xB4, TCPOPT_NOP, TCPOPT_NOP,
> > +                       skel->rodata->tcp_hdr_opt_kind_tpr, 6, 0, 0, 0, 9, TCPOPT_EOL
> > +               },
> > +       };
> > +
> > +       LIBBPF_OPTS(bpf_test_run_opts, topts,
> > +                   .data_in = &pkt,
> > +                   .data_size_in = sizeof(pkt),
> > +                   .data_out = buf,
> > +                   .data_size_out = sizeof(buf),
> > +                   .repeat = 3,
> > +       );
> > +
>
> for topts and pkt, they should be up above with other variables
> (unless we want to break off from kernel code style, which I didn't
> think we want)
>
> > +       err = bpf_prog_test_run_opts(bpf_program__fd(skel->progs.xdp_ingress_v6), &topts);
> > +       ASSERT_OK(err, "ipv6 test_run");
> > +       ASSERT_EQ(skel->bss->server_id, 0x9000000, "server id");
> > +       ASSERT_EQ(topts.retval, XDP_PASS, "ipv6 test_run retval");
> > +
> > +       test_dynptr_xdp__destroy(skel);
> > +}
> > diff --git a/tools/testing/selftests/bpf/progs/dynptr_fail.c b/tools/testing/selftests/bpf/progs/dynptr_fail.c
> > index c1814938a5fd..4e3f853b2d02 100644
> > --- a/tools/testing/selftests/bpf/progs/dynptr_fail.c
> > +++ b/tools/testing/selftests/bpf/progs/dynptr_fail.c
> > @@ -5,6 +5,7 @@
> >  #include <string.h>
> >  #include <linux/bpf.h>
> >  #include <bpf/bpf_helpers.h>
> > +#include <linux/if_ether.h>
> >  #include "bpf_misc.h"
> >
> >  char _license[] SEC("license") = "GPL";
> > @@ -622,3 +623,78 @@ int dynptr_from_mem_invalid_api(void *ctx)
> >
> >         return 0;
> >  }
> > +
> > +/* The data slice is invalidated whenever a helper changes packet data */
> > +SEC("?tc")
> > +int skb_invalid_data_slice(struct __sk_buff *skb)
> > +{
> > +       struct bpf_dynptr ptr;
> > +       struct ethhdr *hdr;
> > +
> > +       bpf_dynptr_from_skb(skb, 0, &ptr);
> > +       hdr = bpf_dynptr_data(&ptr, 0, sizeof(*hdr));
> > +       if (!hdr)
> > +               return SK_DROP;
> > +
> > +       hdr->h_proto = 12;
> > +
> > +       if (bpf_skb_pull_data(skb, skb->len))
> > +               return SK_DROP;
> > +
> > +       /* this should fail */
> > +       hdr->h_proto = 1;
> > +
> > +       return SK_PASS;
> > +}
> > +
> > +/* The data slice is invalidated whenever a helper changes packet data */
> > +SEC("?xdp")
> > +int xdp_invalid_data_slice(struct xdp_md *xdp)
> > +{
> > +       struct bpf_dynptr ptr;
> > +       struct ethhdr *hdr1, *hdr2;
> > +
> > +       bpf_dynptr_from_xdp(xdp, 0, &ptr);
> > +       hdr1 = bpf_dynptr_data(&ptr, 0, sizeof(*hdr1));
> > +       if (!hdr1)
> > +               return XDP_DROP;
> > +
> > +       hdr2 = bpf_dynptr_data(&ptr, 0, sizeof(*hdr2));
> > +       if (!hdr2)
> > +               return XDP_DROP;
> > +
> > +       hdr1->h_proto = 12;
> > +       hdr2->h_proto = 12;
> > +
> > +       if (bpf_xdp_adjust_head(xdp, 0 - (int)sizeof(*hdr1)))
> > +               return XDP_DROP;
> > +
> > +       /* this should fail */
> > +       hdr2->h_proto = 1;
>
> is there something special about having both hdr1 and hdr2? Wouldn't
> this test work with just single hdr pointer?

Yes, this test would work with just 1 single hdr pointer (which is
what skb_invalid_data_slice does) but I wanted to ensure that this
also works in the case of multiple data slices. If you think this is
unnecessary / adds clutter, I can remove hdr2.

>
> > +
> > +       return XDP_PASS;
> > +}
> > +
> > +/* Only supported prog type can create skb-type dynptrs */
>
> [...]
>
> > +       err = 1;
> > +
> > +       if (bpf_dynptr_from_skb(ctx, 0, &ptr))
> > +               return 0;
> > +       err++;
> > +
> > +       data = bpf_dynptr_data(&ptr, 0, 1);
> > +       if (data)
> > +               /* it's an error if data is not NULL since cgroup skbs
> > +                * are read only
> > +                */
> > +               return 0;
> > +       err++;
> > +
> > +       ret = bpf_dynptr_write(&ptr, 0, write_data, sizeof(write_data), 0);
> > +       /* since cgroup skbs are read only, writes should fail */
> > +       if (ret != -EINVAL)
> > +               return 0;
> > +
> > +       err = 0;
>
> hm, if data is NULL you'll still report success if bpf_dynptr_write
> returns 0 or any other error but -EINVAL... The logic is a bit unclear
> here...
>
If data is NULL and bpf_dynptr_write returns 0 or any other error
besides -EINVAL, I think we report a failure here (err is set to a
non-zero value, which userspace checks against)

> > +
> > +       return 0;
> > +}
> > diff --git a/tools/testing/selftests/bpf/progs/test_dynptr_xdp.c b/tools/testing/selftests/bpf/progs/test_dynptr_xdp.c
> > new file mode 100644
> > index 000000000000..c879dfb6370a
> > --- /dev/null
> > +++ b/tools/testing/selftests/bpf/progs/test_dynptr_xdp.c
> > @@ -0,0 +1,115 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +
> > +/* This logic is lifted from a real-world use case of packet parsing, used in
> > + * the open source library katran, a layer 4 load balancer.
> > + *
> > + * This test demonstrates how to parse packet contents using dynptrs.
> > + *
> > + * https://github.com/facebookincubator/katran/blob/main/katran/lib/bpf/pckt_parsing.h
> > + */
> > +
> > +#include <string.h>
> > +#include <linux/bpf.h>
> > +#include <bpf/bpf_helpers.h>
> > +#include <linux/tcp.h>
> > +#include <stdbool.h>
> > +#include <linux/ipv6.h>
> > +#include <linux/if_ether.h>
> > +#include "test_tcp_hdr_options.h"
> > +
> > +char _license[] SEC("license") = "GPL";
> > +
> > +/* Arbitrarily picked unused value from IANA TCP Option Kind Numbers */
> > +const __u32 tcp_hdr_opt_kind_tpr = 0xB7;
> > +/* Length of the tcp header option */
> > +const __u32 tcp_hdr_opt_len_tpr = 6;
> > +/* maximum number of header options to check to lookup server_id */
> > +const __u32 tcp_hdr_opt_max_opt_checks = 15;
> > +
> > +__u32 server_id;
> > +
> > +static int parse_hdr_opt(struct bpf_dynptr *ptr, __u32 *off, __u8 *hdr_bytes_remaining,
> > +                        __u32 *server_id)
> > +{
> > +       __u8 *tcp_opt, kind, hdr_len;
> > +       __u8 *data;
> > +
> > +       data = bpf_dynptr_data(ptr, *off, sizeof(kind) + sizeof(hdr_len) +
> > +                              sizeof(*server_id));
> > +       if (!data)
> > +               return -1;
> > +
> > +       kind = data[0];
> > +
> > +       if (kind == TCPOPT_EOL)
> > +               return -1;
> > +
> > +       if (kind == TCPOPT_NOP) {
> > +               *off += 1;
> > +               /* continue to the next option */
> > +               *hdr_bytes_remaining -= 1;
> > +
> > +               return 0;
> > +       }
> > +
> > +       if (*hdr_bytes_remaining < 2)
> > +               return -1;
> > +
> > +       hdr_len = data[1];
> > +       if (hdr_len > *hdr_bytes_remaining)
> > +               return -1;
> > +
> > +       if (kind == tcp_hdr_opt_kind_tpr) {
> > +               if (hdr_len != tcp_hdr_opt_len_tpr)
> > +                       return -1;
> > +
> > +               memcpy(server_id, (__u32 *)(data + 2), sizeof(*server_id));
>
> this implicitly relies on compiler inlining memcpy, let's use
> __builtint_memcpy() here instead to set a good example?

Sounds good, I will change this for v2. Should memcpys in bpf progs
always use __builtin_memcpy or is it on a case-by-case basis where if
the size is small enough, then you use it?
>
> > +               return 1;
> > +       }
> > +
> > +       *off += hdr_len;
> > +       *hdr_bytes_remaining -= hdr_len;
> > +
> > +       return 0;
> > +}
> > +
> > +SEC("xdp")
> > +int xdp_ingress_v6(struct xdp_md *xdp)
> > +{
> > +       __u8 hdr_bytes_remaining;
> > +       struct tcphdr *tcp_hdr;
> > +       __u8 tcp_hdr_opt_len;
> > +       int err = 0;
> > +       __u32 off;
> > +
> > +       struct bpf_dynptr ptr;
> > +
> > +       bpf_dynptr_from_xdp(xdp, 0, &ptr);
> > +
> > +       off = sizeof(struct ethhdr) + sizeof(struct ipv6hdr);
> > +
> > +       tcp_hdr = bpf_dynptr_data(&ptr, off, sizeof(*tcp_hdr));
> > +       if (!tcp_hdr)
> > +               return XDP_DROP;
> > +
> > +       tcp_hdr_opt_len = (tcp_hdr->doff * 4) - sizeof(struct tcphdr);
> > +       if (tcp_hdr_opt_len < tcp_hdr_opt_len_tpr)
> > +               return XDP_DROP;
> > +
> > +       hdr_bytes_remaining = tcp_hdr_opt_len;
> > +
> > +       off += sizeof(struct tcphdr);
> > +
> > +       /* max number of bytes of options in tcp header is 40 bytes */
> > +       for (int i = 0; i < tcp_hdr_opt_max_opt_checks; i++) {
> > +               err = parse_hdr_opt(&ptr, &off, &hdr_bytes_remaining, &server_id);
> > +
> > +               if (err || !hdr_bytes_remaining)
> > +                       break;
> > +       }
> > +
> > +       if (!server_id)
> > +               return XDP_DROP;
> > +
> > +       return XDP_PASS;
> > +}
>
> I'm not a networking BPF expert, but the logic of packet parsing here
> looks pretty clean! Would it be possible to also backport original
> code with data and data_end, both for testing but also to be able to
> compare and contrast dynptr vs data/data_end approaches?
>

Yeah, good idea! I'll backport the original code from katran for v2.

>
> > diff --git a/tools/testing/selftests/bpf/progs/test_l4lb_noinline.c b/tools/testing/selftests/bpf/progs/test_l4lb_noinline.c
> > index c8bc0c6947aa..1fef7868ea8b 100644
> > --- a/tools/testing/selftests/bpf/progs/test_l4lb_noinline.c
> > +++ b/tools/testing/selftests/bpf/progs/test_l4lb_noinline.c
> > @@ -230,21 +230,18 @@ static __noinline bool get_packet_dst(struct real_definition **real,
> >         return true;
> >  }
> >
> > -static __noinline int parse_icmpv6(void *data, void *data_end, __u64 off,
> > +static __noinline int parse_icmpv6(struct bpf_dynptr *skb_ptr, __u64 off,
> >                                    struct packet_description *pckt)
> >  {
> >         struct icmp6hdr *icmp_hdr;
> >         struct ipv6hdr *ip6h;
> >
> > -       icmp_hdr = data + off;
> > -       if (icmp_hdr + 1 > data_end)
> > +       icmp_hdr = bpf_dynptr_data(skb_ptr, off, sizeof(*icmp_hdr) + sizeof(*ip6h));
> > +       if (!icmp_hdr)
> >                 return TC_ACT_SHOT;
> >         if (icmp_hdr->icmp6_type != ICMPV6_PKT_TOOBIG)
> >                 return TC_ACT_OK;
>
> previously you can still TC_ACT_OK if it's ICMPV6_PKT_TOOBIG even if
> packet size is < sizeof(*icmp_hdr) + sizeof(*ip6h), which might have
> been a bug, but current logic will enforce that packet is at least
> sizeof(*icmp_hdr) + sizeof(*ip6h). Is that a problem?

Good point, I think it's best to maintain the original behavior. I'll
fix this (and the one below) for v2.

>
> > -       off += sizeof(struct icmp6hdr);
> > -       ip6h = data + off;
> > -       if (ip6h + 1 > data_end)
> > -               return TC_ACT_SHOT;
> > +       ip6h = (struct ipv6hdr *)(icmp_hdr + 1);
> >         pckt->proto = ip6h->nexthdr;
> >         pckt->flags |= F_ICMP;
> >         memcpy(pckt->srcv6, ip6h->daddr.s6_addr32, 16);
> > @@ -252,22 +249,19 @@ static __noinline int parse_icmpv6(void *data, void *data_end, __u64 off,
> >         return TC_ACT_UNSPEC;
> >  }
> >
> > -static __noinline int parse_icmp(void *data, void *data_end, __u64 off,
> > +static __noinline int parse_icmp(struct bpf_dynptr *skb_ptr, __u64 off,
> >                                  struct packet_description *pckt)
> >  {
> >         struct icmphdr *icmp_hdr;
> >         struct iphdr *iph;
> >
> > -       icmp_hdr = data + off;
> > -       if (icmp_hdr + 1 > data_end)
> > +       icmp_hdr = bpf_dynptr_data(skb_ptr, off, sizeof(*icmp_hdr) + sizeof(*iph));
> > +       if (!icmp_hdr)
> >                 return TC_ACT_SHOT;
> >         if (icmp_hdr->type != ICMP_DEST_UNREACH ||
> >             icmp_hdr->code != ICMP_FRAG_NEEDED)
> >                 return TC_ACT_OK;
>
> similarly here, short packets can still be TC_ACT_OK in some
> circumstances, while with dynptr they will be shot down early on. Not
> saying this is wrong or bad, just bringing this up for you and others
> to chime in if it's an ok change
>
> > -       off += sizeof(struct icmphdr);
> > -       iph = data + off;
> > -       if (iph + 1 > data_end)
> > -               return TC_ACT_SHOT;
> > +       iph = (struct iphdr *)(icmp_hdr + 1);
> >         if (iph->ihl != 5)
> >                 return TC_ACT_SHOT;
> >         pckt->proto = iph->protocol;
> > @@ -277,13 +271,13 @@ static __noinline int parse_icmp(void *data, void *data_end, __u64 off,
> >         return TC_ACT_UNSPEC;
> >  }
>
> [...]
>
> > -static __always_inline int handle_ipv4(struct xdp_md *xdp)
> > +static __always_inline int handle_ipv4(struct xdp_md *xdp, struct bpf_dynptr *xdp_ptr)
> >  {
> > -       void *data_end = (void *)(long)xdp->data_end;
> > -       void *data = (void *)(long)xdp->data;
> > +       struct bpf_dynptr new_xdp_ptr;
> >         struct iptnl_info *tnl;
> >         struct ethhdr *new_eth;
> >         struct ethhdr *old_eth;
> > -       struct iphdr *iph = data + sizeof(struct ethhdr);
> > +       struct iphdr *iph;
> >         __u16 *next_iph;
> >         __u16 payload_len;
> >         struct vip vip = {};
> > @@ -90,10 +90,12 @@ static __always_inline int handle_ipv4(struct xdp_md *xdp)
> >         __u32 csum = 0;
> >         int i;
> >
> > -       if (iph + 1 > data_end)
> > +       iph = bpf_dynptr_data(xdp_ptr, ethhdr_sz,
> > +                             iphdr_sz + (tcphdr_sz > udphdr_sz ? tcphdr_sz : udphdr_sz));
>
> tcphdr_sz (20) is always bigger than udphdr_sz (8), so just use the
> bigger one here? Though again, for UDP packet it might be a bit too
> pessimistic to reject small packets?

Yeah, maybe the best way here is to first check that if data_end -
data is less than the size of the ethhdr + iphdr + tcphdr, then we
must use udphdr size, otherwise we can use tcphdr size. I'll change it
(and the one below) to this for v2

>
> > +       if (!iph)
> >                 return XDP_DROP;
> >
> > -       dport = get_dport(iph + 1, data_end, iph->protocol);
> > +       dport = get_dport(iph + 1, iph->protocol);
> >         if (dport == -1)
> >                 return XDP_DROP;
>
> [...]
>
> > -static __always_inline int handle_ipv6(struct xdp_md *xdp)
> > +static __always_inline int handle_ipv6(struct xdp_md *xdp, struct bpf_dynptr *xdp_ptr)
> >  {
> > -       void *data_end = (void *)(long)xdp->data_end;
> > -       void *data = (void *)(long)xdp->data;
> > +       struct bpf_dynptr new_xdp_ptr;
> >         struct iptnl_info *tnl;
> >         struct ethhdr *new_eth;
> >         struct ethhdr *old_eth;
> > -       struct ipv6hdr *ip6h = data + sizeof(struct ethhdr);
> > +       struct ipv6hdr *ip6h;
> >         __u16 payload_len;
> >         struct vip vip = {};
> >         int dport;
> >
> > -       if (ip6h + 1 > data_end)
> > +       ip6h = bpf_dynptr_data(xdp_ptr, ethhdr_sz,
> > +                              ipv6hdr_sz + (tcphdr_sz > udphdr_sz ? tcphdr_sz : udphdr_sz));
>
> ditto, there is no dynamism here, verifier actually enforces that this
> value is statically known, I think this example will create false
> assumptions if written this way
>
> > +       if (!ip6h)
> >                 return XDP_DROP;
> >
> > -       dport = get_dport(ip6h + 1, data_end, ip6h->nexthdr);
> > +       dport = get_dport(ip6h + 1, ip6h->nexthdr);
> >         if (dport == -1)
> >                 return XDP_DROP;
> >
>
> [...]



[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux