Re: [PATCH bpf-next V2 3/6] selftests/bpf: xdp_hw_metadata RX hash return code info

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 03/24, Jesper Dangaard Brouer wrote:


On 23/03/2023 18.47, Stanislav Fomichev wrote:
> On Thu, Mar 23, 2023 at 10:35 AM Alexei Starovoitov
> <alexei.starovoitov@xxxxxxxxx> wrote:
> >
> > On Thu, Mar 23, 2023 at 1:51 AM Jesper Dangaard Brouer
> > <jbrouer@xxxxxxxxxx> wrote:
> > >
> > >
> > > On 22/03/2023 20.33, Stanislav Fomichev wrote:
> > > > On Wed, Mar 22, 2023 at 12:30 PM Alexei Starovoitov
> > > > <alexei.starovoitov@xxxxxxxxx> wrote:
> > > > >
> > > > > On Wed, Mar 22, 2023 at 12:23 PM Stanislav Fomichev <sdf@xxxxxxxxxx> wrote:
> > > > > >
> > > > > > On Wed, Mar 22, 2023 at 12:17 PM Alexei Starovoitov
> > > > > > <alexei.starovoitov@xxxxxxxxx> wrote:
> > > > > > >
> > > > > > > On Wed, Mar 22, 2023 at 12:00 PM Stanislav Fomichev <sdf@xxxxxxxxxx> wrote:
> > > > > > > >
> > > > > > > > On Wed, Mar 22, 2023 at 9:07 AM Alexei Starovoitov
> > > > > > > > <alexei.starovoitov@xxxxxxxxx> wrote:
> > > > > > > > >
> > > > > > > > > On Wed, Mar 22, 2023 at 9:05 AM Jesper Dangaard Brouer
> > > > > > > > > <jbrouer@xxxxxxxxxx> wrote:
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > On 21/03/2023 19.47, Stanislav Fomichev wrote:
> > > > > > > > > > > On Tue, Mar 21, 2023 at 6:47 AM Jesper Dangaard Brouer
> > > > > > > > > > > <brouer@xxxxxxxxxx> wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > When driver developers add XDP-hints kfuncs for RX hash it is > > > > > > > > > > > > practical to print the return code in bpf_printk trace pipe log.
> > > > > > > > > > > >
> > > > > > > > > > > > Print hash value as a hex value, both AF_XDP userspace and bpf_prog, > > > > > > > > > > > > as this makes it easier to spot poor quality hashes.
> > > > > > > > > > > >
> > > > > > > > > > > > Signed-off-by: Jesper Dangaard Brouer <brouer@xxxxxxxxxx>
> > > > > > > > > > > > ---
> > > > > > > > > > > > .../testing/selftests/bpf/progs/xdp_hw_metadata.c | 9 ++++++--- > > > > > > > > > > > > tools/testing/selftests/bpf/xdp_hw_metadata.c | 5 ++++- > > > > > > > > > > > > 2 files changed, 10 insertions(+), 4 deletions(-)
> > > > > > > > > > > >
> > > > > > > > > > > > diff --git a/tools/testing/selftests/bpf/progs/xdp_hw_metadata.c b/tools/testing/selftests/bpf/progs/xdp_hw_metadata.c
> > > > > > > > > > > > index 40c17adbf483..ce07010e4d48 100644
> > > > > > > > > > > > --- a/tools/testing/selftests/bpf/progs/xdp_hw_metadata.c > > > > > > > > > > > > +++ b/tools/testing/selftests/bpf/progs/xdp_hw_metadata.c
> > > > > > > > > > > > @@ -77,10 +77,13 @@ int rx(struct xdp_md *ctx)
> > > > > > > > > > > > meta->rx_timestamp = 0; /* Used by AF_XDP as not avail signal */
> > > > > > > > > > > >            }
> > > > > > > > > > > >
> > > > > > > > > > > > - if (!bpf_xdp_metadata_rx_hash(ctx, &meta->rx_hash)) > > > > > > > > > > > > - bpf_printk("populated rx_hash with %u", meta->rx_hash);
> > > > > > > > > > > > -       else
> > > > > > > > > > > > + ret = bpf_xdp_metadata_rx_hash(ctx, &meta->rx_hash);
> > > > > > > > > > > > +       if (ret >= 0) {
> > > > > > > > > > > > + bpf_printk("populated rx_hash with 0x%08X", meta->rx_hash);
> > > > > > > > > > > > +       } else {
> > > > > > > > > > > > + bpf_printk("rx_hash not-avail errno:%d", ret); > > > > > > > > > > > > meta->rx_hash = 0; /* Used by AF_XDP as not avail signal */
> > > > > > > > > > > > +       }
> > > > > > > > > > > >
> > > > > > > > > > > > return bpf_redirect_map(&xsk, ctx->rx_queue_index, XDP_PASS);
> > > > > > > > > > > >     }
> > > > > > > > > > > > diff --git a/tools/testing/selftests/bpf/xdp_hw_metadata.c b/tools/testing/selftests/bpf/xdp_hw_metadata.c
> > > > > > > > > > > > index 400bfe19abfe..f3ec07ccdc95 100644
> > > > > > > > > > > > --- a/tools/testing/selftests/bpf/xdp_hw_metadata.c > > > > > > > > > > > > +++ b/tools/testing/selftests/bpf/xdp_hw_metadata.c
> > > > > > > > > > > > @@ -3,6 +3,9 @@
> > > > > > > > > > > > /* Reference program for verifying XDP metadata on real HW. Functional test
> > > > > > > > > > > >      * only, doesn't test the performance.
> > > > > > > > > > > >      *
> > > > > > > > > > > > + * BPF-prog bpf_printk info outout can be access via
> > > > > > > > > > > > + * /sys/kernel/debug/tracing/trace_pipe
> > > > > > > > > > >
> > > > > > > > > > > s/outout/output/
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > Fixed in V3
> > > > > > > > > >
> > > > > > > > > > > But let's maybe drop it? If you want to make it more usable, let's > > > > > > > > > > > have a separate patch to enable tracing and periodically dump it to
> > > > > > > > > > > the console instead (as previously discussed).
> > > > > > > > > >
> > > > > > > > > > Cat'ing /sys/kernel/debug/tracing/trace_pipe work for me regardless of
> > > > > > > > > > setting in
> > > > > > > > > > /sys/kernel/debug/tracing/events/bpf_trace/bpf_trace_printk/enable
> > > > > > > > > >
> > > > > > > > > > We likely need a followup patch that adds a BPF config switch that can > > > > > > > > > > disable bpf_printk calls, because this adds overhead and thus affects
> > > > > > > > > > the timestamps.
> > > > > > > > >
> > > > > > > > > No. This is by design.
> > > > > > > > > Do not use bpf_printk* in production.
> > >
> > > I fully agree do not use bpf_printk in *production*.
> > >
> > > > > > > >
> > > > > > > > But that's not for the production? xdp_hw_metadata is a small tool to > > > > > > > > verify that the metadata being dumped is correct (during the
> > > > > > > > development).
> > > > > > > > We have a proper (less verbose) selftest in
> > > > > > > > {progs,prog_tests}/xdp_metadata.c (over veth).
> > > > > > > > This xdp_hw_metadata was supposed to be used for running it against > > > > > > > > the real hardware, so having as much debugging at hand as possible > > > > > > > > seems helpful? (at least it was helpful to me when playing with mlx4)
> > >
> > > My experience when developing these kfuncs for igc (real hardware), this > > > "tool" xdp_hw_metadata was super helpful, because it was very verbose > > > (and I was juggling reading chip registers BE/LE and see patch1 a buggy
> > > implementation for RX-hash).
> > >
> > > As I wrote in cover-letter, I recommend other driver developers to do
> > > the same, because it really help speed up the development. In theory
> > > xdp_hw_metadata doesn't belong in selftests directory and IMHO it should > > > have been placed in samples/bpf/, but given the relationship with real
> > > selftest {progs,prog_tests}/xdp_metadata.c I think it makes sense to
> > > keep here.
> > >
> > >
> > > > > > >
> > > > > > > The only use of bpf_printk is for debugging of bpf progs themselves.
> > > > > > > It should not be used in any tool.
> > > > > >
> > > > > > Hmm, good point. I guess it also means we won't have to mess with > > > > > > enabling/dumping ftrace (and don't need this comment about cat'ing the
> > > > > > file).
> > > > > > Jesper, maybe we can instead pass the status of those
> > > > > > bpf_xdp_metadata_xxx kfuncs via 'struct xdp_meta'? And dump this info
> > > > > > from the userspace if needed.
> > > > >
> > > > > There are so many other ways for bpf prog to communicate with user space.
> > > > > Use ringbuf, perf_event buffer, global vars, maps, etc.
> > > > > trace_pipe is debug only because it's global and will conflict with
> > > > > all other debug sessions.
> > >
> > > I want to highlight above paragraph: It is very true for production
> > > code. (Anyone Googling this pay attention to above paragraph).
> > >
> > > >
> > > > 👍 makes sense, ty! hopefully we won't have to add a separate channel
> > > > for those and can (ab)use the metadata area.
> > > >
> > >
> > > Proposed solution: How about default disabling the bpf_printk's via a
> > > macro define, and then driver developer can manually reenable them
> > > easily via a single define, to enable a debugging mode.
> > >
> > > I was only watching /sys/kernel/debug/tracing/trace_pipe when I was
> > > debugging a driver issue.  Thus, an extra step of modifying a single
> > > define in BPF seems easier, than instrumenting my driver with printk.
> >
> > It's certainly fine to have commented out bpf_printk in selftests
> > and sample code.
> > But the patch does:
> > +       if (ret >= 0) {
> > + bpf_printk("populated rx_hash with 0x%08X", meta->rx_hash);
> > +       } else {
> > +               bpf_printk("rx_hash not-avail errno:%d", ret);
> > meta->rx_hash = 0; /* Used by AF_XDP as not avail signal */
> > +       }
> >
> > It feels that printk is the only thing that provides the signal to the user.
> > Doing s/ret >= 0/true/ won't make any difference to selftest and
> > to the consumer and that's my main concern with such selftest/samples.
>
> I agree, having this signal delivered to the user (without the ifdefs)
> seems like a better way to go.

I agree that we have a signal that we are not delivering to the user.

Just so we are on the same page, let me explain how above code is
problematic. As the rx_hash value zero have two meanings:

  (1) Negative 'ret' set rx_hash=0 as err signal to AF_XDP "user"
  (2) Hardware set rx_hash=0, when hash-type == IGC_RSS_TYPE_NO_HASH

Case (2) happens for all L2 packets e.g. ARP packets.  See in patch-1
where I map IGC_RSS_TYPE_NO_HASH to netstack type PKT_HASH_TYPE_L2.
I did consider return errno/negative number for IGC_RSS_TYPE_NO_HASH,
but I wanted to keep kfunc code as simple as possible (both for speed
and if we need to unroll as byte-code later). As i225 hardware still
writes zero into hash word I choose to keep code simple.


IMHO this symptom is related to an API problem of the kfunc, that
doesn't provide the hash-type.

> Seems trivial to do something like the following below? (untested,
> doesn't compile, gmail-copy-pasted so don't try to apply it)
>

If the kfunc provided the hash-type, which will be a positive number.
Then I would add a signed integer to meta, which can store the hash-type
or the negative errno number.

What's wrong with storing the return value from the kfuncs as is?
We are not writing super optimized production code here, so should be
fine?

> diff --git a/tools/testing/selftests/bpf/progs/xdp_hw_metadata.c
> b/tools/testing/selftests/bpf/progs/xdp_hw_metadata.c
> index 4c55b4d79d3d..061c410f68ea 100644
> --- a/tools/testing/selftests/bpf/progs/xdp_hw_metadata.c
> +++ b/tools/testing/selftests/bpf/progs/xdp_hw_metadata.c
> @@ -12,6 +12,9 @@ struct {
>    __type(value, __u32);
>   } xsk SEC(".maps");
>
> +int received;
> +int dropped;
> +
>   extern int bpf_xdp_metadata_rx_timestamp(const struct xdp_md *ctx,
>    __u64 *timestamp) __ksym;
>   extern int bpf_xdp_metadata_rx_hash(const struct xdp_md *ctx,
> @@ -52,11 +55,11 @@ int rx(struct xdp_md *ctx)
>    if (udp->dest != bpf_htons(9091))
>    return XDP_PASS;

It we wanted to make this program user "friendly", we should also count
packets that doesn't get redirected to AF_XDP "user" and instead skipped.

Sure, let's do that?

> - bpf_printk("forwarding UDP:9091 to AF_XDP");
> + __sync_fetch_and_add(&received, 1);
>
>    ret = bpf_xdp_adjust_meta(ctx, -(int)sizeof(struct xdp_meta));
>    if (ret != 0) {
> - bpf_printk("bpf_xdp_adjust_meta returned %d", ret);
> + __sync_fetch_and_add(&dropped, 1);
>    return XDP_PASS;

Is it a "dropped" or a "skipped" packet (return XDP_PASS)?

Up to you. I was mostly throwing an idea on how to drop these bpf_printk
completely and report those events from the userspace.

>    }
[...]

--Jesper





[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux