Re: [xdp-hints] Re: [PATCH RFCv2 bpf-next 04/18] net: create xdp_hints_common and set functions

On 09/09/2022 12.49, Burakov, Anatoly wrote:
On 07-Sep-22 4:45 PM, Jesper Dangaard Brouer wrote:
XDP-hints via BTF are about giving drivers the ability to extend the
common set of hardware offload hints in a flexible way.

This patch starts out by defining the common set, based on what is
available in the SKB. Having this as a common struct in core
vmlinux makes it easier to implement xdp_frame to SKB conversion
routines as normal C-code, see later patches.

Drivers can redefine the layout of the entire metadata area, but are
encouraged to use this common struct as the base, which they can
extend with their extra hardware offload hints. When doing so,
drivers can mark the xdp_buff (and xdp_frame) with flags indicating
that it is compatible with the common struct.

Patch also provides XDP-hints driver helper functions for updating the
common struct. Helpers get inlined and are defined for maximum
performance, which does require some extra care in drivers, e.g. to
keep track of flags to reduce data dependencies, see code DOC.

Userspace and BPF-progs MUST NOT consider the common struct UAPI.
The common struct (and enum flags) are only exposed via BTF, which
implies consumers must read and decode this BTF before using/consuming
the data layout.

Signed-off-by: Jesper Dangaard Brouer <brouer@xxxxxxxxxx>
---
  include/net/xdp.h |  147 +++++++++++++++++++++++++++++++++++++++++++++++++++++
  net/core/xdp.c    |    5 ++
  2 files changed, 152 insertions(+)

diff --git a/include/net/xdp.h b/include/net/xdp.h
index 04c852c7a77f..ea5836ccee82 100644
--- a/include/net/xdp.h
+++ b/include/net/xdp.h
@@ -8,6 +8,151 @@
  #include <linux/skbuff.h> /* skb_shared_info */
+/**
+ * struct xdp_hints_common - Common XDP-hints offloads shared with netstack
+ * @btf_full_id: The modules BTF object + type ID for specific struct
+ * @vlan_tci: Hardware provided VLAN tag + proto type in @xdp_hints_flags
+ * @rx_hash32: Hardware provided RSS hash value
+ * @xdp_hints_flags: see &enum xdp_hints_flags
+ *
+ * This structure contains the most commonly used hardware offloads hints
+ * provided by NIC drivers and supported by the SKB.
+ *
+ * Drivers are expected to extend this structure by including &struct
+ * xdp_hints_common as part of the driver's own specific xdp_hints structs, but
+ * at the end of their struct, given that the XDP metadata area grows backwards.
+ *
+ * The member @btf_full_id is populated by driver modules to uniquely identify
+ * the BTF struct.  The high 32-bits store the modules BTF object ID and the
+ * lower 32-bit the BTF type ID within that BTF object.
+ */
+struct xdp_hints_common {
+    union {
+        __wsum        csum;
+        struct {
+            __u16    csum_start;
+            __u16    csum_offset;
+        };
+    };
+    u16 rx_queue;
+    u16 vlan_tci;
+    u32 rx_hash32;
+    u32 xdp_hints_flags;
+    u64 btf_full_id; /* BTF object + type ID */
+} __attribute__((aligned(4))) __attribute__((packed));

I'm assuming any Tx metadata will have to go before the Rx checksum union?


Nope.  The plan is that the TX metadata can reuse the same metadata area
with its own layout.  I imagine a new xdp_buff->flags bit that tells us
the layout is now TX-layout with xdp_hints_common_tx.

We could rename xdp_hints_common to xdp_hints_common_rx to anticipate
and prepare for this. But that would be getting ahead of ourselves,
because someone in the community might have a smarter solution, e.g.
one that combines common RX and TX in a single struct. For example,
overlapping csum and vlan_tci might make sense.
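
To make the layout rules from the quoted kernel-doc concrete (common
struct placed last in the driver struct, btf_full_id split into object
ID and type ID), here is a minimal userspace sketch. It mirrors the
struct from this RFC using stdint types instead of kernel types. The
driver struct xdp_hints_mydrv, its rx_timestamp member, and the
hints_btf_*() helper names are hypothetical and are not part of the
patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace mirror of the RFC struct; kernel types replaced by stdint. */
struct xdp_hints_common {
	union {
		uint32_t csum;			/* __wsum in the kernel */
		struct {
			uint16_t csum_start;
			uint16_t csum_offset;
		};
	};
	uint16_t rx_queue;
	uint16_t vlan_tci;
	uint32_t rx_hash32;
	uint32_t xdp_hints_flags;
	uint64_t btf_full_id;			/* BTF object + type ID */
} __attribute__((aligned(4))) __attribute__((packed));

/* Hypothetical driver struct: extra hints first, common struct last,
 * because the XDP metadata area grows backwards from the packet data.
 */
struct xdp_hints_mydrv {
	uint64_t rx_timestamp;			/* hypothetical extra hint */
	struct xdp_hints_common common;		/* must stay the last member */
} __attribute__((aligned(4))) __attribute__((packed));

static_assert(sizeof(struct xdp_hints_common) == 24,
	      "common struct has no padding");
static_assert(offsetof(struct xdp_hints_mydrv, common) +
	      sizeof(struct xdp_hints_common) ==
	      sizeof(struct xdp_hints_mydrv),
	      "common struct must end the driver struct");

/* High 32 bits: BTF object ID of the module. Low 32 bits: BTF type ID. */
static inline uint64_t hints_btf_full_id(uint32_t obj_id, uint32_t type_id)
{
	return ((uint64_t)obj_id << 32) | type_id;
}

static inline uint32_t hints_btf_obj_id(uint64_t full_id)
{
	return (uint32_t)(full_id >> 32);
}

static inline uint32_t hints_btf_type_id(uint64_t full_id)
{
	return (uint32_t)full_id;
}
```

With the common struct last, code that only knows xdp_hints_common can
find it at a fixed negative offset from the packet data, whatever the
driver placed in front of it.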

+
+
+/**
+ * enum xdp_hints_flags - flags used by &struct xdp_hints_common
+ *
+ * The &enum xdp_hints_flags have reserved the first 16 bits for common flags
+ * and drivers can introduce their own flag bits from BIT(16). For
+ * BPF-progs to find these flags (via BTF) drivers should define an enum
+ * xdp_hints_flags_driver.
+ */
+enum xdp_hints_flags {
+    HINT_FLAG_CSUM_TYPE_BIT0  = BIT(0),
+    HINT_FLAG_CSUM_TYPE_BIT1  = BIT(1),
+    HINT_FLAG_CSUM_TYPE_MASK  = 0x3,
+
+    HINT_FLAG_CSUM_LEVEL_BIT0 = BIT(2),
+    HINT_FLAG_CSUM_LEVEL_BIT1 = BIT(3),
+    HINT_FLAG_CSUM_LEVEL_MASK = 0xC,
+    HINT_FLAG_CSUM_LEVEL_SHIFT = 2,
+
+    HINT_FLAG_RX_HASH_TYPE_BIT0 = BIT(4),
+    HINT_FLAG_RX_HASH_TYPE_BIT1 = BIT(5),
+    HINT_FLAG_RX_HASH_TYPE_MASK = 0x30,
+    HINT_FLAG_RX_HASH_TYPE_SHIFT = 0x4,
+
+    HINT_FLAG_RX_QUEUE = BIT(7),
+
+    HINT_FLAG_VLAN_PRESENT            = BIT(8),
+    HINT_FLAG_VLAN_PROTO_ETH_P_8021Q  = BIT(9),
+    HINT_FLAG_VLAN_PROTO_ETH_P_8021AD = BIT(10),
+    /* Flags from BIT(16) can be used by drivers */

If we assume we also have a Tx section, would 16 bits be enough? For a basic implementation of UDP checksumming, AF_XDP would need 3x16 more bits (to store L2/L3/L4 offsets) plus probably a flag field indicating the presence of each. Is there any way to expand the common fields in the future (or is it intended to be expandable at all)?


As above we could have separate flags for TX side, e.g.
xdp_hints_flags_tx.  But some of the flags might still be valid for
TX-side, so they could potentially share some.

BUT it is also important to realize that I'm saying these are not UAPI
flags being exposed (like in include/uapi/bpf.h).  The runtime value of
these enum defined flags MUST be obtained via BTF (with the help of
libbpf CO-RE or in userspace by parsing BTF).

Thus, in principle the kernel is free to change these structs and enums.
In practice it will be very annoying for BPF-progs and AF_XDP userspace
code if we change the names of the structs, and somewhat annoying if
members change name.  CO-RE can deal with kernel changes and feature
detection[1] down to the available enum values, e.g. by using
bpf_core_enum_value_exists().  But we should avoid too many changes, as
the code becomes harder to read.
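
As an illustration of how the mask/shift pairs in the quoted enum are
meant to be applied, here is a minimal userspace sketch that decodes
xdp_hints_flags. The numeric values are copied from this RFC purely for
illustration; as said above, a real consumer must resolve them via BTF
(in a BPF-prog e.g. with libbpf's bpf_core_enum_value()) instead of
hard-coding them. The hints_*() helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Illustration only: values copied from the RFC patch. Real consumers
 * must look these up via BTF, since this enum is not UAPI.
 */
enum xdp_hints_flags {
	HINT_FLAG_CSUM_TYPE_MASK          = 0x3,
	HINT_FLAG_CSUM_LEVEL_MASK         = 0xC,
	HINT_FLAG_CSUM_LEVEL_SHIFT        = 2,
	HINT_FLAG_RX_HASH_TYPE_MASK       = 0x30,
	HINT_FLAG_RX_HASH_TYPE_SHIFT      = 0x4,
	HINT_FLAG_VLAN_PRESENT            = 1 << 8,
	HINT_FLAG_VLAN_PROTO_ETH_P_8021Q  = 1 << 9,
	HINT_FLAG_VLAN_PROTO_ETH_P_8021AD = 1 << 10,
};

/* Checksum type lives in bits 0-1, so no shift is needed. */
static inline uint32_t hints_csum_type(uint32_t flags)
{
	return flags & HINT_FLAG_CSUM_TYPE_MASK;
}

/* Checksum level lives in bits 2-3. */
static inline uint32_t hints_csum_level(uint32_t flags)
{
	return (flags & HINT_FLAG_CSUM_LEVEL_MASK) >> HINT_FLAG_CSUM_LEVEL_SHIFT;
}

/* RX hash type lives in bits 4-5. */
static inline uint32_t hints_rx_hash_type(uint32_t flags)
{
	return (flags & HINT_FLAG_RX_HASH_TYPE_MASK) >>
	       HINT_FLAG_RX_HASH_TYPE_SHIFT;
}

/* Returns the VLAN ethertype in host byte order, or 0 if no VLAN tag
 * (or no known protocol flag) is present.
 */
static inline uint16_t hints_vlan_proto(uint32_t flags)
{
	if (!(flags & HINT_FLAG_VLAN_PRESENT))
		return 0;
	if (flags & HINT_FLAG_VLAN_PROTO_ETH_P_8021Q)
		return 0x8100;	/* ETH_P_8021Q */
	if (flags & HINT_FLAG_VLAN_PROTO_ETH_P_8021AD)
		return 0x88A8;	/* ETH_P_8021AD */
	return 0;
}
```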

--Jesper

[1] https://nakryiko.com/posts/bpf-core-reference-guide/#bpf-core-enum-value-exists



