Re: [PATCH net-next v4 1/6] net: ethtool: allow symmetric-xor RSS hash for any flow type


On 2023-10-16 16:15, Alexander Duyck wrote:
On Mon, Oct 16, 2023 at 2:09 PM Ahmed Zaki <ahmed.zaki@xxxxxxxxx> wrote:



On 2023-10-16 14:17, Alexander H Duyck wrote:
On Mon, 2023-10-16 at 09:49 -0600, Ahmed Zaki wrote:
Symmetric RSS hash functions are beneficial in applications that monitor
both Tx and Rx packets of the same flow (IDS, software firewalls, etc.).
Getting all traffic of the same flow on the same RX queue results in
higher CPU cache efficiency.

A NIC that supports "symmetric-xor" can achieve this RSS hash symmetry
by XORing the source and destination fields and passing the values to the
RSS hash algorithm.

Only fields that have counterparts in the other direction can be
accepted; IP src/dst and L4 src/dst ports.
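
For illustration only (not part of the patch): a minimal sketch of why
XORing each src/dst pair makes the hash input direction-independent. The
struct flow4 type, symmetric_xor_input() and toy_hash() are hypothetical
names made up for this example, and toy_hash() is a simple stand-in, not
the Toeplitz hash a NIC would use.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical IPv4 flow key for illustration; not a kernel structure. */
struct flow4 {
	uint32_t ip_src, ip_dst;
	uint16_t port_src, port_dst;
};

/* Replace each src/dst pair by its XOR and place the XOR'ed value in
 * both positions, so both directions of a flow yield the same input.
 */
static struct flow4 symmetric_xor_input(struct flow4 f)
{
	struct flow4 out;

	out.ip_src = out.ip_dst = f.ip_src ^ f.ip_dst;
	out.port_src = out.port_dst = (uint16_t)(f.port_src ^ f.port_dst);
	return out;
}

/* Stand-in hash (FNV-style mixing), assumed here only so the example
 * is self-contained. Any deterministic function of the input works.
 */
static uint32_t toy_hash(struct flow4 f)
{
	uint32_t h = 2166136261u;

	h = (h ^ f.ip_src) * 16777619u;
	h = (h ^ f.ip_dst) * 16777619u;
	h = (h ^ f.port_src) * 16777619u;
	h = (h ^ f.port_dst) * 16777619u;
	return h;
}

/* Usage: a flow and its reverse direction hash to the same value. */
void self_check(void)
{
	struct flow4 fwd = { 0x0a000001u, 0x0a000002u, 12345, 80 };
	struct flow4 rev = { 0x0a000002u, 0x0a000001u, 80, 12345 };

	assert(toy_hash(symmetric_xor_input(fwd)) ==
	       toy_hash(symmetric_xor_input(rev)));
}
```

Since XOR is commutative, swapping source and destination leaves the
input unchanged, so any hash computed over it is symmetric.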

The user may request RSS hash symmetry for a specific flow type, via:

      # ethtool -N|-U eth0 rx-flow-hash <flow_type> s|d|f|n symmetric-xor

or turn symmetry off (asymmetric) by:

      # ethtool -N|-U eth0 rx-flow-hash <flow_type> s|d|f|n

Reviewed-by: Wojciech Drewek <wojciech.drewek@xxxxxxxxx>
Signed-off-by: Ahmed Zaki <ahmed.zaki@xxxxxxxxx>
---
   Documentation/networking/scaling.rst |  6 ++++++
   include/uapi/linux/ethtool.h         | 21 +++++++++++++--------
   net/ethtool/ioctl.c                  | 11 +++++++++++
   3 files changed, 30 insertions(+), 8 deletions(-)

diff --git a/Documentation/networking/scaling.rst b/Documentation/networking/scaling.rst
index 92c9fb46d6a2..64f3d7566407 100644
--- a/Documentation/networking/scaling.rst
+++ b/Documentation/networking/scaling.rst
@@ -44,6 +44,12 @@ by masking out the low order seven bits of the computed hash for the
   packet (usually a Toeplitz hash), taking this number as a key into the
   indirection table and reading the corresponding value.

+Some NICs support symmetric RSS hashing where, if the IP (source address,
+destination address) and TCP/UDP (source port, destination port) tuples
+are swapped, the computed hash is the same. This is beneficial in some
+applications that monitor TCP/IP flows (IDS, firewalls, ...etc) and need
+both directions of the flow to land on the same Rx queue (and CPU).
+
   Some advanced NICs allow steering packets to queues based on
   programmable filters. For example, webserver bound TCP port 80 packets
   can be directed to their own receive queue. Such “n-tuple” filters can
diff --git a/include/uapi/linux/ethtool.h b/include/uapi/linux/ethtool.h
index f7fba0dc87e5..4e8d38fb55ce 100644
--- a/include/uapi/linux/ethtool.h
+++ b/include/uapi/linux/ethtool.h
@@ -2018,14 +2018,19 @@ static inline int ethtool_validate_duplex(__u8 duplex)
   #define    FLOW_RSS        0x20000000

   /* L3-L4 network traffic flow hash options */
-#define     RXH_L2DA        (1 << 1)
-#define     RXH_VLAN        (1 << 2)
-#define     RXH_L3_PROTO    (1 << 3)
-#define     RXH_IP_SRC      (1 << 4)
-#define     RXH_IP_DST      (1 << 5)
-#define     RXH_L4_B_0_1    (1 << 6) /* src port in case of TCP/UDP/SCTP */
-#define     RXH_L4_B_2_3    (1 << 7) /* dst port in case of TCP/UDP/SCTP */
-#define     RXH_DISCARD     (1 << 31)
+#define     RXH_L2DA                (1 << 1)
+#define     RXH_VLAN                (1 << 2)
+#define     RXH_L3_PROTO            (1 << 3)
+#define     RXH_IP_SRC              (1 << 4)
+#define     RXH_IP_DST              (1 << 5)
+#define     RXH_L4_B_0_1            (1 << 6) /* src port in case of TCP/UDP/SCTP */
+#define     RXH_L4_B_2_3            (1 << 7) /* dst port in case of TCP/UDP/SCTP */
+/* XOR the corresponding source and destination fields of each specified
+ * protocol. Both copies of the XOR'ed fields are fed into the RSS and RXHASH
+ * calculation.
+ */
+#define     RXH_SYMMETRIC_XOR       (1 << 30)
+#define     RXH_DISCARD             (1 << 31)

I guess this has already been discussed but I am not a fan of long
names for defines. I would prefer to see this just be something like
RXH_SYMMETRIC or something like that. The XOR is just an implementation
detail. I have seen the same thing accomplished by just reordering the
fields by min/max approaches.
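
For illustration only: a minimal sketch of the min/max reordering
alternative mentioned above, assuming each src/dst pair is ordered
independently. The struct flow4 type and symmetric_sort_input() are
hypothetical names for this example; no driver or device is claimed to
implement it this way.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical IPv4 flow key for illustration; not a kernel structure. */
struct flow4 {
	uint32_t ip_src, ip_dst;
	uint16_t port_src, port_dst;
};

/* Order each src/dst pair as (min, max). Swapping source and destination
 * then produces an identical hash input, without XORing the fields.
 */
static struct flow4 symmetric_sort_input(struct flow4 f)
{
	struct flow4 out;

	out.ip_src = f.ip_src < f.ip_dst ? f.ip_src : f.ip_dst;
	out.ip_dst = f.ip_src < f.ip_dst ? f.ip_dst : f.ip_src;
	out.port_src = f.port_src < f.port_dst ? f.port_src : f.port_dst;
	out.port_dst = f.port_src < f.port_dst ? f.port_dst : f.port_src;
	return out;
}

/* Usage: both directions of a flow map to the same ordered input. */
void self_check(void)
{
	struct flow4 fwd = { 0x0a000001u, 0x0a000002u, 12345, 80 };
	struct flow4 rev = { 0x0a000002u, 0x0a000001u, 80, 12345 };
	struct flow4 a = symmetric_sort_input(fwd);
	struct flow4 b = symmetric_sort_input(rev);

	assert(a.ip_src == b.ip_src && a.ip_dst == b.ip_dst);
	assert(a.port_src == b.port_src && a.port_dst == b.port_dst);
}
```

The resulting input differs from the XOR'ed one, so the same hash
function would generally produce a different hash value for each method.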

Correct. We discussed this, and the consensus was that the user needs
complete control over which implementation/algorithm is used to provide
this symmetry, because each will yield a different hash and possibly
different performance.

I agree about the user having control over the algorithm, but this
interface isn't about selecting the algorithm. It is just about
setting up the inputs. Selecting the algorithm is handled via the
set/get_rxfh interface hfunc variable. If this is just a different
hash function it really belongs there rather than being made a part of
the input string.

My bad. It is the same RSS algorithm (Toeplitz in our case). Still, the user needs to be able to manipulate the inputs. The point is, a generic define like "RXH_SYMMETRIC" was rejected (that was actually v1).





   #define    RX_CLS_FLOW_DISC        0xffffffffffffffffULL
   #define RX_CLS_FLOW_WAKE   0xfffffffffffffffeULL
diff --git a/net/ethtool/ioctl.c b/net/ethtool/ioctl.c
index 0b0ce4f81c01..b1bd0d4b48e8 100644
--- a/net/ethtool/ioctl.c
+++ b/net/ethtool/ioctl.c
@@ -980,6 +980,17 @@ static noinline_for_stack int ethtool_set_rxnfc(struct net_device *dev,
      if (rc)
              return rc;

+    /* If a symmetric hash is requested, then:
+     * 1 - no other fields besides IP src/dst and/or L4 src/dst
+     * 2 - If src is set, dst must also be set
+     */
+    if ((info.data & RXH_SYMMETRIC_XOR) &&
+        ((info.data & ~(RXH_SYMMETRIC_XOR | RXH_IP_SRC | RXH_IP_DST |
+          RXH_L4_B_0_1 | RXH_L4_B_2_3)) ||
+         (!!(info.data & RXH_IP_SRC) ^ !!(info.data & RXH_IP_DST)) ||
+         (!!(info.data & RXH_L4_B_0_1) ^ !!(info.data & RXH_L4_B_2_3))))
+            return -EINVAL;
+
      rc = dev->ethtool_ops->set_rxnfc(dev, &info);
      if (rc)
              return rc;
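
For illustration only: the check in the hunk above, restated as a
standalone user-space predicate so the two rules can be exercised in
isolation. The function name rxh_symmetric_xor_valid() is hypothetical.
The RXH_* values are copied from the patch but written with unsigned
constants, and the 64-bit argument assumes the data field is a 64-bit
bitmask.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Values copied from the patch; 1U is used instead of 1 for this sketch. */
#define RXH_VLAN          (1U << 2)
#define RXH_IP_SRC        (1U << 4)
#define RXH_IP_DST        (1U << 5)
#define RXH_L4_B_0_1      (1U << 6)  /* src port for TCP/UDP/SCTP */
#define RXH_L4_B_2_3      (1U << 7)  /* dst port for TCP/UDP/SCTP */
#define RXH_SYMMETRIC_XOR (1U << 30)

/* Returns true if the requested hash fields pass the patch's two rules:
 * 1 - no fields other than IP src/dst and L4 src/dst
 * 2 - src and dst of each pair are either both set or both clear
 * Requests without RXH_SYMMETRIC_XOR are not restricted here.
 */
static bool rxh_symmetric_xor_valid(uint64_t data)
{
	const uint64_t allowed = RXH_SYMMETRIC_XOR | RXH_IP_SRC | RXH_IP_DST |
				 RXH_L4_B_0_1 | RXH_L4_B_2_3;

	if (!(data & RXH_SYMMETRIC_XOR))
		return true;
	if (data & ~allowed)
		return false;
	if (!!(data & RXH_IP_SRC) != !!(data & RXH_IP_DST))
		return false;
	if (!!(data & RXH_L4_B_0_1) != !!(data & RXH_L4_B_2_3))
		return false;
	return true;
}

/* Usage: a few accepted and rejected combinations. */
void self_check(void)
{
	assert(rxh_symmetric_xor_valid(RXH_SYMMETRIC_XOR | RXH_IP_SRC | RXH_IP_DST));
	assert(!rxh_symmetric_xor_valid(RXH_SYMMETRIC_XOR | RXH_IP_SRC));
	assert(!rxh_symmetric_xor_valid(RXH_SYMMETRIC_XOR | RXH_IP_SRC |
					RXH_IP_DST | RXH_VLAN));
}
```

Note that the last case (VLAN combined with symmetric-xor) is rejected
by rule 1, which is the restriction debated in the review comments.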

You are pushing implementation from your device into the interface
design here. You should probably push these requirements down into the
driver rather than making it a part of the generic implementation.

This is the most basic check and should be applied in any symmetric RSS
implementation. Nothing specific to the XOR method. It can also be
extended to include other "RXH_SYMMETRIC_XXX" in the future.

You are partially correct. Your item 2 is accurate, however you are
excluding other fields in your item 1. Fields such as L2DA wouldn't be
symmetric, but VLAN and L3_PROTO would be. That is the implementation
specific detail I was referring to.

Hmm, I'm not sure how the VLAN tag would be used in this case. But moving this into ice_ethtool is trivial. We can start there and unify when/if other vendors push similar functionality.

How does that sound?



