The tcp-data-split-thresh option configures the threshold value of the tcp-data-split. If a received packet size is larger than this threshold value, a packet will be split into header and payload. The header indicates TCP header, but it depends on driver spec. The bnxt_en driver supports HDS(Header-Data-Split) configuration at FW level, affecting TCP and UDP too. So, like the tcp-data-split option, If tcp-data-split-thresh is set, it affects UDP and TCP packets. The tcp-data-split-thresh has a dependency, that is tcp-data-split option. This threshold value can be get/set only when tcp-data-split option is enabled. Example: # ethtool -G <interface name> tcp-data-split-thresh <value> # ethtool -G enp14s0f0np0 tcp-data-split on tcp-data-split-thresh 256 # ethtool -g enp14s0f0np0 Ring parameters for enp14s0f0np0: Pre-set maximums: ... TCP data split thresh: 256 Current hardware settings: ... TCP data split: on TCP data split thresh: 256 The tcp-data-split is not enabled, the tcp-data-split-thresh will not be used and can't be configured. # ethtool -G enp14s0f0np0 tcp-data-split off # ethtool -g enp14s0f0np0 Ring parameters for enp14s0f0np0: Pre-set maximums: ... TCP data split thresh: 256 Current hardware settings: ... TCP data split: off TCP data split thresh: n/a The default/min/max values are not defined in the ethtool so the drivers should define themself. The 0 value means that all TCP and UDP packets' header and payload will be split. Users should consider the overhead due to this feature. Signed-off-by: Taehee Yoo <ap420073@xxxxxxxxx> --- v3: - Fix documentation and ynl - Update error messages - Validate configuration of tcp-data-split and tcp-data-split-thresh v2: - Patch added. Documentation/netlink/specs/ethtool.yaml | 8 +++ Documentation/networking/ethtool-netlink.rst | 75 ++++++++++++-------- include/linux/ethtool.h | 4 ++ include/uapi/linux/ethtool_netlink.h | 2 + net/ethtool/netlink.h | 2 +- net/ethtool/rings.c | 46 ++++++++++-- 6 files changed, 102 insertions(+), 35 deletions(-) diff --git a/Documentation/netlink/specs/ethtool.yaml b/Documentation/netlink/specs/ethtool.yaml index 6a050d755b9c..96298fe5ed43 100644 --- a/Documentation/netlink/specs/ethtool.yaml +++ b/Documentation/netlink/specs/ethtool.yaml @@ -215,6 +215,12 @@ attribute-sets: - name: tx-push-buf-len-max type: u32 + - + name: tcp-data-split-thresh + type: u32 + - + name: tcp-data-split-thresh-max + type: u32 - name: mm-stat @@ -1393,6 +1399,8 @@ operations: - rx-push - tx-push-buf-len - tx-push-buf-len-max + - tcp-data-split-thresh + - tcp-data-split-thresh-max dump: *ring-get-op - name: rings-set diff --git a/Documentation/networking/ethtool-netlink.rst b/Documentation/networking/ethtool-netlink.rst index 295563e91082..f0cd918dbe7e 100644 --- a/Documentation/networking/ethtool-netlink.rst +++ b/Documentation/networking/ethtool-netlink.rst @@ -875,24 +875,32 @@ Request contents: Kernel response contents: - ======================================= ====== =========================== - ``ETHTOOL_A_RINGS_HEADER`` nested reply header - ``ETHTOOL_A_RINGS_RX_MAX`` u32 max size of RX ring - ``ETHTOOL_A_RINGS_RX_MINI_MAX`` u32 max size of RX mini ring - ``ETHTOOL_A_RINGS_RX_JUMBO_MAX`` u32 max size of RX jumbo ring - ``ETHTOOL_A_RINGS_TX_MAX`` u32 max size of TX ring - ``ETHTOOL_A_RINGS_RX`` u32 size of RX ring - ``ETHTOOL_A_RINGS_RX_MINI`` u32 size of RX mini ring - ``ETHTOOL_A_RINGS_RX_JUMBO`` u32 size of RX jumbo ring - ``ETHTOOL_A_RINGS_TX`` u32 size of TX ring - ``ETHTOOL_A_RINGS_RX_BUF_LEN`` u32 size of buffers on the ring - ``ETHTOOL_A_RINGS_TCP_DATA_SPLIT`` u8 TCP header / data split - ``ETHTOOL_A_RINGS_CQE_SIZE`` u32 Size of TX/RX CQE - ``ETHTOOL_A_RINGS_TX_PUSH`` u8 flag of TX Push mode - ``ETHTOOL_A_RINGS_RX_PUSH`` u8 flag of RX Push mode - ``ETHTOOL_A_RINGS_TX_PUSH_BUF_LEN`` u32 size of TX push buffer - ``ETHTOOL_A_RINGS_TX_PUSH_BUF_LEN_MAX`` u32 max size of TX push buffer - ======================================= ====== =========================== + ============================================= ====== ======================= + ``ETHTOOL_A_RINGS_HEADER`` nested reply header + ``ETHTOOL_A_RINGS_RX_MAX`` u32 max size of RX ring + ``ETHTOOL_A_RINGS_RX_MINI_MAX`` u32 max size of RX mini + ring + ``ETHTOOL_A_RINGS_RX_JUMBO_MAX`` u32 max size of RX jumbo + ring + ``ETHTOOL_A_RINGS_TX_MAX`` u32 max size of TX ring + ``ETHTOOL_A_RINGS_RX`` u32 size of RX ring + ``ETHTOOL_A_RINGS_RX_MINI`` u32 size of RX mini ring + ``ETHTOOL_A_RINGS_RX_JUMBO`` u32 size of RX jumbo ring + ``ETHTOOL_A_RINGS_TX`` u32 size of TX ring + ``ETHTOOL_A_RINGS_RX_BUF_LEN`` u32 size of buffers on the + ring + ``ETHTOOL_A_RINGS_TCP_DATA_SPLIT`` u8 TCP header / data split + ``ETHTOOL_A_RINGS_CQE_SIZE`` u32 Size of TX/RX CQE + ``ETHTOOL_A_RINGS_TX_PUSH`` u8 flag of TX Push mode + ``ETHTOOL_A_RINGS_RX_PUSH`` u8 flag of RX Push mode + ``ETHTOOL_A_RINGS_TX_PUSH_BUF_LEN`` u32 size of TX push buffer + ``ETHTOOL_A_RINGS_TX_PUSH_BUF_LEN_MAX`` u32 max size of TX push + buffer + ``ETHTOOL_A_RINGS_TCP_DATA_SPLIT_THRESH`` u32 threshold of + TCP header / data split + ``ETHTOOL_A_RINGS_TCP_DATA_SPLIT_THRESH_MAX`` u32 max threshold of + TCP header / data split + ============================================= ====== ======================= ``ETHTOOL_A_RINGS_TCP_DATA_SPLIT`` indicates whether the device is usable with page-flipping TCP zero-copy receive (``getsockopt(TCP_ZEROCOPY_RECEIVE)``). @@ -927,18 +935,21 @@ Sets ring sizes like ``ETHTOOL_SRINGPARAM`` ioctl request. Request contents: - ==================================== ====== =========================== - ``ETHTOOL_A_RINGS_HEADER`` nested reply header - ``ETHTOOL_A_RINGS_RX`` u32 size of RX ring - ``ETHTOOL_A_RINGS_RX_MINI`` u32 size of RX mini ring - ``ETHTOOL_A_RINGS_RX_JUMBO`` u32 size of RX jumbo ring - ``ETHTOOL_A_RINGS_TX`` u32 size of TX ring - ``ETHTOOL_A_RINGS_RX_BUF_LEN`` u32 size of buffers on the ring - ``ETHTOOL_A_RINGS_CQE_SIZE`` u32 Size of TX/RX CQE - ``ETHTOOL_A_RINGS_TX_PUSH`` u8 flag of TX Push mode - ``ETHTOOL_A_RINGS_RX_PUSH`` u8 flag of RX Push mode - ``ETHTOOL_A_RINGS_TX_PUSH_BUF_LEN`` u32 size of TX push buffer - ==================================== ====== =========================== + ========================================= ====== ======================= + ``ETHTOOL_A_RINGS_HEADER`` nested reply header + ``ETHTOOL_A_RINGS_RX`` u32 size of RX ring + ``ETHTOOL_A_RINGS_RX_MINI`` u32 size of RX mini ring + ``ETHTOOL_A_RINGS_RX_JUMBO`` u32 size of RX jumbo ring + ``ETHTOOL_A_RINGS_TX`` u32 size of TX ring + ``ETHTOOL_A_RINGS_RX_BUF_LEN`` u32 size of buffers on the ring + ``ETHTOOL_A_RINGS_TCP_DATA_SPLIT`` u8 TCP header / data split + ``ETHTOOL_A_RINGS_CQE_SIZE`` u32 Size of TX/RX CQE + ``ETHTOOL_A_RINGS_TX_PUSH`` u8 flag of TX Push mode + ``ETHTOOL_A_RINGS_RX_PUSH`` u8 flag of RX Push mode + ``ETHTOOL_A_RINGS_TX_PUSH_BUF_LEN`` u32 size of TX push buffer + ``ETHTOOL_A_RINGS_TCP_DATA_SPLIT_THRESH`` u32 threshold of + TCP header / data split + ========================================= ====== ======================= Kernel checks that requested ring sizes do not exceed limits reported by driver. Driver may impose additional constraints and may not support all @@ -954,6 +965,10 @@ A bigger CQE can have more receive buffer pointers, and in turn the NIC can transfer a bigger frame from wire. Based on the NIC hardware, the overall completion queue size can be adjusted in the driver if CQE size is modified. +``ETHTOOL_A_RINGS_TCP_DATA_SPLIT_THRESH`` specifies the threshold value of +tcp data split feature. If tcp-data-split is enabled and a received packet +size is larger than this threshold value, header and data will be split. + CHANNELS_GET ============ diff --git a/include/linux/ethtool.h b/include/linux/ethtool.h index 12f6dc567598..891f55b0f6aa 100644 --- a/include/linux/ethtool.h +++ b/include/linux/ethtool.h @@ -78,6 +78,8 @@ enum { * @cqe_size: Size of TX/RX completion queue event * @tx_push_buf_len: Size of TX push buffer * @tx_push_buf_max_len: Maximum allowed size of TX push buffer + * @tcp_data_split_thresh: Threshold value of tcp-data-split + * @tcp_data_split_thresh_max: Maximum allowed threshold of tcp-data-split-threshold */ struct kernel_ethtool_ringparam { u32 rx_buf_len; @@ -87,6 +89,8 @@ struct kernel_ethtool_ringparam { u32 cqe_size; u32 tx_push_buf_len; u32 tx_push_buf_max_len; + u32 tcp_data_split_thresh; + u32 tcp_data_split_thresh_max; }; /** diff --git a/include/uapi/linux/ethtool_netlink.h b/include/uapi/linux/ethtool_netlink.h index 283305f6b063..20fe6065b7ba 100644 --- a/include/uapi/linux/ethtool_netlink.h +++ b/include/uapi/linux/ethtool_netlink.h @@ -364,6 +364,8 @@ enum { ETHTOOL_A_RINGS_RX_PUSH, /* u8 */ ETHTOOL_A_RINGS_TX_PUSH_BUF_LEN, /* u32 */ ETHTOOL_A_RINGS_TX_PUSH_BUF_LEN_MAX, /* u32 */ + ETHTOOL_A_RINGS_TCP_DATA_SPLIT_THRESH, /* u32 */ + ETHTOOL_A_RINGS_TCP_DATA_SPLIT_THRESH_MAX, /* u32 */ /* add new constants above here */ __ETHTOOL_A_RINGS_CNT, diff --git a/net/ethtool/netlink.h b/net/ethtool/netlink.h index 203b08eb6c6f..8bea47a26605 100644 --- a/net/ethtool/netlink.h +++ b/net/ethtool/netlink.h @@ -455,7 +455,7 @@ extern const struct nla_policy ethnl_features_set_policy[ETHTOOL_A_FEATURES_WANT extern const struct nla_policy ethnl_privflags_get_policy[ETHTOOL_A_PRIVFLAGS_HEADER + 1]; extern const struct nla_policy ethnl_privflags_set_policy[ETHTOOL_A_PRIVFLAGS_FLAGS + 1]; extern const struct nla_policy ethnl_rings_get_policy[ETHTOOL_A_RINGS_HEADER + 1]; -extern const struct nla_policy ethnl_rings_set_policy[ETHTOOL_A_RINGS_TX_PUSH_BUF_LEN_MAX + 1]; +extern const struct nla_policy ethnl_rings_set_policy[ETHTOOL_A_RINGS_TCP_DATA_SPLIT_THRESH_MAX + 1]; extern const struct nla_policy ethnl_channels_get_policy[ETHTOOL_A_CHANNELS_HEADER + 1]; extern const struct nla_policy ethnl_channels_set_policy[ETHTOOL_A_CHANNELS_COMBINED_COUNT + 1]; extern const struct nla_policy ethnl_coalesce_get_policy[ETHTOOL_A_COALESCE_HEADER + 1]; diff --git a/net/ethtool/rings.c b/net/ethtool/rings.c index b7865a14fdf8..c7824515857f 100644 --- a/net/ethtool/rings.c +++ b/net/ethtool/rings.c @@ -61,7 +61,9 @@ static int rings_reply_size(const struct ethnl_req_info *req_base, nla_total_size(sizeof(u8)) + /* _RINGS_TX_PUSH */ nla_total_size(sizeof(u8))) + /* _RINGS_RX_PUSH */ nla_total_size(sizeof(u32)) + /* _RINGS_TX_PUSH_BUF_LEN */ - nla_total_size(sizeof(u32)); /* _RINGS_TX_PUSH_BUF_LEN_MAX */ + nla_total_size(sizeof(u32)) + /* _RINGS_TX_PUSH_BUF_LEN_MAX */ + nla_total_size(sizeof(u32)) + /* _RINGS_TCP_DATA_SPLIT_THRESH */ + nla_total_size(sizeof(u32)); /* _RINGS_TCP_DATA_SPLIT_THRESH_MAX */ } static int rings_fill_reply(struct sk_buff *skb, @@ -108,7 +110,13 @@ static int rings_fill_reply(struct sk_buff *skb, (nla_put_u32(skb, ETHTOOL_A_RINGS_TX_PUSH_BUF_LEN_MAX, kr->tx_push_buf_max_len) || nla_put_u32(skb, ETHTOOL_A_RINGS_TX_PUSH_BUF_LEN, - kr->tx_push_buf_len)))) + kr->tx_push_buf_len))) || + (kr->tcp_data_split == ETHTOOL_TCP_DATA_SPLIT_ENABLED && + (nla_put_u32(skb, ETHTOOL_A_RINGS_TCP_DATA_SPLIT_THRESH, + kr->tcp_data_split_thresh))) || + (kr->tcp_data_split == ETHTOOL_TCP_DATA_SPLIT_ENABLED && + (nla_put_u32(skb, ETHTOOL_A_RINGS_TCP_DATA_SPLIT_THRESH_MAX, + kr->tcp_data_split_thresh_max)))) return -EMSGSIZE; return 0; @@ -130,6 +138,7 @@ const struct nla_policy ethnl_rings_set_policy[] = { [ETHTOOL_A_RINGS_TX_PUSH] = NLA_POLICY_MAX(NLA_U8, 1), [ETHTOOL_A_RINGS_RX_PUSH] = NLA_POLICY_MAX(NLA_U8, 1), [ETHTOOL_A_RINGS_TX_PUSH_BUF_LEN] = { .type = NLA_U32 }, + [ETHTOOL_A_RINGS_TCP_DATA_SPLIT_THRESH] = { .type = NLA_U32 }, }; static int @@ -155,6 +164,14 @@ ethnl_set_rings_validate(struct ethnl_req_info *req_info, return -EOPNOTSUPP; } + if (tb[ETHTOOL_A_RINGS_TCP_DATA_SPLIT_THRESH] && + !(ops->supported_ring_params & ETHTOOL_RING_USE_TCP_DATA_SPLIT)) { + NL_SET_ERR_MSG_ATTR(info->extack, + tb[ETHTOOL_A_RINGS_TCP_DATA_SPLIT_THRESH], + "setting tcp-data-split-thresh is not supported"); + return -EOPNOTSUPP; + } + if (tb[ETHTOOL_A_RINGS_CQE_SIZE] && !(ops->supported_ring_params & ETHTOOL_RING_USE_CQE_SIZE)) { NL_SET_ERR_MSG_ATTR(info->extack, @@ -196,9 +213,9 @@ ethnl_set_rings(struct ethnl_req_info *req_info, struct genl_info *info) struct kernel_ethtool_ringparam kernel_ringparam = {}; struct ethtool_ringparam ringparam = {}; struct net_device *dev = req_info->dev; + bool mod = false, thresh_mod = false; struct nlattr **tb = info->attrs; const struct nlattr *err_attr; - bool mod = false; int ret; dev->ethtool_ops->get_ringparam(dev, &ringparam, @@ -222,9 +239,30 @@ ethnl_set_rings(struct ethnl_req_info *req_info, struct genl_info *info) tb[ETHTOOL_A_RINGS_RX_PUSH], &mod); ethnl_update_u32(&kernel_ringparam.tx_push_buf_len, tb[ETHTOOL_A_RINGS_TX_PUSH_BUF_LEN], &mod); - if (!mod) + ethnl_update_u32(&kernel_ringparam.tcp_data_split_thresh, + tb[ETHTOOL_A_RINGS_TCP_DATA_SPLIT_THRESH], + &thresh_mod); + if (!mod && !thresh_mod) return 0; + if (kernel_ringparam.tcp_data_split == ETHTOOL_TCP_DATA_SPLIT_DISABLED && + thresh_mod) { + NL_SET_ERR_MSG_ATTR(info->extack, + tb[ETHTOOL_A_RINGS_TCP_DATA_SPLIT_THRESH], + "tcp-data-split-thresh can not be updated while tcp-data-split is disabled"); + return -EINVAL; + } + + if (kernel_ringparam.tcp_data_split_thresh > + kernel_ringparam.tcp_data_split_thresh_max) { + NL_SET_ERR_MSG_ATTR_FMT(info->extack, + tb[ETHTOOL_A_RINGS_TCP_DATA_SPLIT_THRESH_MAX], + "Requested tcp-data-split-thresh exceeds the maximum of %u", + kernel_ringparam.tcp_data_split_thresh_max); + + return -EINVAL; + } + /* ensure new ring parameters are within limits */ if (ringparam.rx_pending > ringparam.rx_max_pending) err_attr = tb[ETHTOOL_A_RINGS_RX]; -- 2.34.1