From: Eric Dumazet <edumazet@xxxxxxxxxx> Date: Wed, 5 Feb 2025 18:48:50 +0100 > On Wed, Feb 5, 2025 at 5:46 PM Alexander Lobakin > <aleksander.lobakin@xxxxxxxxx> wrote: >> >> In fact, these two are not tied closely to each other. The only >> requirements to GRO are to use it in the BH context and have some >> sane limits on the packet batches, e.g. NAPI has a limit of its >> budget (64/8/etc.). >> Move purely GRO fields into a new tagged group, &gro_node. Embed it >> into &napi_struct and adjust all the references. napi_id doesn't >> really belong to GRO, but: >> >> 1. struct gro_node has a 4-byte padding at the end anyway. If you >> leave napi_id outside, struct napi_struct takes additional 8 bytes >> (u32 napi_id + another 4-byte padding). >> 2. gro_receive_skb() uses it to mark skbs. We don't want to split it >> into two functions or add an `if`, as this would be less efficient, >> but we need it to be NAPI-independent. The current approach doesn't >> change anything for NAPI-backed GROs; for standalone ones (which >> are less important currently), the embedded napi_id will be just >> zero => no-op. >> >> Three Ethernet drivers use napi_gro_flush() not really meant to be >> exported, so move it to <net/gro.h> and add that include there. >> napi_gro_receive() is used in more than 100 drivers, keep it >> in <linux/netdevice.h>. >> This does not make GRO ready to use outside of the NAPI context >> yet. >> >> Tested-by: Daniel Xu <dxu@xxxxxxxxx> >> Acked-by: Jakub Kicinski <kuba@xxxxxxxxxx> >> Reviewed-by: Toke Høiland-Jørgensen <toke@xxxxxxxxxx> >> Signed-off-by: Alexander Lobakin <aleksander.lobakin@xxxxxxxxx> >> --- >> include/linux/netdevice.h | 26 +++++--- >> include/net/busy_poll.h | 11 +++- >> include/net/gro.h | 35 +++++++---- >> drivers/net/ethernet/brocade/bna/bnad.c | 1 + >> drivers/net/ethernet/cortina/gemini.c | 1 + >> drivers/net/wwan/t7xx/t7xx_hif_dpmaif_rx.c | 1 + >> net/core/dev.c | 60 ++++++++----------- >> net/core/gro.c | 69 +++++++++++----------- >> 8 files changed, 112 insertions(+), 92 deletions(-) >> >> diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h >> index 2a59034a5fa2..d29b6ebde73f 100644 >> --- a/include/linux/netdevice.h >> +++ b/include/linux/netdevice.h >> @@ -340,8 +340,8 @@ struct gro_list { >> }; >> >> /* >> - * size of gro hash buckets, must less than bit number of >> - * napi_struct::gro_bitmask >> + * size of gro hash buckets, must be <= the number of bits in >> + * gro_node::bitmask >> */ >> #define GRO_HASH_BUCKETS 8 >> >> @@ -370,7 +370,6 @@ struct napi_struct { >> unsigned long state; >> int weight; >> u32 defer_hard_irqs_count; >> - unsigned long gro_bitmask; >> int (*poll)(struct napi_struct *, int); >> #ifdef CONFIG_NETPOLL >> /* CPU actively polling if netpoll is configured */ >> @@ -379,11 +378,14 @@ struct napi_struct { >> /* CPU on which NAPI has been scheduled for processing */ >> int list_owner; >> struct net_device *dev; >> - struct gro_list gro_hash[GRO_HASH_BUCKETS]; >> struct sk_buff *skb; >> - struct list_head rx_list; /* Pending GRO_NORMAL skbs */ >> - int rx_count; /* length of rx_list */ >> - unsigned int napi_id; /* protected by netdev_lock */ >> + struct_group_tagged(gro_node, gro, >> + unsigned long bitmask; >> + struct gro_list hash[GRO_HASH_BUCKETS]; >> + struct list_head rx_list; /* Pending GRO_NORMAL skbs */ >> + int rx_count; /* length of rx_list */ >> + u32 napi_id; /* protected by netdev_lock */ >> + > > I am old school, I would prefer a proper/standalone old C construct. > > struct gro_node { > unsigned long bitmask; > struct gro_list hash[GRO_HASH_BUCKETS]; > struct list_head rx_list; /* Pending GRO_NORMAL skbs */ > int rx_count; /* length of rx_list */ > u32 napi_id; /* protected by netdev_lock */ > }; > > Really, what struct_group_tagged() can possibly bring here, other than > obfuscation ? You'd need to adjust every ->napi_id access, which is a lot. Plus, as I wrote previously, napi_id doesn't really belong here, but embedding it here eases life. I'm often an old school, too, but sometimes this helps a lot. Unless you have very strong preference on this. > > Less than 30 uses in the whole kernel tree... Thanks, Olek