From: Lorenzo Bianconi <lorenzo@xxxxxxxxxx>
Date: Fri, 12 May 2023 16:14:48 +0200

>> From: Lorenzo Bianconi <lorenzo@xxxxxxxxxx>
>> Date: Fri, 12 May 2023 15:08:13 +0200
>>
>>> In order to reduce page_pool memory footprint, rely on the
>>> page_pool_dev_alloc_frag routine and reduce the buffer size
>>> (VETH_PAGE_POOL_FRAG_SIZE) to PAGE_SIZE / 2 so that a single page serves
>>> two 1500B frames. Reduce VETH_XDP_PACKET_HEADROOM to 192 from 256
>>> (XDP_PACKET_HEADROOM) to fit max_head_size in VETH_PAGE_POOL_FRAG_SIZE.
>>> Please note that, with the default CONFIG_MAX_SKB_FRAGS=17, the maximum
>>> supported MTU is now reduced to 36350B.
>>
>> I thought we're stepping away from page splitting bit by bit O_o
>
> do you mean the driver-private page_split implementation? AFAIK we are not
> stepping away from the page_pool page split implementation (or maybe I
> missed it :))

Page split in general. Since early-mid 2021, Page Pool with 1 page per
frame shows results comparable to drivers doing page split, but it doesn't
have such MTU / headroom / ... limitations.

>
>> Primarily for the reasons you mentioned / worked around here: it creates
>> several significant limitations, and at least on 64-bit systems it
>> doesn't scale anymore. 192 bytes of headroom is less than what XDP
>> expects (isn't it? Isn't 256 standard-standard, so that the skb XDP path
>> reallocates heads only to have 256+ there?), 384 bytes of shinfo can
>> change anytime, and even now page split simply blocks you from increasing
>> MAX_SKB_FRAGS even by one. Not speaking of MTU limitations etc.
>> BTW, Intel drivers suffer from the very same things due solely to page
>> split (and I'm almost done with converting at least some of them to Page
>> Pool and the 1 page per buffer model), I don't recommend deliberately
>> falling into that pit =\ :D
>
> I am not sure about the 192 vs 256 bytes of headroom (this is why I sent
> this patch as an RFC, my main goal is to discuss this requirement). In the
> previous discussion [0] we deferred this implementation, since if we do not
> reduce the requested XDP headroom, we will not be able to fit two 1500B
> frames into a single page (due to the skb_shared_info size [1]) and we
> introduce a performance penalty.

Yes, that's what I'm talking about. In order to fit two 1500-byte frames
onto one page on x86_64, you need to have at most 192 bytes of headroom
(192 of headroom + 320 of shinfo = 512 bytes of overhead per frame, 1024
per page), but XDP requires 256.
And then you have one more problem from the other side, I mean the shinfo
size. It can change anytime because it's not UAPI or ABI or whatever, and
nobody can say "hey, don't touch it, you break my page split"; at the same
time, with page splitting you're not able to increase MAX_SKB_FRAGS.
And for MTU > 1536 this is all worthless, just a waste of cycles. With
1 page per frame you can have up to 3.5k per fragment.

You mentioned memory footprint. Do you have any exact numbers to show this
can help significantly? Because I have PP on my home router with 16k-sized
pages and 128 MB of RAM, and there's no memory shortage there :D I realize
it doesn't mean anything and I'm mostly joking mentioning this, but still.
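To put the numbers above in one place, here is a rough sketch of the
per-frame budget, assuming x86_64 (PAGE_SIZE = 4096, 64-byte cachelines)
and the 320B skb_shared_info from the pahole dump below. Only
VETH_PAGE_POOL_FRAG_SIZE and VETH_XDP_PACKET_HEADROOM are names taken from
the patch; the helper and the other macros are made up for illustration
and are not the actual veth code:

/* Not the patch itself -- just the budget arithmetic being discussed,
 * with made-up names where noted.
 */
#include <linux/mm.h>
#include <net/page_pool.h>

#define VETH_PAGE_POOL_FRAG_SIZE	(PAGE_SIZE / 2)	/* 2048 on x86_64, name from the patch */
#define VETH_XDP_PACKET_HEADROOM	192		/* name from the patch, vs XDP_PACKET_HEADROOM == 256 */
#define VETH_PP_SHINFO_SIZE		320		/* illustrative: SKB_DATA_ALIGN(sizeof(struct skb_shared_info)) */

/* 2048 - 192 - 320 = 1536: a 1500B frame plus the Ethernet header still
 * fits. With the stock 256B of headroom it's 2048 - 256 - 320 = 1472, and
 * two 1500B frames no longer fit into a single 4K page.
 */
#define VETH_PP_MAX_HEAD_SIZE	(VETH_PAGE_POOL_FRAG_SIZE - \
				 VETH_XDP_PACKET_HEADROOM - \
				 VETH_PP_SHINFO_SIZE)

/* Illustrative allocation path: two buffers per page via the frag API. */
static void *veth_pp_frag_alloc_sketch(struct page_pool *pool)
{
	unsigned int offset;
	struct page *page;
	void *data;

	page = page_pool_dev_alloc_frag(pool, &offset,
					VETH_PAGE_POOL_FRAG_SIZE);
	if (!page)
		return NULL;

	data = page_address(page) + offset;
	/* [data +    0, +192)   XDP/skb headroom
	 * [data +  192, +1728)  frame data (up to VETH_PP_MAX_HEAD_SIZE)
	 * [data + 1728, +2048)  skb_shared_info
	 */
	return data;
}

The same arithmetic shows why MAX_SKB_FRAGS is effectively pinned by this
scheme: with frags[18] the shinfo grows to 336B, SKB_DATA_ALIGN() rounds
that up to 384B, and 2048 - 192 - 384 = 1472B is no longer enough for a
1500B frame plus the Ethernet header.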
>
> Regards,
> Lorenzo
>
> [0] https://lore.kernel.org/netdev/6298f73f7cc7391c7c4a52a6a89b1ae21488bda1.1682188837.git.lorenzo@xxxxxxxxxx/
> [1] $ pahole -C skb_shared_info vmlinux.o
> struct skb_shared_info {
>         __u8                       flags;                /*     0     1 */
>         __u8                       meta_len;             /*     1     1 */
>         __u8                       nr_frags;             /*     2     1 */
>         __u8                       tx_flags;             /*     3     1 */
>         unsigned short             gso_size;             /*     4     2 */
>         unsigned short             gso_segs;             /*     6     2 */
>         struct sk_buff *           frag_list;            /*     8     8 */
>         struct skb_shared_hwtstamps hwtstamps;           /*    16     8 */
>         unsigned int               gso_type;             /*    24     4 */
>         u32                        tskey;                /*    28     4 */
>         atomic_t                   dataref;              /*    32     4 */
>         unsigned int               xdp_frags_size;       /*    36     4 */
>         void *                     destructor_arg;       /*    40     8 */
>         skb_frag_t                 frags[17];            /*    48   272 */
>
>         /* size: 320, cachelines: 5, members: 14 */
> };
>
>>
>>>
>>> Signed-off-by: Lorenzo Bianconi <lorenzo@xxxxxxxxxx>
>>> ---
>>>  drivers/net/veth.c | 39 +++++++++++++++++++++++++--------------
>>>  1 file changed, 25 insertions(+), 14 deletions(-)
>> [...]
>>
>> Thanks,
>> Olek

Thanks,
Olek