On 15/06/2023 16.45, Alexander Duyck wrote: [..]
What concerns me is that you seem to be taking the page pool API in a direction other than what it was really intended for. For any physical device you aren't necessarily going to know what size fragment you are working with until you have already allocated the page and DMA has been performed. That is why drivers such as the Mellanox driver are fragmenting in the driver instead of allocating pre-fragmented pages.
+1 I share Alexander Duyck's concerns here. As the inventor and maintainer, I can say this is taking the page_pool API in a direction I didn't intend or plan for. As Alex also says, the intent was fixed-size memory chunks that are DMA ready. The use-case was the physical device RX "early demux" problem, where the size is not known beforehand.

I need to be convinced this is a good direction to take the page_pool design/architecture in, e.g. allocations with dynamic sizes. Maybe it is a good idea, but as noted below, "consumers" of the API are usually the way to show that this is the case.
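For context, a minimal sketch of the usage model being described above (fixed-size, DMA-ready pages handed out before the packet length is known) might look like the following. Only the page_pool calls are real kernel API; the demo_* structures and functions are invented for illustration:

#include <linux/err.h>
#include <net/page_pool.h>

/* Hypothetical driver RX ring, for illustration only. */
struct demo_rx_ring {
	struct device *dev;
	struct page_pool *pool;
};

static int demo_rx_ring_init(struct demo_rx_ring *ring)
{
	struct page_pool_params pp = {
		.flags     = PP_FLAG_DMA_MAP,  /* pool DMA-maps the pages */
		.order     = 0,                /* fixed-size chunk: one page */
		.pool_size = 256,
		.nid       = NUMA_NO_NODE,
		.dev       = ring->dev,
		.dma_dir   = DMA_FROM_DEVICE,
	};

	ring->pool = page_pool_create(&pp);
	if (IS_ERR(ring->pool))
		return PTR_ERR(ring->pool);
	return 0;
}

static struct page *demo_rx_refill(struct demo_rx_ring *ring)
{
	/* The page is allocated and DMA-ready *before* any packet
	 * arrives, so the eventual packet length cannot influence the
	 * chunk size here.
	 */
	return page_pool_dev_alloc_pages(ring->pool);
}

Any sub-page fragmentation then happens in the driver after DMA completion, once the real length is known; the page_pool_fragment_page() helper exists for that kind of driver-side accounting.

[...]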
What I was getting at is that if you are going to add an API, you have to have a consumer for that API. That is rule #1 for kernel API development: you don't add an API without a consumer for it. The changes you are making are meant to support some future implementation, and I see them breaking most of the existing implementation. That is my concern.
You have mentioned veth as the use-case. I know I acked adding the page_pool use-case to veth, for when we need to convert an SKB into an xdp_buff/xdp-frame, but maybe it was the wrong hammer(?). In this veth case, the size is known at page allocation time. Thus, using the page_pool API is wasting memory. We did this for performance reasons, but we are not using PP for what it was intended for. We mostly use page_pool because it provides an existing recycle return path, and we were too lazy to add another alloc-type (see enum xdp_mem_type).

Maybe you/we can extend veth to use this dynamic-size API, to show us that this API is a better approach. I will sign up for benchmarking this (and coordinating with Maryam, CC'ed, as she came with the use-case we improved on).
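To make the veth point concrete, a rough sketch (not the actual veth code) of the pattern described above could look like this: the length is already known when the page is allocated, and the main thing page_pool buys us is the MEM_TYPE_PAGE_POOL recycle return path from enum xdp_mem_type, registered via xdp_rxq_info_reg_mem_model(). All demo_* names are invented for illustration:

#include <linux/mm.h>
#include <net/page_pool.h>
#include <net/xdp.h>

/* Hypothetical per-queue state, for illustration only. */
struct demo_rq {
	struct xdp_rxq_info xdp_rxq;  /* assumed already xdp_rxq_info_reg()'ed */
	struct page_pool *pool;
};

static int demo_reg_mem_model(struct demo_rq *rq)
{
	/* This registration is what makes xdp_return_frame() feed pages
	 * back into the pool -- the "existing recycle return path".
	 * Avoiding page_pool here would mean adding another entry to
	 * enum xdp_mem_type plus a matching return path.
	 */
	return xdp_rxq_info_reg_mem_model(&rq->xdp_rxq,
					  MEM_TYPE_PAGE_POOL, rq->pool);
}

static void *demo_skb_copy_buf(struct demo_rq *rq, unsigned int len)
{
	struct page *page;

	/* Unlike the physical-NIC case, 'len' is already known here,
	 * yet a full page is allocated regardless -- the memory waste
	 * mentioned above.
	 */
	if (len > PAGE_SIZE)
		return NULL;

	page = page_pool_dev_alloc_pages(rq->pool);
	return page ? page_address(page) : NULL;
}

--Jesper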