On Fri, Nov 1, 2024 at 11:16 AM Pavel Begunkov <asml.silence@xxxxxxxxx> wrote:
>
> On 11/1/24 17:33, Mina Almasry wrote:
> > On Tue, Oct 29, 2024 at 4:06 PM David Wei <dw@xxxxxxxxxxx> wrote:
> >>
> >> From: Pavel Begunkov <asml.silence@xxxxxxxxx>
> >>
> >> Add a helper that takes an array of pages and initialises the passed-in
> >> memory provider's area with them, where each net_iov takes one page.
> >> It's also responsible for setting up dma mappings.
> >>
> >> We keep it in page_pool.c so as not to leak netmem details to outside
> >> providers like io_uring, which don't have access to netmem_priv.h
> >> and other private helpers.
> >>
> >
> > I honestly prefer leaking netmem_priv.h details into io_uring
> > rather than having io_uring provider specific code in page_pool.c.
>
> Even though Jakub didn't comment on this patch, he definitely
> wasn't fond of giving all those headers to non net/ users. I guess
> I can't please everyone. One middle option is to make the page
> pool helper more granular, i.e. taking care of one netmem at
> a time and moving the loop to io_uring, but I don't think it
> changes anything.
>

My issue is that these:

+int page_pool_mp_init_paged_area(struct page_pool *pool,
+				 struct net_iov_area *area,
+				 struct page **pages);
+void page_pool_mp_release_area(struct page_pool *pool,

are masquerading as generic functions to be used by many memory
providers, but they're really io_uring specific. dmabuf and the
hugepage provider would not use them AFAICT. I would have liked not
to have code specific to one provider in page_pool.c, and I was asked
to move the dmabuf-specific functions to another file too, IIRC.

These helpers depend on:

page_pool_set_pp_info: in netmem_priv.h
net_iov_to_netmem(niov): in netmem.h
page_pool_dma_map_page: can be put in page_pool/helpers.h?
page_pool_release_page_dma(pool, netmem): can be put in page_pool/helpers.h?

I would prefer moving page_pool_set_pp_info (and the stuff it calls
into) to netmem.h and removing io_uring-specific memory provider code
from page_pool.c.

> ...
> >>   #include <linux/dma-direction.h>
> >> @@ -459,7 +460,8 @@ page_pool_dma_sync_for_device(const struct page_pool *pool,
> >>   		__page_pool_dma_sync_for_device(pool, netmem, dma_sync_size);
> >>   }
> >>
> >> -static bool page_pool_dma_map(struct page_pool *pool, netmem_ref netmem)
> >> +static bool page_pool_dma_map_page(struct page_pool *pool, netmem_ref netmem,
> >> +				   struct page *page)
> >
> > I have to say this is confusing to me. Passing in both the netmem and
> > the page is weird. The page is the one being mapped and the
> > netmem->dma_addr is the one being filled with the mapping.
>
> The page argument provides the mapping, the netmem gives the object
> where the mapping is set. netmem could be the same as the page
> argument, but I don't think it's inherently wrong, and it's an
> internal helper anyway. I could entirely copy-paste the function, but
> I don't think that's in any way an improvement.
>
> > Netmem is meant to be an abstraction over page. Passing both makes
> > little sense to me. The reason you're doing this is because the
> > io_uring memory provider is in a bit of a weird design IMO where the
> > memory comes in pages but it doesn't want to use page-backed netmem.
>
> Mina, as I explained before, I view it rather as an abstraction
> that helps with finer-grained control over memory, and extending
> it this way I don't think is such a stretch; it doesn't
> change much for the networking stack overall. Not fitting into
> the devmem TCP category doesn't make it weird.
>
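Just to make sure we're reacting to the same thing, the shape of the
helper as I read it is roughly the below. This is paraphrased from
memory rather than copied from the patch, so the
page_pool_set_dma_addr_netmem call and the elided failure path on
that store are my reconstruction and may not match what was posted;
the only point is that the page argument is what actually gets
mapped, while the netmem argument is just where the resulting
dma_addr lands.

/* Rough sketch of the helper under discussion, paraphrased from
 * memory and not the patch as posted; the failure path after the
 * dma_addr store is elided.
 */
static bool page_pool_dma_map_page(struct page_pool *pool, netmem_ref netmem,
				   struct page *page)
{
	dma_addr_t dma;

	/* The caller-supplied page is what actually gets mapped... */
	dma = dma_map_page_attrs(pool->p.dev, page, 0,
				 PAGE_SIZE << pool->p.order, pool->p.dma_dir,
				 DMA_ATTR_SKIP_CPU_SYNC |
				 DMA_ATTR_WEAK_ORDERING);
	if (dma_mapping_error(pool->p.dev, dma))
		return false;

	/* ...while the netmem (a net_iov for the io_uring provider) is
	 * only where the resulting dma_addr is stored. netmem != page
	 * is the part I find confusing.
	 */
	page_pool_set_dma_addr_netmem(netmem, dma);
	page_pool_dma_sync_for_device(pool, netmem, pool->p.max_len);
	return true;
}

When the netmem is a net_iov mirroring a user page like this, that
page/netmem split is exactly the kind of provider-specific detail I'd
rather keep out of page_pool.c.
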
> > Instead it uses net_iov-backed netmem and there is an out-of-band page
> > to be managed.
> >
> > I think it would make sense to use page-backed netmem for your use
> > case, or at least I don't see why it wouldn't work. Memory providers
>
> It's a user page, so we can't make assumptions about it: we can't
> reuse space in struct page like for pp refcounting (unlike when
> it's allocated by the kernel), and we can't use the normal page
> refcounting.
>

You actually can't reuse space in struct net_iov for your own
refcounting either, because niov->pp_ref_count mirrors
page->pp_ref_count and strictly follows its patterns. But that's an
issue to be discussed on the other patch...

> If that's the direction people prefer, we can roll back to v1 from
> a couple of years ago: fill skbs with user pages, attach a ubuf_info
> to every skb, and whack-a-mole all the places where the page could be
> put down or such, pretty similarly to what net_iov does. Honestly, I
> thought that reusing common infra so that the net stack doesn't
> need a different path per interface was a good idea.
>

The common infra does support page-backed netmem, actually.

> > were designed to handle the hugepage use case where the memory
> > allocated by the provider is pages. Is there a reason that doesn't
> > work for you as well?
> >
> > If you really need to use net_iov-backed netmem, can we put this
> > weirdness in the provider? I don't know that we want a generic-looking
> > dma_map function that is a bit confusing in taking both a netmem and
> > a page.
>
> ...
> >> +
> >> +static void page_pool_release_page_dma(struct page_pool *pool,
> >> +				       netmem_ref netmem)
> >> +{
> >> +	__page_pool_release_page_dma(pool, netmem);
> >> +}
> >> +
> >
> > Is this wrapper necessary? Do you want to rename the original function
> > to remove the __ instead of adding a wrapper?
>
> I only added it here to cast away __always_inline since it's used in
> a slow / setup path. It shouldn't change the binary, but I'm not a huge
> fan of leaving the hint in code where it's not needed.
>

Right, it probably makes sense to make the modifications you want on
the original function rather than creating a no-op wrapper just to
remove the __always_inline.

--
Thanks,
Mina