On Tue, Jul 11, 2023 at 6:11 AM Jason Gunthorpe <jgg@xxxxxxxx> wrote:
>
> On Mon, Jul 10, 2023 at 05:45:05PM -0700, Mina Almasry wrote:
>
> > > At least from my position I want to see MEMORY_DEVICE_PCI_P2PDMA used
> > > to represent P2P memory.
> >
> > Would using p2pdma API instead of dmabuf be an acceptable direction?
>
> "p2pdma API" is really just using MEMORY_DEVICE_PCI_P2PDMA and
> teaching the pagepool how to work with ZONE_DEVICE pages.
>

Yes, that's what I have in mind. Roughly something along these lines:
the device memory provider (GPU or whatnot) calls something like
pci_p2pdma_add_resource(), the NIC client driver allocates pages from
that resource, and if is_pci_p2pdma_page() is true for a page, the
driver uses pci_p2pdma_map_sg() instead of dma_map_sg().

> I suspect this will clash badly with Matthew's work here:
>
> https://lore.kernel.org/all/20230111042214.907030-1-willy@xxxxxxxxxxxxx/
>
> As from a mm side we haven't ever considered that ZONE_DEVICE and
> "netmem" can be composed together. The entire point of netmem like
> stuff is that the allocator hands over the majority of struct page to
> the allocatee, and ZONE_DEVICE can't work like that.
>
> However, assuming that can be solved in some agreeable way then it
> would be OK to go down this path.
>

I think there is potential to solve this in an agreeable way. We may be
able to get netmem ZONE_DEVICE pages using the memdesc idea you
proposed, or something like the xarray metadata you mention below. If
not that, I think Jakub already said that he is considering coming up
with a 'page pool like' API that drivers can use, with an
implementation that is compatible with ZONE_DEVICE pages.

> But, I feel like this is just overall too hard a direction from the mm
> perspective.
>
> I don't know anything about page pool, but the main sticking point is
> its reliance on struct page. If it can find another way to locate its
> meta data (eg an xarray), at least for some cases, it would make
> things a lot easier.
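
To make the "metadata outside struct page" idea a bit more concrete,
here is a rough userspace model of what I understand it to mean. This is
not kernel code and not a proposed patch: every name in it is made up
for illustration, and a small fixed-size hash table stands in for an
xarray. The only point is that the pool's per-page state (DMA address,
reference count) is found by looking up the PFN in a side table, so
nothing has to be stored in struct page itself.

```c
/* Userspace model only. All names are hypothetical, not kernel APIs. */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define META_SLOTS 64u	/* power of two; fixed size keeps the sketch short */

/* Per-page state the pool would otherwise keep inside struct page. */
struct pp_meta {
	uint64_t pfn;		/* lookup key: page frame number */
	uint64_t dma_addr;	/* DMA address recorded at map time */
	int refcnt;		/* references handed out by the pool */
	int used;		/* slot occupied? */
};

/* Stand-in for an xarray indexed by PFN. Zero-initialise before use. */
struct pp_meta_table {
	struct pp_meta slot[META_SLOTS];
};

static size_t meta_hash(uint64_t pfn)
{
	uint64_t h = pfn * 0x9E3779B97F4A7C15ull;	/* multiplicative hash */

	return (size_t)(h >> 32) & (META_SLOTS - 1);
}

/*
 * Find the entry for @pfn using linear probing. With @create set, an
 * empty slot is claimed when the key is absent. Entries are never
 * removed in this sketch, so no tombstones are needed. Returns NULL if
 * the key is absent (and !create) or the table is full.
 */
static struct pp_meta *meta_find(struct pp_meta_table *t, uint64_t pfn,
				 int create)
{
	size_t i = meta_hash(pfn);

	for (size_t n = 0; n < META_SLOTS; n++) {
		struct pp_meta *m = &t->slot[i];

		if (m->used && m->pfn == pfn)
			return m;
		if (!m->used) {
			if (!create)
				return NULL;
			m->used = 1;
			m->pfn = pfn;
			m->dma_addr = 0;
			m->refcnt = 0;
			return m;
		}
		i = (i + 1) & (META_SLOTS - 1);
	}
	return NULL;
}

/* Hand out a page: record its DMA address and take a reference. */
static int pool_page_get(struct pp_meta_table *t, uint64_t pfn,
			 uint64_t dma_addr)
{
	struct pp_meta *m = meta_find(t, pfn, 1);

	assert(m != NULL);	/* table full: out of scope for the sketch */
	m->dma_addr = dma_addr;
	return ++m->refcnt;
}

/* Return a page: drop a reference and report how many remain. */
static int pool_page_put(struct pp_meta_table *t, uint64_t pfn)
{
	struct pp_meta *m = meta_find(t, pfn, 0);

	assert(m != NULL && m->refcnt > 0);
	return --m->refcnt;
}
```

A real version would of course need locking (or RCU), removal of
entries, and a decision about which pages go through the side table and
which keep using struct page, which is the "at least for some cases"
part.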

--
Thanks,
Mina