Jason Gunthorpe via Lsf-pc wrote:
> I would like to have a session at LSF to talk about Matthew's
> physr discussion starter:
>
>  https://lore.kernel.org/linux-mm/YdyKWeU0HTv8m7wD@xxxxxxxxxxxxxxxxxxxx/
>
> I have become interested in this with some immediacy because of
> IOMMUFD and this other discussion with Christoph:
>
>  https://lore.kernel.org/kvm/4-v2-472615b3877e+28f7-vfio_dma_buf_jgg@xxxxxxxxxx/

I think this is a worthwhile discussion. My main hangup with 'struct
page' elimination in general is that if anything needs to be allocated
to describe a physical address for other parts of the kernel to operate
on it, why not a 'struct page'? There are of course several difficulties
allocating a 'struct page' array, but I look at subsection support and
the tail page space optimization work as evidence that some of the pain
can be mitigated. What more needs to be done? I also think this is a
somewhat separate consideration from replacing a bio_vec with phyr,
which has value independent of the mechanism used to manage
phys_addr_t => dma_addr_t.

> Which results in, more or less, we have no way to do P2P DMA
> operations without struct page - and from the RDMA side solving this
> well at the DMA API means advancing at least some part of the physr
> idea.
>
> So - my objective is to enable the DMA API to "DMA map" something that
> is not a scatterlist, may or may not contain struct pages, but can
> still contain P2P DMA data. From there I would move RDMA MR's to use
> this new API, modify DMABUF to export it, complete the above VFIO
> series, and finally, use all of this to add back P2P support to VFIO
> when working with IOMMUFD by allowing IOMMUFD to obtain a safe
> reference to the VFIO memory using DMABUF. From there we'd want to see
> pin_user_pages optimized, and that also will need some discussion how
> best to structure it.
>
> I also have several ideas on how something like physr can optimize the
> iommu driver ops when working with dma-iommu.c and IOMMUFD.
>
> I've been working on an implementation and hope to have something
> draft to show on the lists in a few weeks. It is pretty clear there
> are several interesting decisions to make that I think will benefit
> from a live discussion.
>
> Providing a kernel-wide alternative to scatterlist is something that
> has general interest across all the driver subsystems. I've started to
> view the general problem rather like xarray where the main focus is to
> create the appropriate abstraction and then go about transforming
> users to take advantage of the cleaner abstraction. scatterlist
> suffers here because it has an incredibly leaky API, a huge number of
> (often sketchy driver) users, and has historically been very difficult
> to improve.

When I read "general interest across all the driver subsystems" it is
hard not to ask "have all possible avenues to enable 'struct page' been
exhausted?"

> The session would quickly go over the current state of whatever the
> mailing list discussion evolves into and an open discussion around the
> different ideas.

Sounds good to me.