Hi Wei Lin,

> Subject: Re: [PATCH 0/4] cover-letter: Allow MMIO regions to be exported
> through dmabuf
> >>
> >> From: Wei Lin Guay <wguay@xxxxxxxx>
> >>
> >> This is another attempt to revive the patches posted by Jason
> >> Gunthorpe and Vivek Kasireddy, at
> >> https://patchwork.kernel.org/project/linux-media/cover/0-v2-472615b3877e+28f7-vfio_dma_buf_jgg@xxxxxxxxxx/
>
> https://lwn.net/Articles/970751/
> > v2: https://lore.kernel.org/dri-devel/20240624065552.1572580-1-vivek.kasireddy@xxxxxxxxx/
> > addresses review comments from Alex and Jason and also includes the ability
> > to create the dmabuf from multiple ranges. This is really needed to
> > future-proof the feature.
> The good thing is that your patch series addressed Christian’s concerns of
> adding dma_buf_try_get().
No, it did not address his concern. In v2, instead of adding a new function,
we are now calling get_file_rcu() directly, which was suggested by Christian
in the original review.

>
> However, several questions regarding your v2 patch:
> - The dma-buf still support redundant mmap handler? (One of review
> comments from v1?)
Yeah, the mmap handler is really needed as a debugging tool given that the
importer would not be able to provide access to the dmabuf's underlying
memory via the CPU in any other way. (A rough userspace sketch of this is
appended at the end of this mail.)

> - Rather than handle different regions within a single dma-buf, would
> vfio-user open multiple distinct file descriptors work?
> For our specific use case, we don't require multiple regions and prefer
> Jason’s original patch.
Restricting the dmabuf to a single region (or having to create multiple
dmabufs to represent multiple regions/ranges associated with a single
scattered buffer) would not be feasible or ideal in all cases. For instance,
in my use case, I am sharing a large framebuffer (FB) located in the GPU's
VRAM, and allocating a large FB contiguously (nr_ranges = 1) in VRAM is not
possible when there is memory pressure.

Furthermore, since we are adding a new UAPI with this patch/feature, we
cannot go back and tweak it (to add support for nr_ranges > 1) should there
be a need in the future, but you can always use nr_ranges = 1 anytime. That
is why it makes sense to be flexible in terms of the number of
ranges/regions.

>
> >
> > Also, my understanding is that this patchset cannot proceed until Leon's
> > series is merged:
> > https://lore.kernel.org/kvm/cover.1733398913.git.leon@xxxxxxxxxx/
>
> Thanks for the pointer.
> I will rebase my local patch series on top of that.
AFAIK, Leon's work includes new mechanisms/APIs to do P2P DMA, which we
should be using in this patchset. And I think he is also planning to use the
new APIs to augment dmabuf usage and not be forced to use the scatter-gather
list, particularly in cases where the underlying memory is not backed by
struct page.

I was just waiting for all of this to happen before posting a v3.

Thanks,
Vivek

>
> Thanks,
> Wei Lin
>
> >
> >
> > Thanks,
> > Vivek
> >
> >>
> >> In addition to the initial proposal by Jason, another promising
> >> application is exposing memory from an AI accelerator (bound to VFIO)
> >> to an RDMA device. This would allow the RDMA device to directly access
> >> the accelerator's memory, thereby facilitating direct data
> >> transactions between the RDMA device and the accelerator.
> >>
> >> Below is the text/motivation from the original cover letter.
> >>
> >> dma-buf has become a way to safely acquire a handle to non-struct page
> >> memory that can still have lifetime controlled by the exporter. Notably
> >> RDMA can now import dma-buf FDs and build them into MRs which allows for
> >> PCI P2P operations. Extend this to allow vfio-pci to export MMIO memory
> >> from PCI device BARs.
> >>
> >> This series supports a use case for SPDK where a NVMe device will be owned
> >> by SPDK through VFIO but interacting with a RDMA device. The RDMA device
> >> may directly access the NVMe CMB or directly manipulate the NVMe device's
> >> doorbell using PCI P2P.
> >>
> >> However, as a general mechanism, it can support many other scenarios with
> >> VFIO. I imagine this dmabuf approach to be usable by iommufd as well for
> >> generic and safe P2P mappings.
> >>
> >> This series goes after the "Break up ioctl dispatch functions to one
> >> function per ioctl" series.
> >>
> >> v2:
> >>  - Name the new file dma_buf.c
> >>  - Restore orig_nents before freeing
> >>  - Fix reversed logic around priv->revoked
> >>  - Set priv->index
> >>  - Rebased on v2 "Break up ioctl dispatch functions"
> >> v1: https://lore.kernel.org/r/0-v1-9e6e1739ed95+5fa-vfio_dma_buf_jgg@xxxxxxxxxx
> >> Cc: linux-rdma@xxxxxxxxxxxxxxx
> >> Cc: Oded Gabbay <ogabbay@xxxxxxxxxx>
> >> Cc: Christian König <christian.koenig@xxxxxxx>
> >> Cc: Daniel Vetter <daniel.vetter@xxxxxxxx>
> >> Cc: Leon Romanovsky <leon@xxxxxxxxxx>
> >> Cc: Maor Gottlieb <maorg@xxxxxxxxxx>
> >> Cc: dri-devel@xxxxxxxxxxxxxxxxxxxxx
> >> Signed-off-by: Jason Gunthorpe <jgg@xxxxxxxxxx>
> >>
> >> Jason Gunthorpe (3):
> >>   vfio: Add vfio_device_get()
> >>   dma-buf: Add dma_buf_try_get()
> >>   vfio/pci: Allow MMIO regions to be exported through dma-buf
> >>
> >> Wei Lin Guay (1):
> >>   vfio/pci: Allow export dmabuf without move_notify from importer
> >>
> >>  drivers/vfio/pci/Makefile          |   1 +
> >>  drivers/vfio/pci/dma_buf.c         | 291 +++++++++++++++++++++++++++++
> >>  drivers/vfio/pci/vfio_pci_config.c |   8 +-
> >>  drivers/vfio/pci/vfio_pci_core.c   |  44 ++++-
> >>  drivers/vfio/pci/vfio_pci_priv.h   |  30 +++
> >>  drivers/vfio/vfio_main.c           |   1 +
> >>  include/linux/dma-buf.h            |  13 ++
> >>  include/linux/vfio.h               |   6 +
> >>  include/linux/vfio_pci_core.h      |   1 +
> >>  include/uapi/linux/vfio.h          |  18 ++
> >>  10 files changed, 405 insertions(+), 8 deletions(-)
> >>  create mode 100644 drivers/vfio/pci/dma_buf.c
> >>
> >> --
> >> 2.43.5
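
For context on the RDMA side mentioned in the quoted cover letter ("RDMA can
now import dma-buf FDs and build them into MRs"): below is a minimal sketch of
how an importer could register such an MR with libibverbs' ibv_reg_dmabuf_mr().
It is illustrative only and not code from this series; the protection domain,
dmabuf fd and length are assumed to come from the application, the fd being
whatever the exporter (e.g. the proposed vfio-pci dmabuf feature) handed out.

#include <stddef.h>
#include <infiniband/verbs.h>

/*
 * Sketch only: register a dmabuf-backed MR. "dmabuf_fd" is assumed to be
 * the fd obtained from the exporter; pd, len and the access flags are
 * illustrative choices made by the caller.
 */
static struct ibv_mr *register_dmabuf_mr(struct ibv_pd *pd, int dmabuf_fd,
					  size_t len)
{
	/* offset 0 into the dmabuf and iova 0: simplest choice for one buffer */
	return ibv_reg_dmabuf_mr(pd, 0, len, 0, dmabuf_fd,
				 IBV_ACCESS_LOCAL_WRITE |
				 IBV_ACCESS_REMOTE_READ |
				 IBV_ACCESS_REMOTE_WRITE);
}

Once registered, the MR's lkey/rkey can be used in work requests like any other
MR, which is what lets the RDMA device reach the exported BAR directly over PCI
P2P instead of bouncing through system memory.

Similarly, the mmap-as-a-debugging-tool point can be illustrated from
userspace: if the exporter implements the mmap handler, the dmabuf fd can be
mapped and CPU accesses bracketed with DMA_BUF_IOCTL_SYNC. Again a rough
sketch under that assumption, not code from this series:

#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <linux/dma-buf.h>

/*
 * Sketch only: CPU peek at the first n bytes of a dmabuf for debugging.
 * Assumes the exporter provides an mmap handler; dmabuf_fd and len come
 * from the caller.
 */
static int dump_dmabuf_head(int dmabuf_fd, size_t len, uint8_t *out, size_t n)
{
	struct dma_buf_sync sync = { .flags = DMA_BUF_SYNC_START | DMA_BUF_SYNC_READ };
	void *map = mmap(NULL, len, PROT_READ, MAP_SHARED, dmabuf_fd, 0);

	if (map == MAP_FAILED)
		return -1;

	ioctl(dmabuf_fd, DMA_BUF_IOCTL_SYNC, &sync);	/* begin CPU access */
	memcpy(out, map, n);
	sync.flags = DMA_BUF_SYNC_END | DMA_BUF_SYNC_READ;
	ioctl(dmabuf_fd, DMA_BUF_IOCTL_SYNC, &sync);	/* end CPU access */

	munmap(map, len);
	return 0;
}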