On Fri, Sep 09, 2022 at 06:24:35AM -0700, Christoph Hellwig wrote:
> On Wed, Sep 07, 2022 at 01:12:52PM -0300, Jason Gunthorpe wrote:
> > The PCI offset is some embedded thing - I've never seen it in a server
> > platform.
>
> That's not actually true, e.g. some power system definitively had it,
> although I don't know if the current ones do.

I thought those were all power embedded systems.

> There is a reason why we have these proper APIs and no one has any
> business bypassing them.

Yes, we should try to support these things, but you said this patch
didn't work and wasn't tested - that is not true at all.

And it isn't like we have APIs just sitting here to solve this
specific problem. So let's make something.

> > So, would you be OK with this series if I try to make a dma_map_p2p()
> > that resolves the offset issue?
>
> Well, if it also solves the other issue of invalid scatterlists leaking
> outside of drm we can think about it.

The scatterlist stuff has already leaked outside of DRM anyhow.

Again, I think it is very problematic to let DRM get away with things
and then insist all the poor non-DRM people be responsible to clean up
their mess.

I'm skeptical I can fix AMD GPU, but I can try to create a DMABUF op
that returns something that is not a scatterlist and teach RDMA to use
it. So at least the VFIO/RDMA part can avoid the scatterlist abuse. I
expected to need non-scatterlist for iommufd anyhow.

Coupled with a series to add some dma_map_resource_pci() that handles
the PCI_P2PDMA_MAP_BUS_ADDR and the PCI offset, would it be an
agreeable direction?

> Take a look at iommu_dma_map_sg and pci_p2pdma_map_segment to see how
> this is handled.

So there is a bug in all these DMABUF implementations: they ignore
the PCI_P2PDMA_MAP_BUS_ADDR "distance type".

This isn't a real-world problem for VFIO because VFIO is largely
incompatible with the non-ACS configuration that would trigger
PCI_P2PDMA_MAP_BUS_ADDR, which explains why we never saw any problem.
All our systems have ACS turned on so we can use VFIO.

I'm unclear how Habana or AMD have avoided a problem here.

This is much more serious than the PCI offset in my mind.

Thanks,
Jason