On Thu, Jan 09, 2025 at 09:09:46AM +0100, Christian König wrote:
> Answering on my reply once more as pure text mail.

It is hard to do anything with your HTML mails :\

> > Well you were also the person who mangled the struct page pointers in
> > the scatterlist because people were abusing this and getting a bloody
> > nose :)

But a lot of this is because scatterlist is too limited, you actually
can't correctly describe anything except struct page backed CPU memory
in a scatterlist.

As soon as we can correctly describe everything in a data structure
these issues go away - or at least turn into a compatibility
exchange problem.

> > > > Where I do agree with Christian is that stuffing pfn support into the
> > > > dma_buf_attachment interfaces feels a bit much wrong.
> > > So it could a dmabuf interface like mmap/vmap()? I was also wondering
> > > about that. But finally I start to use dma_buf_attachment interface
> > > because of leveraging existing buffer pin and move_notify.
> >
> > Exactly that's the point, sharing pfn doesn't work with the pin and
> > move_notify interfaces because of the MMU notifier approach Sima
> > mentioned.

Huh?

mmu notifiers are for tracking changes to VMAs.

pin/move_notify are for tracking changes to the underlying memory of a
DMABUF.

How does sharing the PFN vs the DMA address affect the pin/move_notify
lifetime rules at all?

> > > > > > > 3) Importing devices need to know if they are working with PCI P2P
> > > > > > > addresses during mapping because they need to do things like turn on
> > > > > > > ATS on their DMA. As for multi-path we have the same hacks inside mlx5
> > > > > > > today that assume DMABUFs are always P2P because we cannot determine
> > > > > > > if things are P2P or not after being DMA mapped.
> > > > > > Why would you need ATS on PCI P2P and not for system memory accesses?
> > > > > ATS has a significant performance cost. It is mandatory for PCI P2P,
> > > > > but ideally should be avoided for CPU memory.
> > > > Huh, I didn't know that. And yeah kinda means we've butchered the pci p2p
> > > > stuff a bit I guess ...
> >
> > Hui? Why should ATS be mandatory for PCI P2P?

I should say "mandatory on some configurations".

If you need the iommu turned on, and you have a PCI switch in your
path, then ATS allows you to have full P2P bandwidth and retain full
IOMMU security.

> > We have tons of production systems using PCI P2P without ATS. And it's
> > the first time I hear that.

It is situational and topologically dependent. We have a very large
number of deployed systems now that rely on ATS for PCI P2P.

> > As Sima explained you either have follow_pfn() and mmu_notifier or you
> > have DMA addresses and dma_resv lock / dma_fence.
> >
> > Just giving out PFNs without some lifetime associated with them is one
> > of the major problems we faced before and really not something you can
> > do.

Certainly I never imagined there would be no lifetime. I expect
anything coming out of the dmabuf interface to use the dma_resv lock,
fence and move_notify for lifetime management, regardless of how the
target memory is described.

> > > > separate access mechanism just for that. It would be the 5th or so (kernel
> > > > vmap, userspace mmap, dma_buf_attach and driver private stuff like
> > > > virtio_dma_buf.c where you access your buffer with a uuid), so really not
> > > > a big deal.
> > > OK, will think more about that.
> >
> > Please note that we have follow_pfn() + mmu_notifier working for KVM/XEN
> > with MMIO mappings and P2P. And that required exactly zero DMA-buf
> > changes :)
> > I don't fully understand your use case, but I think it's quite likely
> > that we already have that working.

In Intel CC systems you cannot mmap secure memory or the system will
take a machine check.

You have to convey secure memory inside an FD entirely within the
kernel, so that only an importer that understands how to handle secure
memory (such as KVM) ever uses it. That is what avoids the machine
check.

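To make the "only an importer that understands it" requirement a bit
more concrete, here is a rough userspace sketch of the kind of check I
mean. It is only an illustration under assumptions: nothing here is an
existing kernel interface, and mem_kind, importer_caps, CAP() and
try_attach() are all names invented for this mail. The exporter tags
the memory with a kind, the importer declares which kinds it can
handle, and attach fails cleanly on a mismatch instead of handing out
something the importer would misuse.

```c
#include <errno.h>

/* Hypothetical memory kinds an exporter could describe (not a kernel API). */
enum mem_kind {
	MEM_KIND_SYSTEM,	/* struct page backed CPU memory */
	MEM_KIND_PCI_P2P,	/* MMIO behind a PCI device */
	MEM_KIND_SECURE,	/* CC secure memory, must never be mmap'd */
};

/* Turn a memory kind into a capability bit. */
#define CAP(kind) (1u << (kind))

/* Hypothetical: bitmask of the memory kinds an importer can handle. */
struct importer_caps {
	unsigned int handles;
};

/*
 * Return 0 if the importer declared support for this kind of memory,
 * or -EOPNOTSUPP so the attach fails before any address is handed out.
 */
static int try_attach(const struct importer_caps *imp, enum mem_kind kind)
{
	if (!(imp->handles & CAP(kind)))
		return -EOPNOTSUPP;
	return 0;
}
```

With this, an importer that sets only CAP(MEM_KIND_SYSTEM) and
CAP(MEM_KIND_PCI_P2P) gets -EOPNOTSUPP for MEM_KIND_SECURE, while a
KVM-like importer that sets CAP(MEM_KIND_SECURE) is allowed through.
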
The patch series here should be thought of as the first part of this,
allowing PFNs to flow without VMAs. IMHO the second part of preventing
machine checks is not complete.

In the approach I have been talking about the secure memory would be
represented by a p2p_provider structure that is incompatible with
everything else. For instance importers that can only do DMA would
simply cleanly fail when presented with this memory.

Jason