On Wed, Jan 08, 2025 at 07:44:54PM +0100, Simona Vetter wrote:
> On Wed, Jan 08, 2025 at 12:22:27PM -0400, Jason Gunthorpe wrote:
> > On Wed, Jan 08, 2025 at 04:25:54PM +0100, Christian König wrote:
> > > Am 08.01.25 um 15:58 schrieb Jason Gunthorpe:
> > > > I have imagined a staged approach where DMABUF gets a new API that
> > > > works with the new DMA API to do importer mapping with "P2P source
> > > > information" and a gradual conversion.
> > >
> > > To make it clear as maintainer of that subsystem I would reject such a step
> > > with all I have.
> >
> > This is unexpected, so you want to just leave dmabuf broken? Do you
> > have any plan to fix it, to fix the misuse of the DMA API, and all
> > the problems I listed below? This is a big deal, it is causing real
> > problems today.
> >
> > If it's going to be like this I think we will stop trying to use dmabuf
> > and do something simpler for vfio/kvm/iommufd :(
>
> As the gal who helped edit the og dma-buf spec 13 years ago, I think adding
> pfn isn't a terrible idea. By design, dma-buf is the "everything is
> optional" interface. And in the beginning, even consistent locking was
> optional, but we've managed to fix that by now :-/
>
> Where I do agree with Christian is that stuffing pfn support into the
> dma_buf_attachment interfaces feels a bit wrong.

So could it be a dmabuf interface like mmap/vmap()? I was also wondering
about that. But I finally went with the dma_buf_attachment interface
because it lets me leverage the existing buffer pinning and move_notify.

> > > We have already gone down that road and it didn't work at all and
> > > was a really big pain to pull people back from it.
> >
> > Nobody has really seriously tried to improve the DMA API before, so I
> > don't think this is true at all.
>
> Aside, I really hope this finally happens!
>
> > > > 3) Importing devices need to know if they are working with PCI P2P
> > > > addresses during mapping because they need to do things like turn on
> > > > ATS on their DMA. As for multi-path we have the same hacks inside mlx5
> > > > today that assume DMABUFs are always P2P because we cannot determine
> > > > if things are P2P or not after being DMA mapped.
> > >
> > > Why would you need ATS on PCI P2P and not for system memory accesses?
> >
> > ATS has a significant performance cost. It is mandatory for PCI P2P,
> > but ideally should be avoided for CPU memory.
>
> Huh, I didn't know that. And yeah kinda means we've butchered the pci p2p
> stuff a bit I guess ...
>
> > > > 5) iommufd and kvm are both using CPU addresses without DMA. No
> > > > exporter mapping is possible
> > >
> > > We have customers using both KVM and XEN with DMA-buf, so I can clearly
> > > confirm that this isn't true.
> >
> > Today they are mmaping the dma-buf into a VMA and then using KVM's
> > follow_pfn() flow to extract the CPU pfn from the PTE. Any mmapable
> > dma-buf must have a CPU PFN.
> >
> > Here Xu implements basically the same path, except without the VMA
> > indirection, and it's suddenly not OK? Illogical.
>
> So the big difference is that for follow_pfn() you need mmu_notifier since
> the mmap might move around, whereas with pfn smashed into
> dma_buf_attachment you need dma_resv_lock rules, and the move_notify
> callback if you go dynamic.
>
> So I guess my first question is, which locking rules do you want here for
> pfn importers?

follow_pfn() is unwanted for private MMIO, so dma_resv_lock.
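To make that concrete, the importer flow I have in mind is roughly the
sketch below. dma_buf_get_pfn() is a made-up name standing in for
whatever pfn access mechanism we end up with; only dma_buf_dynamic_attach(),
move_notify and the dma_resv locking are existing dma-buf pieces:

#include <linux/dma-buf.h>
#include <linux/dma-resv.h>

static void importer_move_notify(struct dma_buf_attachment *attach)
{
	/*
	 * Called by the exporter with dmabuf->resv held: drop every pfn
	 * we handed out (e.g. zap it from the VM's stage-2 tables)
	 * before the exporter moves or revokes the backing MMIO.
	 */
}

static const struct dma_buf_attach_ops importer_ops = {
	.allow_peer2peer = true,
	.move_notify = importer_move_notify,
};

static int importer_map_pfn(struct dma_buf *dmabuf, struct device *dev,
			    unsigned long *pfn)
{
	struct dma_buf_attachment *attach;
	int ret;

	/*
	 * Today's attach API wants a struct device even though a pure
	 * pfn user has no DMA device; that mismatch is part of what is
	 * being discussed here.
	 */
	attach = dma_buf_dynamic_attach(dmabuf, dev, &importer_ops, NULL);
	if (IS_ERR(attach))
		return PTR_ERR(attach);

	dma_resv_lock(dmabuf->resv, NULL);
	ret = dma_buf_get_pfn(attach, pfn);	/* hypothetical interface */
	dma_resv_unlock(dmabuf->resv);

	return ret;
}

The point is that revocation is driven entirely by move_notify under
dmabuf->resv, not by an mmu_notifier on a CPU mapping that doesn't exist
for private MMIO.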
> If mmu notifiers is fine, then I think the current approach of follow_pfn
> should be ok. But if you instead want dma_resv_lock rules (or the cpu mmap
> somehow is an issue itself), then I think the clean design is to create a new

cpu mmap() is an issue; this series aims to eliminate userspace mapping
for private MMIO resources.

> separate access mechanism just for that. It would be the 5th or so (kernel
> vmap, userspace mmap, dma_buf_attach and driver private stuff like
> virtio_dma_buf.c where you access your buffer with a uuid), so really not
> a big deal.

OK, will think more about that.

Thanks,
Yilun

>
> And for non-contrived exporters we might be able to implement the other
> access methods in terms of the pfn method generically, so this wouldn't
> even be a terrible maintenance burden going forward. And meanwhile all the
> contrived exporters just keep working as-is.
>
> The other part is that cpu mmap is optional, and there's plenty of strange
> exporters who don't implement it. But you can dma map the attachment into
> plenty of devices. This tends to mostly be a thing on SoC devices with some
> very funky memory. But I guess you don't care about these use-cases, so
> should be ok.
>
> I couldn't come up with a good name for these pfn users, maybe
> dma_buf_pfn_attachment? This does _not_ have a struct device, but maybe
> some of these new p2p source specifiers (or a list of those which are
> allowed, no idea how this would need to fit into the new dma api).
>
> Cheers, Sima
> --
> Simona Vetter
> Software Engineer, Intel Corporation
> http://blog.ffwll.ch
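(Appending a strawman of how such a dma_buf_pfn_attachment could be
shaped; none of these names exist anywhere, it is only meant to
illustrate the "no struct device, dma_resv + move_notify rules" idea
above.)

#include <linux/dma-buf.h>
#include <linux/types.h>

/* Strawman only, nothing below exists in the kernel. */
struct dma_buf_pfn_attachment {
	struct dma_buf *dmabuf;
	struct list_head node;		/* on the dmabuf's attachment list */
	/* importer callback, called with dmabuf->resv held */
	void (*move_notify)(struct dma_buf_pfn_attachment *attach);
	void *importer_priv;
	/*
	 * Placeholder for the "P2P source information" (or the list of
	 * allowed peers); how this fits the new DMA API is the open
	 * question.
	 */
	void *p2p_source;
};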