On Mon, Oct 5, 2020 at 8:37 PM Jason Gunthorpe <jgg@xxxxxxxx> wrote: > > On Mon, Oct 05, 2020 at 08:16:33PM +0200, Daniel Vetter wrote: > > > > kvm is some similar hack added for P2P DMA, see commit > > > add6a0cd1c5ba51b201e1361b05a5df817083618. It might be protected by notifiers.. > > > > Yeah my thinking is that kvm (and I think also vfio, also seems to > > have mmu notifier nearby) are ok because of the mmu notiifer. Assuming > > that one works correctly. > > vfio doesn't have a notifier, Alex was looking to add a vfio private > scheme in the vma->private_data: > > https://lore.kernel.org/kvm/159017449210.18853.15037950701494323009.stgit@xxxxxxxxxx/ > > Guess it never happened. I was mislead by the mmu notifier in drivers/vfio/vfio.c. But looking closer, that's only used by some drivers, I guess to make sure their device pagetables are kept in sync with reality. And not to make sure the vfio pfn view is kept in sync with reality. This could get real nasty I think. > > > So, the answer really is that s390 and media need fixing, and this API > > > should go away (or become kvm specific) > > > > I'm still not clear how you want fo fix this, since your vma->dma_buf > > idea is kinda a decade long plan and so just not going to happen: > > Well, it doesn't mean we have to change every part of dma_buf to > participate in this. Just the bits media cares about. Or maybe it is > some higher level varient on top of dma_buf. > > Or don't use dma_buf for this, add a new object that just provides > refcounts and P2P DMA connection for IO pfn ranges.. So good news is, I dug some layers deeper in v4l, and there's only 2 users which do actually handle pfn and don't immediately convert to a pages array: - videbuf-dma-contig.c. Luckily videobuf 1 is deprecated since forever, so I think we might get away with either just breaking this, or at least tainting kernels and hiding it behind a nasty Kconfig. This only uses follow_pfn, which we need to keep anyway for vfio in the unsafe variant :-/ - videbuf2-vmalloc.c Digging through history this was added to support import of v4l buffers from drivers that needed contig memory. And way back before CMA, that meant carveout memory not backed by struct page *. That should now all have struct pages and be managed by CMA (since videbuf2-dma-contig.c just uses dma_alloc_coherent underneath), so I think we can just switch to pin_user_pages(FOLL_LONGTERM here too). iow I think I can outright delete the frame vector stuff. -Daniel -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch