On Thu, Jan 09, 2025 at 07:06:48AM +0800, Xu Yilun wrote:
> > So I guess my first question is, which locking rules do you want here for
> > pfn importers?
> >
> > follow_pfn() is unwanted for private MMIO, so dma_resv_lock.
> >
> > As Sima explained you either have follow_pfn() and mmu_notifier or you
> > have DMA addresses and dma_resv lock / dma_fence.
> >
> > Just giving out PFNs without some lifetime associated with them is one of
> > the major problems we faced before and really not something you can do.
>
> I'm trying to make exporter give out PFN with lifetime control via
> move_notify() in this series. May not be conceptually correct but seems
> possible.
>
> >
> >
> > If mmu notifiers is fine, then I think the current approach of follow_pfn
> > should be ok. But if you instead dma_resv_lock rules (or the cpu mmap
> > somehow is an issue itself), then I think the clean design is create a new
> >
> > cpu mmap() is an issue, this series is aimed to eliminate userspace
> > mapping for private MMIO resources.
> >
> > Why?
>
> OK, I can start from here.
>
> It is about the Secure guest, or CoCo VM. The memory and MMIOs assigned
> to this kind of guest is inaccessible to host itself, by leveraging HW
> encryption & access control technology (e.g. Intel TDX, AMD SEV-SNP ...).
> This is to protect the tenant data being stolen by CSP itself.
>
> The impact is when host accesses the encrypted region, bad things
> happen to system, e.g. memory corruption, MCE. Kernel is trying to
> mitigate most of the impact by alloc and assign user unmappable memory
> resources (private memory) to guest, which prevents userspace
> accidents. guest_memfd is the private memory provider that only allows
> for KVM to position the page/pfn by fd + offset and create secondary
> page table (EPT, NPT...), no host mapping, no VMA, no mmu_notifier. But
> the lifecycle of the private memory is still controlled by guest_memfd.
> When fallocate(fd, PUNCH_HOLE), the memory resource is revoked and KVM
> is notified to unmap corresponding EPT.
>
> The further thought is guest_memfd is also suitable for normal guest.
> It makes no sense VMM must build host mapping table before guest access.
>
> Now I'm trying to seek a similar way for private MMIO. A MMIO resource
> provider that is exported as an fd. It controls the lifecycle of the
> MMIO resource and notify KVM when revoked. dma-buf seems to be a good
> provider which have done most of the work, only need to extend the
> memory resource seeking by fd + offset.

So if I'm getting this right, what you need from a functional pov is a
dma_buf_tdx_mmap? Because due to tdx restrictions, the normal dma_buf_mmap
is not going to work I guess?

Also another thing that's a bit tricky is that kvm kinda has a 3rd dma-buf
memory model:
- permanently pinned dma-buf, they never move
- dynamic dma-buf, they move through ->move_notify and importers can remap
- revocable dma-buf, which thus far only exist for pci mmio resources

Since we're leaning even more on that 3rd model I'm wondering whether we
should make it something official. Because the existing dynamic importers
do very much assume that re-acquiring the memory after move_notify will
work. But for the revocable use-case the entire point is that it will
never work.

I feel like that's a concept we need to make explicit, so that dynamic
importers can reject such memory if necessary.

So yeah there's a bunch of tricky lifetime questions that need to be
sorted out with proper design I think, and the current "let's just use pfn
directly" proposal hides them all under the rug. I agree with Christian
that we need a bit more care here.
-Sima

>
> >
> > separate access mechanism just for that. It would be the 5th or so (kernel
> > vmap, userspace mmap, dma_buf_attach and driver private stuff like
> > virtio_dma_buf.c where you access your buffer with a uuid), so really not
> > a big deal.
> >
> > OK, will think more about that.
> >
> > Please note that we have follow_pfn() + mmu_notifier working for KVM/XEN
>
> follow_pfn & mmu_notifier won't work here, cause no VMA, no host mapping
> table.
>
> Thanks,
> Yilun
> > with MMIO mappings and P2P. And that required exactly zero DMA-buf changes
> > :)
> >
> > I don't fully understand your use case, but I think it's quite likely that
> > we already have that working.
> >
> > Regards,
> > Christian.

-- 
Simona Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch