On Thu, Jan 16, 2025 at 02:51:11PM -0800, James Houghton wrote:
> I guess this might not work if QEMU *needs* to use HugeTLB for
> whatever reason, but Google's hypervisor just needs 1G pages; it
> doesn't matter where they come from really.

I see now.  Yes, I suppose it works for QEMU too.

[...]

> > In that case, looks like userfaultfd can support CoCo on device emulations
> > by sticking with virtual-address traps like before, at least from that
> > specific POV.
>
> Yeah, I don't think the userfaultfd API needs to change to support
> gmem, because it's going to be using the VMAs / user mappings of gmem.

There are other things I'm still thinking about, regarding how the
notification could happen when CoCo is enabled, especially when there's
no vcpu context.

The first thing is any PV interface, and what's currently on my mind is
kvmclock.  I suppose that could work like untrusted DMAs, so that when
the hypervisor wants to read/update the clock struct, it accesses a
shared page and the guest copies it from/to a private page.  Or, if such
information turns out not to be sensitive data, the guest could directly
use a permanent shared page for this purpose (in which case it should
still be part of guest memory, so access to it can be trapped just like
other shared pages via userfaultfd; a rough sketch of that option is
appended below the sign-off).

The other thing is that after reading the SEV-TIO white paper, it looks
like it could become easy to implement page faults for trusted devices.
For example, the white paper says the host IOMMU will be responsible for
translating trusted devices' DMA into GPA/GVA.  I think that means KVM
would somehow share the secondary pgtable with the IOMMU, and probably
when a DMA hits a missing page it can easily generate a page fault
against the secondary page table.  However, this is a DMA op and it
definitely also doesn't have a vcpu context, so the question is how to
trap it.

So.. maybe (fd, offset) support might still be needed at some point,
which would be more future-proof (a purely hypothetical sketch of what
such a message could carry is also appended below).  But I don't think I
have a solid opinion yet.

-- 
Peter Xu
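
For the "permanent shared page" option above, a minimal guest-side
sketch, assuming a Linux guest that flips one page to shared (decrypted)
at boot and keeps the clock struct there.  The init hook and error
handling are illustrative only; this is not the existing kvmclock code:

/*
 * Minimal sketch: allocate one page, flip it to shared (decrypted) once,
 * and keep the pvclock struct there so the hypervisor can update it in
 * place.  The wrmsr that tells KVM about the GPA is omitted.
 */
#include <linux/init.h>
#include <linux/gfp.h>
#include <asm/set_memory.h>
#include <asm/pvclock.h>

static struct pvclock_vsyscall_time_info *clock_page;

static int __init kvmclock_shared_page_init(void)
{
	unsigned long page = get_zeroed_page(GFP_KERNEL);

	if (!page)
		return -ENOMEM;

	/* From now on the host can read/write this page directly. */
	if (set_memory_decrypted(page, 1)) {
		/*
		 * Deliberately leak the page: a real implementation must
		 * not free a page whose encryption state is unknown.
		 */
		return -EIO;
	}

	clock_page = (struct pvclock_vsyscall_time_info *)page;
	return 0;
}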
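
And the purely hypothetical (fd, offset) message layout, only to show
what a notification without a vcpu context (hence without a virtual
address) might need to carry; none of these fields exist in the current
userfaultfd uapi:

/* Hypothetical only -- not part of any existing uapi. */
#include <linux/types.h>

struct uffd_msg_memfd_fault {
	__u64	offset;	/* page-aligned offset into the guest_memfd */
	__u64	len;	/* length of the faulting access */
	__u32	memfd;	/* which guest_memfd the fault happened on */
	__u32	flags;	/* write fault, minor/major, ... */
};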