On Sun, Jan 26, 2025 at 11:34:29AM +0800, Xu Yilun wrote: > > Definitely not suggesting to install an invalid pointer anywhere. The > > mapped pointer will still be valid for gmem for example, but the fault > > isn't. We need to differenciate two things (1) virtual address mapping, > > then (2) permission and accesses on the folios / pages of the mapping. > > Here I think it's okay if the host pointer is correctly mapped. > > > > For your private MMIO use case, my question is if there's no host pointer > > to be mapped anyway, then what's the benefit to make the MR to be ram=on? > > Can we simply make it a normal IO memory region? The only benefit of a > > The guest access to normal IO memory region would be emulated by QEMU, > while private assigned MMIO requires guest direct access via Secure EPT. > > Seems the existing code doesn't support guest direct access if > mr->ram == false: Ah it's about this, ok. I am not sure what's the best approach, but IMHO it's still better we stick with host pointer always available when ram=on. OTOH, VFIO private regions may be able to provide a special mark somewhere, just like when romd_mode was done previously as below (qemu 235e8982ad39), so that KVM should still apply these MRs even if they're not RAM. > > static void kvm_set_phys_mem(KVMMemoryListener *kml, > MemoryRegionSection *section, bool add) > { > [...] > > if (!memory_region_is_ram(mr)) { > if (writable || !kvm_readonly_mem_allowed) { > return; > } else if (!mr->romd_mode) { > /* If the memory device is not in romd_mode, then we actually want > * to remove the kvm memory slot so all accesses will trap. */ > add = false; > } > } > > [...] > > /* register the new slot */ > do { > > [...] > > err = kvm_set_user_memory_region(kml, mem, true); > } > } > > > ram=on MR is, IMHO, being able to be accessed as RAM-like. If there's no > > host pointer at all, I don't yet understand how that helps private MMIO > > from working. > > I expect private MMIO not accessible from host, but accessible from > guest so has kvm_userspace_memory_region2 set. That means the resolving > of its PFN during EPT fault cannot depends on host pointer. > > https://lore.kernel.org/all/20250107142719.179636-1-yilun.xu@xxxxxxxxxxxxxxx/ I'll leave this to KVM experts, but I actually didn't follow exactly on why mmu notifier is an issue to make , as I thought that was per-mm anyway, and KVM should logically be able to skip all VFIO private MMIO regions if affected. This is a comment to this part of your commit message: Rely on userspace mapping also means private MMIO mapping should follow userspace mapping change via mmu_notifier. This conflicts with the current design that mmu_notifier never impacts private mapping. It also makes no sense to support mmu_notifier just for private MMIO, private MMIO mapping should be fixed when CoCo-VM accepts the private MMIO, any following mapping change without guest permission should be invalid. So I don't yet see a hard-no of reusing userspace mapping even if they're not faultable as of now - what if they can be faultable in the future? I am not sure.. OTOH, I also don't think we need KVM_SET_USER_MEMORY_REGION3 anyway.. The _REGION2 API is already smart enough to leave some reserved fields: /* for KVM_SET_USER_MEMORY_REGION2 */ struct kvm_userspace_memory_region2 { __u32 slot; __u32 flags; __u64 guest_phys_addr; __u64 memory_size; __u64 userspace_addr; __u64 guest_memfd_offset; __u32 guest_memfd; __u32 pad1; __u64 pad2[14]; }; I think we _could_ reuse some pad*? Reusing guest_memfd field sounds error prone to me. Not sure it could be easier if it's not guest_memfd* but fd + fd_offset since the start. But I guess when introducing _REGION2 we didn't expect MMIO private regions come so soon.. Thanks, -- Peter Xu