On Thu, Jan 09, 2025 at 10:40:51AM -0400, Jason Gunthorpe wrote:
> On Thu, Jan 09, 2025 at 12:57:58AM +0800, Xu Yilun wrote:
> > On Wed, Jan 08, 2025 at 09:30:26AM -0400, Jason Gunthorpe wrote:
> > > On Tue, Jan 07, 2025 at 10:27:15PM +0800, Xu Yilun wrote:
> > > > Add a flag for ioctl(VFIO_DEVICE_BIND_IOMMUFD) to mark a device as
> > > > for private assignment. For these private assigned devices, disallow
> > > > host accessing their MMIO resources.
> > >
> > > Why? Shouldn't the VMM simply not call mmap? Why does the kernel have
> > > to enforce this?
> >
> > MM.. maybe I should not say 'host', instead 'userspace'.
> >
> > I think the kernel part VMM (KVM) has the responsibility to enforce the
> > correct behavior of the userspace part VMM (QEMU). QEMU has no way to
> > touch private memory/MMIO intentionally or accidently. IIUC that's one
> > of the initiative guest_memfd is introduced for private memory. Private
> > MMIO follows.
>
> Okay, but then why is it a flag like that? I'm expecting a much

This flag is a prerequisite for setting up TDI, or part of the
requirement to make a "TDI capable" assigned device. It prevents
userspace mapping in the first place, even as a shared device.

We want the device to first appear as a shared device in the CoCo VM,
then do TDI setup (via a tsm verb "bind"). This late bind approach
avoids changing the CoCo VM startup routine. In contrast, early bind
would easily be broken, especially if the BIOS is not aware of the TDI
rule.

So then we face shared <-> private device conversion in the CoCo VM,
and in turn shared <-> private MMIO conversion. An MMIO region has only
one physical backend, so it is a bit like in-place conversion, which is
complicated. I want to simplify the MMIO conversion routine based on the
fact that the VMM never needs to access assigned MMIO for feature
emulation, so always disallow userspace MMIO mapping during the whole
lifecycle. That's why the flag is introduced.

Patch 6 has a similar description.

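To make the lifecycle above concrete, here is a minimal userspace toy
model in C. It is only a sketch of the policy as described in this
mail, not the patch code: struct toy_dev, the toy_* helpers and the
errno values are all hypothetical, and nothing here touches the real
VFIO, iommufd or TSM interfaces.

```c
/*
 * Toy model only, NOT kernel code. All names and error codes below
 * are hypothetical and chosen for illustration.
 */
#include <errno.h>
#include <stdbool.h>

struct toy_dev {
	bool bound;      /* device has been bound (models BIND_IOMMUFD) */
	bool priv_flag;  /* bound with the "private assignment" flag */
	bool tdi_bound;  /* TDI has been set up (models the tsm "bind" verb) */
};

/* Models ioctl(VFIO_DEVICE_BIND_IOMMUFD), with or without the new flag. */
static int toy_bind(struct toy_dev *d, bool priv)
{
	d->bound = true;
	d->priv_flag = priv;
	d->tdi_bound = false;  /* the device always starts out shared */
	return 0;
}

/*
 * Models a userspace mmap of an MMIO region. With the flag set it is
 * refused for the whole lifecycle, whether the device is currently
 * shared or private, so no in-place conversion of a userspace mapping
 * is ever needed.
 */
static int toy_mmap(const struct toy_dev *d)
{
	if (!d->bound)
		return -ENODEV;
	return d->priv_flag ? -EPERM : 0;
}

/* Late bind: shared -> private, only if userspace mapping was ruled out. */
static int toy_tdi_bind(struct toy_dev *d)
{
	if (!d->bound || !d->priv_flag)
		return -EINVAL;
	d->tdi_bound = true;
	return 0;
}

/* private -> shared conversion; the mmap restriction stays in place. */
static int toy_tdi_unbind(struct toy_dev *d)
{
	if (!d->tdi_bound)
		return -EINVAL;
	d->tdi_bound = false;
	return 0;
}
```

In this model a device bound without the flag can be mmapped but can
never reach toy_tdi_bind(), while a device bound with the flag can go
shared -> private -> shared and toy_mmap() fails at every step.
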
> broader system here to make the VFIO device into a confidential device
> (like setup the TDI) where we'd have to enforce the private things,

I plan to introduce a new VFIO ioctl to set up the TDI.

> communicate with some secure world to assign it, and so on.

Yes, the new VFIO ioctl will communicate with PCI TSM.

>
> I want to see a fuller solution to the CC problem in VFIO before we

MM.. I have something but it needs more preparation. Whether to send it
out or make a public repo, I'll discuss internally.

> can be sure what is the correct UAPI. In other words, make the
> VFIO device into a CC device should also prevent mmaping it and so on.

My idea is to prevent mmapping first, then allow turning the VFIO device
into a CC device (TDI).

>
> So, I would take this out and defer VFIO enforcment to a series which
> does fuller CC enablement of VFIO.
>
> The precursor work should just be avoiding requiring a VMA when
> installing VFIO MMIO into the KVM and IOMMU stage 2 mappings. Ie by
> using a FD to get the CPU pfns into iommufd and kvm as you are
> showing.
>
> This works just fine for non-CC devices anyhow and is the necessary

Yes. It carries out the idea of "KVM maps MMIO resources without first
mapping them into the host" even for a normal VM. That's why I think it
could be an independent patchset.

Thanks,
Yilun

> building block for making a TDI interface in VFIO.
>
> Jason