On Tue, Nov 17, 2020 at 09:33:17AM -0600, Tom Lendacky wrote:
> On 11/16/20 5:20 PM, Jason Gunthorpe wrote:
> > On Mon, Nov 16, 2020 at 03:43:53PM -0600, Tom Lendacky wrote:
> >> On 11/16/20 9:53 AM, Jason Gunthorpe wrote:
> >>> On Thu, Nov 05, 2020 at 06:39:49PM -0500, Peter Xu wrote:
> >>>> On Thu, Nov 05, 2020 at 12:34:58PM -0400, Jason Gunthorpe wrote:
> >>>>> Tom says VFIO device assignment works OK with KVM, so I expect only things
> >>>>> like DPDK to be broken.
> >>>>
> >>>> Is there more information on why the difference? Thanks,
> >>>
> >>> I have nothing, maybe Tom can explain how it works?
> >>
> >> IIUC, the main differences would be along the lines of what is performing
> >> the mappings or who is performing the MMIO.
> >>
> >> For device passthrough using VFIO, the guest kernel is the one that ends
> >> up performing the MMIO in kernel space with the proper encryption mask
> >> (unencrypted).
> >
> > The question here is why does VF assignment work if the MMIO mapping
> > in the hypervisor is being marked encrypted.
> >
> > It sounds like this means the page table in the hypervisor is ignored,
> > and it works because the VM's kernel marks the guest's page table as
> > non-encrypted?
>
> If I understand the VFIO code correctly, the MMIO area gets registered as
> a RAM memory region and added to the guest. This MMIO region is accessed
> in the guest through ioremap(), which creates an un-encrypted mapping,
> allowing the guest to read it properly. So I believe the mmap() call only
> provides the information used to register the memory region for guest
> access and is not directly accessed by Qemu (I don't believe the guest
> VMEXITs for the MMIO access, but I could be wrong).

Thanks for the explanations.

It seems fine if a two-dimensional page table is used in kvm, as long as the
1st level guest page table is handled the same way as in the host.

I'm wondering what happens if shadow page tables are used. IIUC the vfio mmio
region will then look the same as normal guest RAM from the kvm memslot pov.
However, if the mmio region is not encrypted, does that also mean the whole
guest RAM is not encrypted? It's purely a question, because I feel like these
are two layers of security (host as the 1st, guest as the 2nd). Maybe here
we're only talking about host security rather than the guest's, in which case
it looks fine too.

--
Peter Xu