On Wed, Apr 03, 2024, Isaku Yamahata wrote:
> On Wed, Apr 03, 2024 at 11:30:21AM -0700,
> Sean Christopherson <seanjc@xxxxxxxxxx> wrote:
>
> > On Tue, Mar 19, 2024, Isaku Yamahata wrote:
> > > On Wed, Mar 06, 2024 at 06:09:54PM -0800,
> > > Isaku Yamahata <isaku.yamahata@xxxxxxxxxxxxxxx> wrote:
> > >
> > > > On Wed, Mar 06, 2024 at 04:53:41PM -0800,
> > > > David Matlack <dmatlack@xxxxxxxxxx> wrote:
> > > >
> > > > > On 2024-03-01 09:28 AM, isaku.yamahata@xxxxxxxxx wrote:
> > > > > > From: Isaku Yamahata <isaku.yamahata@xxxxxxxxx>
> > > > > >
> > > > > > Implementation:
> > > > > > - x86 KVM MMU
> > > > > >   In x86 KVM MMU, I chose to use kvm_mmu_do_page_fault(). It's not confined to
> > > > > >   KVM TDP MMU. We can restrict it to KVM TDP MMU and introduce an optimized
> > > > > >   version.
> > > > >
> > > > > Restricting to TDP MMU seems like a good idea. But I'm not quite sure
> > > > > how to reliably do that from a vCPU context. Checking for TDP being
> > > > > enabled is easy, but what if the vCPU is in guest-mode?
> > > >
> > > > As you pointed out in the other mail, legacy KVM MMU support or guest-mode will
> > > > be troublesome.
> >
> > Why is shadow paging troublesome? I don't see any obvious issues with effectively
> > prefetching into a shadow MMU with read fault semantics. It might be pointless
> > and wasteful, as the guest PTEs need to be in place, but that's userspace's problem.
>
> The address to populate for shadow paging is a GVA, not a GPA. I'm not sure
> that's what user space wants. If it's a user-space problem, I'm fine.

/facepalm

> > Pre-populating is the primary use case, but that could happen if L2 is active,
> > e.g. after live migration.
> >
> > I'm not necessarily opposed to initially adding support only for the TDP MMU, but
> > if the delta to also support the shadow MMU is relatively small, my preference
> > would be to add the support right away. E.g. to give us confidence that the uAPI
> > can work for multiple MMUs, and so that we don't have to write documentation for
> > x86 to explain exactly when it's legal to use the ioctl().
>
> If we call kvm_mmu.page_fault() without caring what address will be
> populated, I don't see a big difference.

Ignore me, I completely spaced that shadow MMUs don't operate on an L1 GPA.

I 100% agree that restricting this to TDP, at least for the initial merge, is
the way to go. A uAPI where the type of address varies based on the vCPU mode
and MMU type would be super ugly, and probably hard to use.

At that point, I don't have a strong preference as to whether or not direct
legacy/shadow MMUs are supported. That said, I think it can (probably should?)
be done in a way where it more or less Just Works, e.g. by having a function
hook in "struct kvm_mmu".
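
Something like the below (completely untested, and the "populate_gpa" hook and
the kvm_vcpu_populate_gpa() helper are made up purely for illustration) is
roughly what I have in mind, i.e. an optional callback alongside the existing
page_fault() hook:

	struct kvm_mmu {
		/* ... existing fields, including the page_fault() callback ... */

		/*
		 * Hypothetical hook: populate the mapping for @gpa on behalf of
		 * userspace.  MMUs that don't operate on L1 GPAs (shadow, nested)
		 * would simply leave this NULL.
		 */
		int (*populate_gpa)(struct kvm_vcpu *vcpu, gpa_t gpa,
				    u64 error_code);
	};

	static int kvm_vcpu_populate_gpa(struct kvm_vcpu *vcpu, gpa_t gpa)
	{
		/* Fail cleanly if the current MMU doesn't support the ioctl(). */
		if (!vcpu->arch.mmu->populate_gpa)
			return -EOPNOTSUPP;

		/* The error code is illustrative, e.g. treat it as a write. */
		return vcpu->arch.mmu->populate_gpa(vcpu, gpa, PFERR_WRITE_MASK);
	}

That way, whether the ioctl() is legal for the current vCPU/MMU falls out of
whether the hook is wired up, rather than having to be spelled out in
documentation.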