On Tue, Apr 16, 2024, Paolo Bonzini wrote: > On Mon, Apr 15, 2024 at 11:17 PM Sean Christopherson <seanjc@xxxxxxxxxx> wrote: > > The first question to answer is, do we want to return an error or "silently" > > install mappings for !SMM, !guest_mode. And so this option becomes relevant only > > _if_ we want to unconditionally install mappings for the 'base" mode. > > > > > > - Return error on guest mode or SMM mode: Without this patch. > > > > Pros: No additional patch. > > > > Cons: Difficult to use. > > > > > > Hmm... For the non-TDX use cases this is just an optimization, right? For TDX > > > there shouldn't be an issue. If so, maybe this last one is not so horrible. > > It doesn't even have to be ABI that it gives an error. As you say, > this ioctl can just be advisory only for !confidential machines. Even > if it were implemented, the shadow MMU can drop roots at any moment Sure, but there's a difference between KVM _potentially_ dropping roots and guaranteed failure because userspace is trying to do something that's unsupported. But I think this is a non-issue, because it should really just be as simple as: if (!mmu->pre_map_memory) return -EOPNOTSUPP; Hmm, or probably this to avoid adding an MMU hook for a single MMU flavor: if (!tdp_mmu_enabled || !mmu->root_role.direct) return -EOPNOTSUPP; > and/or kill the mapping via the shrinker. Ugh, we really need to kill that code. > That said, I can't fully shake the feeling that this ioctl should be > an error for !TDX and that TDX_INIT_MEM_REGION wasn't that bad. The > implementation was ugly but the API was fine. Hmm, but IMO the implementation was ugly in no small part because of the contraints put on KVM by the API. Mapping S-EPT *and* doing TDH.MEM.PAGE.ADD in the same ioctl() forced KVM to operate on vcpu0, and necessitated shoving temporary data into a per-VM structure in order to get the source contents into TDH.MEM.PAGE.ADD. We could eliminate the vcpu0 grossness, but it would require a massive refactor, which is also not a problem per se, but it's obviously not free. Eliminating kvm_tdx.source_page is also doable, but it's not clear to me that end result would be a net positive. If userspace pre-maps the S-EPT entries ahead of time, then KVM should have a straight shot to PAGE.ADD, i.e. doesn't need to "pass" the source page via a scratch field in kvm_tdx, and I think/hope would avoid the need to grab vcpu0 in order to get at an MMU to build the S-EPT. And stating the obvious, TDX_INIT_MEM_REGION also doesn't allow pre-mapping memory, which is generally useful, and can be especially beneficial for confidential VMs (and TDX in particular) due to the added cost of a page fault VM-Exit. I'm not dead set on this generic ioctl(), but unless it ends up being a train wreck for userspace, I think it will allow for cleaner and more reusable code in KVM.