On Wed, Apr 17, 2024 at 1:00 AM Sean Christopherson <seanjc@xxxxxxxxxx> wrote: > > > > Hmm... For the non-TDX use cases this is just an optimization, right? For TDX > > > > there shouldn't be an issue. If so, maybe this last one is not so horrible. > > > > It doesn't even have to be ABI that it gives an error. As you say, > > this ioctl can just be advisory only for !confidential machines. Even > > if it were implemented, the shadow MMU can drop roots at any moment > > Sure, but there's a difference between KVM _potentially_ dropping roots and > guaranteed failure because userspace is trying to do something that's unsupported. > But I think this is a non-issue, because it should really just be as simple as: > > if (!mmu->pre_map_memory) > return -EOPNOTSUPP; > > Hmm, or probably this to avoid adding an MMU hook for a single MMU flavor: > > if (!tdp_mmu_enabled || !mmu->root_role.direct) > return -EOPNOTSUPP; > > > and/or kill the mapping via the shrinker. > > Ugh, we really need to kill that code. Ok, so let's add a KVM_CHECK_EXTENSION so that people can check if it's supported. > > That said, I can't fully shake the feeling that this ioctl should be > > an error for !TDX and that TDX_INIT_MEM_REGION wasn't that bad. The > > implementation was ugly but the API was fine. > > Hmm, but IMO the implementation was ugly in no small part because of the contraints > put on KVM by the API. Mapping S-EPT *and* doing TDH.MEM.PAGE.ADD in the same > ioctl() forced KVM to operate on vcpu0, and necessitated shoving temporary data > into a per-VM structure in order to get the source contents into TDH.MEM.PAGE.ADD. That's because it was trying to do two things with a single loop. It's not needed - and in fact KVM_CAP_MEMORY_MAPPING forces userspace to do it in two passes. > And stating the obvious, TDX_INIT_MEM_REGION also doesn't allow pre-mapping memory, > which is generally useful, and can be especially beneficial for confidential VMs > (and TDX in particular) due to the added cost of a page fault VM-Exit. > > I'm not dead set on this generic ioctl(), but unless it ends up being a train wreck > for userspace, I think it will allow for cleaner and more reusable code in KVM. Yes, this ioctl() can stay. Forcing it before adding memory to TDX is ugly, but it's not a blocker. I'll look at it closely and see how far it is from being committable to kvm-coco-queue. Paolo