On 4/15/2022 11:25 PM, Sean Christopherson wrote:
On Mon, Apr 11, 2022, Zeng Guang wrote:
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index d1a39285deab..23fbf52f7bea 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -11180,11 +11180,15 @@ static int sync_regs(struct kvm_vcpu *vcpu)
 int kvm_arch_vcpu_precreate(struct kvm *kvm, unsigned int id)
 {
+	int ret = 0;
+
 	if (kvm_check_tsc_unstable() && atomic_read(&kvm->online_vcpus) != 0)
 		pr_warn_once("kvm: SMP vm created on host with unstable TSC; "
 			     "guest TSC will not be reliable\n");
-	return 0;
+	if (kvm_x86_ops.alloc_ipiv_pid_table)
+		ret = static_call(kvm_x86_alloc_ipiv_pid_table)(kvm);
Add a generic kvm_x86_ops.vcpu_precreate, no reason to make this so specific.
And use KVM_X86_OP_RET0 instead of KVM_X86_OP_OPTIONAL, then this can simply be
	return static_call(kvm_x86_vcpu_precreate)(kvm);
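A rough sketch of that wiring, for reference only (the KVM_X86_OP_RET0 hookup and
where VMX would plug in are illustrative, not taken from the posted series):

/* arch/x86/include/asm/kvm-x86-ops.h: an unimplemented op falls back to 0 */
KVM_X86_OP_RET0(vcpu_precreate)

/* arch/x86/include/asm/kvm_host.h */
struct kvm_x86_ops {
	...
	int (*vcpu_precreate)(struct kvm *kvm);
	...
};

/* arch/x86/kvm/x86.c */
int kvm_arch_vcpu_precreate(struct kvm *kvm, unsigned int id)
{
	if (kvm_check_tsc_unstable() && atomic_read(&kvm->online_vcpus) != 0)
		pr_warn_once("kvm: SMP vm created on host with unstable TSC; "
			     "guest TSC will not be reliable\n");

	/* VMX would allocate the IPIv PID table from its vcpu_precreate. */
	return static_call(kvm_x86_vcpu_precreate)(kvm);
}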
That said, there's a flaw in my genius plan.
1. KVM_CREATE_VM
2. KVM_CAP_MAX_VCPU_ID, set max_vcpu_ids=1
3. KVM_CREATE_VCPU, create IPIv table but ultimately fails
4. KVM decrements created_vcpus back to '0'
5. KVM_CAP_MAX_VCPU_ID, set max_vcpu_ids=4096
6. KVM_CREATE_VCPU w/ ID out of range
In other words, malicious userspace could trigger a buffer overflow.
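I.e. the PID table keeps the size computed from the stale max_vcpu_ids, while the
new, larger value is what bounds vcpu_id, so the per-vCPU write lands past the end
of the allocation.  Roughly (kvm_vmx->pid_table and PID_TABLE_ENTRY_VALID as in
the IPIv patches; sizes shown only for illustration):

	/* Step 3: table sized for max_vcpu_ids == 1; the vCPU creation then fails. */
	kvm_vmx->pid_table = (void *)__get_free_pages(GFP_KERNEL_ACCOUNT | __GFP_ZERO,
						      get_order(1 * sizeof(u64)));

	/* Step 6: vcpu_id may now be up to 4095, but the table was never resized. */
	WRITE_ONCE(kvm_vmx->pid_table[vcpu->vcpu_id],	/* out-of-bounds write */
		   __pa(&vmx->pi_desc) | PID_TABLE_ENTRY_VALID);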
This is a tricky exploit that relies on max_vcpu_ids being updated more than
once. I think we can avoid the issue by checking PID table availability during
pre-creation of the first vCPU: if the table has already been allocated, free
it and re-allocate it so that its size matches the new max_vcpu_ids.
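A sketch of what that could look like on the VMX side; treat it as illustrative
only (pid_table_size is a field assumed here to remember the allocated size, and
error handling is trimmed):

static int vmx_alloc_ipiv_pid_table(struct kvm *kvm)
{
	struct kvm_vmx *kvm_vmx = to_kvm_vmx(kvm);
	struct page *pages;

	if (!irqchip_in_kernel(kvm) || !enable_ipiv)
		return 0;

	/*
	 * A table may already exist if an earlier KVM_CREATE_VCPU failed and
	 * userspace then changed KVM_CAP_MAX_VCPU_ID.  Free the stale table
	 * and size the new one to the current max_vcpu_ids.
	 */
	if (kvm_vmx->pid_table) {
		if (kvm_vmx->pid_table_size == kvm->arch.max_vcpu_ids)
			return 0;
		free_pages((unsigned long)kvm_vmx->pid_table,
			   get_order(kvm_vmx->pid_table_size * sizeof(u64)));
		kvm_vmx->pid_table = NULL;
	}

	pages = alloc_pages(GFP_KERNEL_ACCOUNT | __GFP_ZERO,
			    get_order(kvm->arch.max_vcpu_ids * sizeof(u64)));
	if (!pages)
		return -ENOMEM;

	kvm_vmx->pid_table = page_address(pages);
	kvm_vmx->pid_table_size = kvm->arch.max_vcpu_ids;
	return 0;
}

If I read kvm_vm_ioctl_create_vcpu() correctly, the precreate hook runs under
kvm->lock, so the free-and-reallocate shouldn't race with another vCPU being
created.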
That could be solved by adding an arch hook to undo precreate, but that's gross
and a good indication that we're trying to solve this the wrong way.
I think it's high time we add KVM_FINALIZE_VM, though that's probably a bad name
since e.g. TDX wants to use that name for when the VM is really, really being
finalized[*], i.e. after all vCPUs have been created.
KVM_POST_CREATE_VM? That's not very good either.
Paolo or anyone else, thoughts?
[*] https://lore.kernel.org/all/83768bf0f786d24f49d9b698a45ba65441ef5ef0.1646422845.git.isaku.yamahata@xxxxxxxxx