Re: [PATCH 1/4] kvm: cpuid: adjust the returned nent field of kvm_cpuid2 for KVM_GET_SUPPORTED_CPUID and KVM_GET_EMULATED_CPUID

Emanuele Giuseppe Esposito <eesposit@xxxxxxxxxx> · Wed, 31 Mar 2021 12:07:14 +0200

On 31/03/2021 09:56, Vitaly Kuznetsov wrote:
Emanuele Giuseppe Esposito <eesposit@xxxxxxxxxx> writes:

On 31/03/2021 05:01, Sean Christopherson wrote:
On Tue, Mar 30, 2021, Emanuele Giuseppe Esposito wrote:
Calling the kvm KVM_GET_[SUPPORTED/EMULATED]_CPUID ioctl requires
a nent field inside the kvm_cpuid2 struct to be big enough to contain
all entries that will be set by kvm.
Therefore if the nent field is too high, kvm will adjust it to the
right value. If too low, -E2BIG is returned.

However, when filling the entries do_cpuid_func() requires an
additional entry, so if the right nent is known in advance,
giving the exact number of entries won't work because it has to be increased
by one.

Signed-off-by: Emanuele Giuseppe Esposito <eesposit@xxxxxxxxxx>
---
   arch/x86/kvm/cpuid.c | 6 ++++++
   1 file changed, 6 insertions(+)

diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index 6bd2f8b830e4..5412b48b9103 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -975,6 +975,12 @@ int kvm_dev_ioctl_get_cpuid(struct kvm_cpuid2 *cpuid,
   
   	if (cpuid->nent < 1)
   		return -E2BIG;
+
+	/* if there are X entries, we need to allocate at least X+1
+	 * entries but return the actual number of entries
+	 */
+	cpuid->nent++;

I don't see how this can be correct.

If this bonus entry really is needed, then won't that be reflected in array.nent?
I.e. won't KVM overrun the userspace buffer?

If it's not reflected in array.nent, that would imply there's an off-by-one check
somewhere, or KVM is creating an entry that it doesn't copy to userspace.  The
former seems unlikely as there are literally only two checks against maxnent,
and they both look correct (famous last words...).

KVM does decrement array->nent in one specific case (CPUID.0xD.2..64), i.e. a
false positive is theoretically possible, but that carries a WARN and requires a
kernel or CPU bug as well.  And fudging nent for that case would still break
normal use cases due to the overrun problem.

What am I missing?

(Maybe I should have sent this series as an RFC.)

The problem I noticed while writing the KVM_GET_EMULATED_CPUID
selftest is the following: assume there are 3 KVM-emulated entries, and
the user sets cpuid->nent = 3. This should work, because KVM fills 3
array->entries[] and copies them to user space.

However, when the 3rd entry (array->entries[2]) is populated inside KVM,
array->nent is incremented to 3 (in do_host_cpuid and
__do_cpuid_func_emulated). At that point, the loops in
kvm_dev_ioctl_get_cpuid and get_cpuid_func can iterate once
more and hit the

if (array->nent >= array->maxnent)
	return -E2BIG;

check in __do_cpuid_func_emulated and do_host_cpuid, which returns the
error. I agree that we need that check there, because the code that
follows accesses the array entry at index array->nent. But from what I
understand, that access can be pointless: the switch statement may just
take the default case and not set the entry, leaving array->nent at 3.
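
To make the sequence concrete, here is a small user-space model of it. This is only a sketch: `model_array`, `model_emulated()` and `model_walk()` are hypothetical stand-ins, not kernel names, and the walk over leaves 0..7 followed by 0x80000000 is an assumed simplification of what the ioctl loop does for the emulated case.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical stand-in for struct kvm_cpuid_array; only the counters matter. */
struct model_array {
	int nent;    /* entries filled so far */
	int maxnent; /* capacity requested by userspace */
};

/*
 * Models the current __do_cpuid_func_emulated(): the capacity check runs
 * before we know whether func produces an entry at all.
 */
static int model_emulated(struct model_array *array, uint32_t func)
{
	if (array->nent >= array->maxnent)
		return -E2BIG;

	switch (func) {
	case 0:
	case 1:
	case 7:
		++array->nent;
		break;
	default:
		/* Nothing emulated for this leaf, nent is unchanged. */
		break;
	}

	assert(array->nent <= array->maxnent);
	return 0;
}

/* Assumed walk order: leaves 0..7, then one more call for 0x80000000. */
static int model_walk(int maxnent, int *nent_out)
{
	struct model_array array = { .nent = 0, .maxnent = maxnent };
	uint32_t func;
	int r = 0;

	for (func = 0; func <= 7 && !r; func++)
		r = model_emulated(&array, func);
	if (!r)
		r = model_emulated(&array, 0x80000000u);

	*nent_out = array.nent;
	return r;
}
```

In this model, `model_walk(3, &nent)` fills all three entries and then fails with -E2BIG on the extra call, while `model_walk(4, &nent)` succeeds with the same three entries.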

The problem seems to be exclusive to __do_cpuid_func_emulated();
do_host_cpuid() always does

entry = &array->entries[array->nent++];

Something like (completely untested and stupid):

diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index 6bd2f8b830e4..54dcabd3abec 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -565,14 +565,22 @@ static struct kvm_cpuid_entry2 *do_host_cpuid(struct kvm_cpuid_array *array,
         return entry;
  }
  
+static bool cpuid_func_emulated(u32 func)
+{
+       return (func == 0) || (func == 1) || (func == 7);
+}
+
  static int __do_cpuid_func_emulated(struct kvm_cpuid_array *array, u32 func)
  {
         struct kvm_cpuid_entry2 *entry;
  
+       if (!cpuid_func_emulated(func))
+               return 0;
+
         if (array->nent >= array->maxnent)
                 return -E2BIG;
  
-       entry = &array->entries[array->nent];
+       entry = &array->entries[array->nent++];
         entry->function = func;
         entry->index = 0;
         entry->flags = 0;
@@ -580,18 +588,14 @@ static int __do_cpuid_func_emulated(struct kvm_cpuid_array *array, u32 func)
         switch (func) {
         case 0:
                 entry->eax = 7;
-               ++array->nent;
                 break;
         case 1:
                 entry->ecx = F(MOVBE);
-               ++array->nent;
                 break;
         case 7:
                 entry->flags |= KVM_CPUID_FLAG_SIGNIFCANT_INDEX;
                 entry->eax = 0;
                 entry->ecx = F(RDPID);
-               ++array->nent;
-       default:
                 break;
         }

should do the job, right?



Yes, it would work better. Alternatively:

diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index ba7437308d28..452b0acd6e9d 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -567,34 +567,37 @@ static struct kvm_cpuid_entry2 *do_host_cpuid(struct kvm_cpuid_array *array,

 static int __do_cpuid_func_emulated(struct kvm_cpuid_array *array, u32 func)
 {
-	struct kvm_cpuid_entry2 *entry;
-
-	if (array->nent >= array->maxnent)
-		return -E2BIG;
+	struct kvm_cpuid_entry2 entry = {};
+	bool changed = true;

-	entry = &array->entries[array->nent];
-	entry->function = func;
-	entry->index = 0;
-	entry->flags = 0;
+	entry.function = func;
+	entry.index = 0;
+	entry.flags = 0;

 	switch (func) {
 	case 0:
-		entry->eax = 7;
-		++array->nent;
+		entry.eax = 7;
 		break;
 	case 1:
-		entry->ecx = F(MOVBE);
-		++array->nent;
+		entry.ecx = F(MOVBE);
 		break;
 	case 7:
-		entry->flags |= KVM_CPUID_FLAG_SIGNIFCANT_INDEX;
-		entry->eax = 0;
-		entry->ecx = F(RDPID);
-		++array->nent;
+		entry.flags |= KVM_CPUID_FLAG_SIGNIFCANT_INDEX;
+		entry.eax = 0;
+		entry.ecx = F(RDPID);
+		break;
 	default:
+		changed = false;
 		break;
 	}

+	if (changed) {
+		if (array->nent >= array->maxnent)
+			return -E2BIG;
+
+		memcpy(&array->entries[array->nent++], &entry, sizeof(entry));
+	}
+
 	return 0;
 }

Pros: avoids hard-coding another function that duplicates the check the
switch already does, so it is more flexible if another func has to be
added.
Cons: there is a memcpy for each entry.
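
As a rough user-space illustration of the "build locally, commit only if the switch matched" idea, the sketch below uses hypothetical stand-ins (`model_entry`, `model_array`, `model_emulated()`, `model_walk()`), not the kernel structures, and assumes the same simplified walk over leaves 0..7 plus 0x80000000. It clears the local entry before filling it, since a stack variable does not start out zeroed.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, trimmed-down stand-ins for the kernel structures. */
struct model_entry {
	uint32_t function;
	uint32_t eax;
	uint32_t ecx;
};

struct model_array {
	struct model_entry entries[8];
	int nent;
	int maxnent;
};

static int model_emulated(struct model_array *array, uint32_t func)
{
	struct model_entry entry;
	int changed = 1;

	/* A local entry must be cleared explicitly before use. */
	memset(&entry, 0, sizeof(entry));
	entry.function = func;

	switch (func) {
	case 0:
		entry.eax = 7;
		break;
	case 1:
		entry.ecx = 1u << 22; /* MOVBE: CPUID.1:ECX bit 22 */
		break;
	case 7:
		entry.ecx = 1u << 22; /* RDPID: CPUID.7.0:ECX bit 22 */
		break;
	default:
		changed = 0;
		break;
	}

	/* Capacity is only checked when there is something to store. */
	if (changed) {
		if (array->nent >= array->maxnent)
			return -E2BIG;
		array->entries[array->nent++] = entry;
	}
	return 0;
}

/* Assumed walk order: leaves 0..7, then one more call for 0x80000000. */
static int model_walk(int maxnent, int *nent_out)
{
	struct model_array array;
	uint32_t func;
	int r = 0;

	assert(maxnent >= 0 && maxnent <= 8);
	memset(&array, 0, sizeof(array));
	array.maxnent = maxnent;

	for (func = 0; func <= 7 && !r; func++)
		r = model_emulated(&array, func);
	if (!r)
		r = model_emulated(&array, 0x80000000u);

	*nent_out = array.nent;
	return r;
}
```

With this ordering, `model_walk(3, &nent)` succeeds with exactly three entries, and only a genuinely too-small buffer such as `model_walk(2, &nent)` returns -E2BIG.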

What do you think?

Emanuele