Re: [PATCH 1/4] kvm: cpuid: adjust the returned nent field of kvm_cpuid2 for KVM_GET_SUPPORTED_CPUID and KVM_GET_EMULATED_CPUID

Vitaly Kuznetsov <vkuznets@xxxxxxxxxx> · Wed, 31 Mar 2021 13:25:02 +0200

Emanuele Giuseppe Esposito <eesposit@xxxxxxxxxx> writes:

> On 31/03/2021 09:56, Vitaly Kuznetsov wrote:
>> Emanuele Giuseppe Esposito <eesposit@xxxxxxxxxx> writes:
>> 
>>> On 31/03/2021 05:01, Sean Christopherson wrote:
>>>> On Tue, Mar 30, 2021, Emanuele Giuseppe Esposito wrote:
>>>>> Calling the kvm KVM_GET_[SUPPORTED/EMULATED]_CPUID ioctl requires
>>>>> a nent field inside the kvm_cpuid2 struct to be big enough to contain
>>>>> all entries that will be set by kvm.
>>>>> Therefore if the nent field is too high, kvm will adjust it to the
>>>>> right value. If too low, -E2BIG is returned.
>>>>>
>>>>> However, when filling the entries do_cpuid_func() requires an
>>>>> additional entry, so if the right nent is known in advance,
>>>>> giving the exact number of entries won't work because it has to be increased
>>>>> by one.
>>>>>
>>>>> Signed-off-by: Emanuele Giuseppe Esposito <eesposit@xxxxxxxxxx>
>>>>> ---
>>>>>    arch/x86/kvm/cpuid.c | 6 ++++++
>>>>>    1 file changed, 6 insertions(+)
>>>>>
>>>>> diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
>>>>> index 6bd2f8b830e4..5412b48b9103 100644
>>>>> --- a/arch/x86/kvm/cpuid.c
>>>>> +++ b/arch/x86/kvm/cpuid.c
>>>>> @@ -975,6 +975,12 @@ int kvm_dev_ioctl_get_cpuid(struct kvm_cpuid2 *cpuid,
>>>>>    
>>>>>    	if (cpuid->nent < 1)
>>>>>    		return -E2BIG;
>>>>> +
>>>>> +	/* if there are X entries, we need to allocate at least X+1
>>>>> +	 * entries but return the actual number of entries
>>>>> +	 */
>>>>> +	cpuid->nent++;
>>>>
>>>> I don't see how this can be correct.
>>>>
>>>> If this bonus entry really is needed, then won't that be reflected in array.nent?
>>>> I.e. won't KVM overrun the userspace buffer?
>>>>
>>>> If it's not reflected in array.nent, that would imply there's an off-by-one check
>>>> somewhere, or KVM is creating an entry that it doesn't copy to userspace.  The
>>>> former seems unlikely as there are literally only two checks against maxnent,
>>>> and they both look correct (famous last words...).
>>>>
>>>> KVM does decrement array->nent in one specific case (CPUID.0xD.2..64), i.e. a
>>>> false positive is theoretically possible, but that carries a WARN and requires a
>>>> kernel or CPU bug as well.  And fudging nent for that case would still break
>>>> normal use cases due to the overrun problem.
>>>>
>>>> What am I missing?
>>>
>>> (Maybe I should have put this series as RFC)
>>>
>>> The problem I see and noticed while doing the KVM_GET_EMULATED_CPUID
>>> selftest is the following: assume there are 3 kvm emulated entries, and
>>> the user sets cpuid->nent = 3. This should work because kvm sets 3
>>> array->entries[], and copies them to user space.
>>>
>>> However, when the 3rd entry is populated inside kvm (array->entries[2]),
>>> array->nent is increased once more (do_host_cpuid and
>>> __do_cpuid_func_emulated). At that point, the loop in
>>> kvm_dev_ioctl_get_cpuid and get_cpuid_func can potentially iterate once
>>> more, going into the
>>>
>>> if (array->nent >= array->maxnent)
>>> 	return -E2BIG;
>>>
>>> in __do_cpuid_func_emulated and do_host_cpuid, returning the error. I
>>> agree that we need that check there because the following code tries to
>>> access the array entry at array->nent index, but from what I understand
>>> that access can be potentially useless because it might just jump to the
>>> default entry in the switch statement and not set the entry, leaving
>>> array->nent to 3.
>> 
>> The problem seems to be exclusive to __do_cpuid_func_emulated(),
>> do_host_cpuid() always does
>> 
>> entry = &array->entries[array->nent++];
>> 
>> Something like (completely untested and stupid):
>> 
>> diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
>> index 6bd2f8b830e4..54dcabd3abec 100644
>> --- a/arch/x86/kvm/cpuid.c
>> +++ b/arch/x86/kvm/cpuid.c
>> @@ -565,14 +565,22 @@ static struct kvm_cpuid_entry2 *do_host_cpuid(struct kvm_cpuid_array *array,
>>          return entry;
>>   }
>>   
>> +static bool cpuid_func_emulated(u32 func)
>> +{
>> +       return (func == 0) || (func == 1) || (func == 7);
>> +}
>> +
>>   static int __do_cpuid_func_emulated(struct kvm_cpuid_array *array, u32 func)
>>   {
>>          struct kvm_cpuid_entry2 *entry;
>>   
>> +       if (!cpuid_func_emulated(func))
>> +               return 0;
>> +
>>          if (array->nent >= array->maxnent)
>>                  return -E2BIG;
>>   
>> -       entry = &array->entries[array->nent];
>> +       entry = &array->entries[array->nent++];
>>          entry->function = func;
>>          entry->index = 0;
>>          entry->flags = 0;
>> @@ -580,18 +588,14 @@ static int __do_cpuid_func_emulated(struct kvm_cpuid_array *array, u32 func)
>>          switch (func) {
>>          case 0:
>>                  entry->eax = 7;
>> -               ++array->nent;
>>                  break;
>>          case 1:
>>                  entry->ecx = F(MOVBE);
>> -               ++array->nent;
>>                  break;
>>          case 7:
>>                  entry->flags |= KVM_CPUID_FLAG_SIGNIFCANT_INDEX;
>>                  entry->eax = 0;
>>                  entry->ecx = F(RDPID);
>> -               ++array->nent;
>> -       default:
>>                  break;
>>          }
>> 
>> should do the job, right?
>> 
>> 
>
> Yes, it would work better. Alternatively:
>
> diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
> index ba7437308d28..452b0acd6e9d 100644
> --- a/arch/x86/kvm/cpuid.c
> +++ b/arch/x86/kvm/cpuid.c
> @@ -567,34 +567,37 @@ static struct kvm_cpuid_entry2 *do_host_cpuid(struct kvm_cpuid_array *array,
>
>   static int __do_cpuid_func_emulated(struct kvm_cpuid_array *array, u32 func)
>   {
> -	struct kvm_cpuid_entry2 *entry;
> -
> -	if (array->nent >= array->maxnent)
> -		return -E2BIG;
> +	struct kvm_cpuid_entry2 entry;
> +	bool changed = true;
>
> -	entry = &array->entries[array->nent];
> -	entry->function = func;
> -	entry->index = 0;
> -	entry->flags = 0;
> +	entry.function = func;
> +	entry.index = 0;
> +	entry.flags = 0;
>
>   	switch (func) {
>   	case 0:
> -		entry->eax = 7;
> -		++array->nent;
> +		entry.eax = 7;
>   		break;
>   	case 1:
> -		entry->ecx = F(MOVBE);
> -		++array->nent;
> +		entry.ecx = F(MOVBE);
>   		break;
>   	case 7:
> -		entry->flags |= KVM_CPUID_FLAG_SIGNIFCANT_INDEX;
> -		entry->eax = 0;
> -		entry->ecx = F(RDPID);
> -		++array->nent;
> +		entry.flags |= KVM_CPUID_FLAG_SIGNIFCANT_INDEX;
> +		entry.eax = 0;
> +		entry.ecx = F(RDPID);
> +		break;
>   	default:
> +		changed = false;
>   		break;
>   	}
>
> +	if (changed) {
> +		if (array->nent >= array->maxnent)
> +			return -E2BIG;
> +
> +		memcpy(&array->entries[array->nent++], &entry, sizeof(entry));
> +	}
> +
>   	return 0;
>   }
>
> Pros: it avoids hard-coding another function that duplicates what the
> switch already checks, and it is more flexible if another func has to be
> added. Cons: there is a memcpy for each entry.

Looks good to me,

I'd just drop 'bool changed' and replace it with 'goto out' in the
'default' case.
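
To illustrate the 'goto out' variant, here is a rough user-space sketch
(untested against the kernel tree). The struct names, the FEAT_* and
FLAG_* constants and the function name are stand-ins I made up so the
snippet compiles on its own; they are not the kernel definitions. The
memset() of the temporary entry is also my assumption, so that fields
the switch does not touch are not copied out uninitialized.

```c
#include <errno.h>
#include <stdint.h>
#include <string.h>

/* Simplified stand-ins for kvm_cpuid_entry2 / kvm_cpuid_array. */
struct cpuid_entry {
	uint32_t function, index, flags;
	uint32_t eax, ebx, ecx, edx;
};

struct cpuid_array {
	struct cpuid_entry *entries;
	int maxnent;
	int nent;
};

/* Placeholder values, only meant to make the sketch self-contained. */
#define FLAG_SIGNIFICANT_INDEX	(1u << 0)
#define FEAT_MOVBE		(1u << 22)
#define FEAT_RDPID		(1u << 22)

static int do_cpuid_func_emulated(struct cpuid_array *array, uint32_t func)
{
	struct cpuid_entry entry;

	/* Build the entry on the stack; nothing in 'array' is touched yet. */
	memset(&entry, 0, sizeof(entry));
	entry.function = func;

	switch (func) {
	case 0:
		entry.eax = 7;
		break;
	case 1:
		entry.ecx = FEAT_MOVBE;
		break;
	case 7:
		entry.flags |= FLAG_SIGNIFICANT_INDEX;
		entry.ecx = FEAT_RDPID;
		break;
	default:
		/* Not an emulated leaf: consume no slot, report no error. */
		goto out;
	}

	/* The bounds check now only runs when an entry is really stored. */
	if (array->nent >= array->maxnent)
		return -E2BIG;

	memcpy(&array->entries[array->nent++], &entry, sizeof(entry));
out:
	return 0;
}
```

With a buffer of exactly three entries, calling this for func 0, 1 and 7
fills all three slots, and a following call with a non-emulated func
returns 0 instead of tripping over the maxnent check.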

memcpy() here is not a problem I believe, this path is not that
performance critical.

-- 
Vitaly