Re: [PATCH] KVM: SVM: Fix SNP AP destroy race with VMRUN

Tom Lendacky <thomas.lendacky@xxxxxxx> · Tue, 18 Mar 2025 08:47:36 -0500

On 3/18/25 07:43, Tom Lendacky wrote:
> On 3/17/25 12:36, Tom Lendacky wrote:
>> On 3/17/25 12:28, Sean Christopherson wrote:
>>> On Mon, Mar 17, 2025, Tom Lendacky wrote:
>>>> On 3/17/25 12:20, Tom Lendacky wrote:
>>>>> An AP destroy request for a target vCPU is typically followed by an
>>>>> RMPADJUST to remove the VMSA attribute from the page currently being
>>>>> used as the VMSA for the target vCPU. This can result in a vCPU that
>>>>> is about to VMRUN to exit with #VMEXIT_INVALID.
>>>>>
>>>>> This usually does not happen as APs are typically sitting in HLT when
>>>>> being destroyed and therefore the vCPU thread is not running at the time.
>>>>> However, if HLT is allowed inside the VM, then the vCPU could be about to
>>>>> VMRUN when the VMSA attribute is removed from the VMSA page, resulting in
>>>>> a #VMEXIT_INVALID when the vCPU actually issues the VMRUN and causing the
>>>>> guest to crash. An RMPADJUST against an in-use (already running) VMSA
>>>>> results in a #NPF for the vCPU issuing the RMPADJUST, so the VMSA
>>>>> attribute cannot be changed until the VMRUN for target vCPU exits. The
>>>>> Qemu command line option '-overcommit cpu-pm=on' is an example of allowing
>>>>> HLT inside the guest.
>>>>>
>>>>> Use kvm_test_request() to ensure that the target vCPU sees the AP destroy
>>>>> request before returning to the initiating vCPU.
>>>>>
>>>>> Fixes: e366f92ea99e ("KVM: SEV: Support SEV-SNP AP Creation NAE event")
>>>>> Signed-off-by: Tom Lendacky <thomas.lendacky@xxxxxxx>
>>>>
>>>> Sean,
>>>>
>>>> If you're ok with this approach for the fix, this patch may need to be
>>>> adjusted given your series around AP creation fixes, unless you want to
>>>> put this as an early patch in your series. Let me know what you'd like
>>>> to do.
>>>
>>> This is unsafe as it requires userspace to do KVM_RUN _and_ for the vCPU to get
>>> far enough along to consume the request.
>>>
>>> Very off-the-cuff, but I assume KVM_REQ_UPDATE_PROTECTED_GUEST_STATE just needs
>>> to be annotated with KVM_REQUEST_WAIT.
>>
>> Ok, nice. I wasn't sure if KVM_REQUEST_WAIT would be appropriate here.
>> This is much simpler. Let me test it out and resend if everything goes ok.
> 
> So that doesn't work. I can still get an occasional #VMEXIT_INVALID. Let
> me try to track down what is happening with this approach...

Looks like I need to use kvm_make_vcpus_request_mask() instead of just a
plain kvm_make_request() followed by a kvm_vcpu_kick().

Let me try that and see how this works.

Thanks,
Tom

> 
> Thanks,
> Tom
> 
>>
>> Thanks,
>> Tom
>>
>>>
>>> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
>>> index 04e6c5604bc3..67abfe97c600 100644
>>> --- a/arch/x86/include/asm/kvm_host.h
>>> +++ b/arch/x86/include/asm/kvm_host.h
>>> @@ -124,7 +124,8 @@
>>>         KVM_ARCH_REQ_FLAGS(31, KVM_REQUEST_WAIT | KVM_REQUEST_NO_WAKEUP)
>>>  #define KVM_REQ_HV_TLB_FLUSH \
>>>         KVM_ARCH_REQ_FLAGS(32, KVM_REQUEST_WAIT | KVM_REQUEST_NO_WAKEUP)
>>> -#define KVM_REQ_UPDATE_PROTECTED_GUEST_STATE   KVM_ARCH_REQ(34)
>>> +#define KVM_REQ_UPDATE_PROTECTED_GUEST_STATE \
>>> +       KVM_ARCH_REQ_FLAGS(34, KVM_REQUEST_WAIT)
>>>  
>>>  #define CR0_RESERVED_BITS                                               \
>>>         (~(unsigned long)(X86_CR0_PE | X86_CR0_MP | X86_CR0_EM | X86_CR0_TS \
>>>
>>>