Re: SRCU dereference check warning with SEV-ES

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 5/5/21 11:16 AM, Paolo Bonzini wrote:
> On 05/05/21 16:01, Tom Lendacky wrote:
>> Boris noticed the below warning when running an SEV-ES guest with
>> CONFIG_PROVE_LOCKING=y.
>>
>> The SRCU lock is released before invoking the vCPU run op where the SEV-ES
>> support will unmap the GHCB. Is the proper thing to do here to take the
>> SRCU lock around the call to kvm_vcpu_unmap() in this case? It does fix
>> the issue, but I just want to be sure that I shouldn't, instead, be taking
>> the memslot lock:
> 
> I would rather avoid having long-lived maps, as I am working on removing
> them from the Intel code.  However, it seems to me that the GHCB is almost
> not used after sev_handle_vmgexit returns?

Except for as you pointed out below, things like MMIO and IO. Anything
that has to exit to userspace to complete will still need the GHCB mapped
when coming back into the kernel if the shared buffer area of the GHCB is
being used.

Btw, what do you consider long lived maps?  Is having a map while context
switching back to userspace considered long lived? The GHCB will
(possibly) only be mapped on VMEXIT (VMGEXIT) and unmapped on the next
VMRUN for the vCPU. An AP reset hold could be a while, though.

> 
> If so, there's no need to keep it mapped until pre_sev_es_run:
> 
> diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
> index a9d8d6aafdb8..b2226a5e249d 100644
> --- a/arch/x86/kvm/svm/sev.c
> +++ b/arch/x86/kvm/svm/sev.c
> @@ -2200,9 +2200,6 @@ static int sev_es_validate_vmgexit(struct vcpu_svm
> *svm)
>  
>  static void pre_sev_es_run(struct vcpu_svm *svm)
>  {
> -    if (!svm->ghcb)
> -        return;
> -
>      if (svm->ghcb_sa_free) {
>          /*
>           * The scratch area lives outside the GHCB, so there is a
> @@ -2220,13 +2217,6 @@ static void pre_sev_es_run(struct vcpu_svm *svm)
>          svm->ghcb_sa = NULL;
>          svm->ghcb_sa_free = false;
>      }
> -
> -    trace_kvm_vmgexit_exit(svm->vcpu.vcpu_id, svm->ghcb);
> -
> -    sev_es_sync_to_ghcb(svm);
> -
> -    kvm_vcpu_unmap(&svm->vcpu, &svm->ghcb_map, true);
> -    svm->ghcb = NULL;
>  }
>  
>  void pre_sev_run(struct vcpu_svm *svm, int cpu)
> @@ -2465,7 +2455,7 @@ int sev_handle_vmgexit(struct kvm_vcpu *vcpu)
>  
>      ret = sev_es_validate_vmgexit(svm);
>      if (ret)
> -        return ret;
> +        goto out_unmap;
>  
>      sev_es_sync_from_ghcb(svm);
>      ghcb_set_sw_exit_info_1(ghcb, 0);
> @@ -2485,6 +2485,7 @@ int sev_handle_vmgexit(struct kvm_vcpu *vcpu)
>          ret = svm_invoke_exit_handler(vcpu, SVM_EXIT_IRET);
>          break;
>      case SVM_VMGEXIT_AP_HLT_LOOP:
> +        ghcb_set_sw_exit_info_2(svm->ghcb, 1);
>          ret = kvm_emulate_ap_reset_hold(vcpu);
>          break;
>      case SVM_VMGEXIT_AP_JUMP_TABLE: {
> @@ -2531,6 +2521,11 @@ int sev_handle_vmgexit(struct kvm_vcpu *vcpu)
>          ret = svm_invoke_exit_handler(vcpu, exit_code);
>      }
>  
> +    sev_es_sync_to_ghcb(svm);
> +
> +out_unmap:
> +    kvm_vcpu_unmap(&svm->vcpu, &svm->ghcb_map, true);
> +    svm->ghcb = NULL;
>      return ret;
>  }
>  
> @@ -2619,21 +2620,4 @@ void sev_es_prepare_guest_switch(struct vcpu_svm
> *svm, unsigned int cpu)
>  
>  void sev_vcpu_deliver_sipi_vector(struct kvm_vcpu *vcpu, u8 vector)
>  {
> -    struct vcpu_svm *svm = to_svm(vcpu);
> -
> -    /* First SIPI: Use the values as initially set by the VMM */
> -    if (!svm->received_first_sipi) {
> -        svm->received_first_sipi = true;
> -        return;
> -    }
> -
> -    /*
> -     * Subsequent SIPI: Return from an AP Reset Hold VMGEXIT, where
> -     * the guest will set the CS and RIP. Set SW_EXIT_INFO_2 to a
> -     * non-zero value.
> -     */
> -    if (!svm->ghcb)
> -        return;
> -
> -    ghcb_set_sw_exit_info_2(svm->ghcb, 1);
>  }
> 
> However:
> 
> 1) I admit I got lost in the maze starting with sev_es_string_io

Yeah, kvm_sev_es_string_io bypasses the instruction emulation that would
normally be done to get register values and information about the port I/O
and supplies those directly. But in general, it follows the same exiting
to userspace to complete the port I/O operation.

> 
> 2) upon an AP reset hold exit, the above patch sets the EXITINFO2 field
> before the SIPI was received.  My understanding is that the processor will
> not see the value anyway until it resumes execution, and why would other
> vCPUs muck with the AP's GHCB.  But I'm not sure if it's okay.

As long as the vCPU might not be woken up for any reason other than a
SIPI, you can get a way with this. But if it was to be woken up for some
other reason (an IPI for some reason?), then you wouldn't want the
non-zero value set in the GHCB in advance, because that might cause the
vCPU to exit the HLT loop it is in waiting for the actual SIPI.

Thanks,
Tom

> 
> Paolo
> 



[Index of Archives]     [KVM ARM]     [KVM ia64]     [KVM ppc]     [Virtualization Tools]     [Spice Development]     [Libvirt]     [Libvirt Users]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite Questions]     [Linux Kernel]     [Linux SCSI]     [XFree86]

  Powered by Linux