On 01/08/2017 23:00, David Matlack wrote: > According to the Intel SDM, software cannot rely on the current VMCS to be > coherent after a VMXOFF or shutdown. So this is a valid way to handle VMCS12 > flushes. > > 24.11.1 Software Use of Virtual-Machine Control Structures > ... > If a logical processor leaves VMX operation, any VMCSs active on > that logical processor may be corrupted (see below). To prevent > such corruption of a VMCS that may be used either after a return > to VMX operation or on another logical processor, software should > execute VMCLEAR for that VMCS before executing the VMXOFF instruction > or removing power from the processor (e.g., as part of a transition > to the S3 and S4 power states). > ... > > This fixes a "suspicious rcu_dereference_check() usage!" warning during > kvm_vm_release() because nested_release_vmcs12() calls > kvm_vcpu_write_guest_page() without holding kvm->srcu. > > Signed-off-by: David Matlack <dmatlack@xxxxxxxxxx> > --- > This patch applies on top of Paolo's "[PATCH] KVM: nVMX: do not pin the VMCS12". > (http://www.mail-archive.com/linux-kernel@xxxxxxxxxxxxxxx/msg1455166.html) Thanks, I think Radim should first apply the RCU-on-teardown patch (which I'll resend formally today), then "do not pin the VMCS12", then these two. Reviewed-by: Paolo Bonzini <pbonzini@xxxxxxxxxx> > > arch/x86/kvm/vmx.c | 16 +++++++++++----- > 1 file changed, 11 insertions(+), 5 deletions(-) > > diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c > index 5c03340f7827..07d2198db225 100644 > --- a/arch/x86/kvm/vmx.c > +++ b/arch/x86/kvm/vmx.c > @@ -419,7 +419,7 @@ struct nested_vmx { > /* > * Cache of the guest's VMCS, existing outside of guest memory. > * Loaded from guest memory during VMPTRLD. Flushed to guest > - * memory during VMXOFF, VMCLEAR, VMPTRLD. > + * memory during VMCLEAR and VMPTRLD. > */ > struct vmcs12 *cached_vmcs12; > /* > @@ -7131,6 +7131,12 @@ static int nested_vmx_check_permission(struct kvm_vcpu *vcpu) > return 1; > } > > +static void vmx_disable_shadow_vmcs(struct vcpu_vmx *vmx) > +{ > + vmcs_clear_bits(SECONDARY_VM_EXEC_CONTROL, SECONDARY_EXEC_SHADOW_VMCS); > + vmcs_write64(VMCS_LINK_POINTER, -1ull); > +} > + > static inline void nested_release_vmcs12(struct vcpu_vmx *vmx) > { > if (vmx->nested.current_vmptr == -1ull) > @@ -7141,9 +7147,7 @@ static inline void nested_release_vmcs12(struct vcpu_vmx *vmx) > they were modified */ > copy_shadow_to_vmcs12(vmx); > vmx->nested.sync_shadow_vmcs = false; > - vmcs_clear_bits(SECONDARY_VM_EXEC_CONTROL, > - SECONDARY_EXEC_SHADOW_VMCS); > - vmcs_write64(VMCS_LINK_POINTER, -1ull); > + vmx_disable_shadow_vmcs(vmx); > } > vmx->nested.posted_intr_nv = -1; > > @@ -7166,12 +7170,14 @@ static void free_nested(struct vcpu_vmx *vmx) > > vmx->nested.vmxon = false; > free_vpid(vmx->nested.vpid02); > - nested_release_vmcs12(vmx); > + vmx->nested.posted_intr_nv = -1; > + vmx->nested.current_vmptr = -1ull; > if (vmx->nested.msr_bitmap) { > free_page((unsigned long)vmx->nested.msr_bitmap); > vmx->nested.msr_bitmap = NULL; > } > if (enable_shadow_vmcs) { > + vmx_disable_shadow_vmcs(vmx); > vmcs_clear(vmx->vmcs01.shadow_vmcs); > free_vmcs(vmx->vmcs01.shadow_vmcs); > vmx->vmcs01.shadow_vmcs = NULL; >