On Wed, 2024-10-09 at 10:50 -0700, Sean Christopherson wrote:
> When lockdep is enabled, assert that KVM accesses the register caches if
> and only if cache fills are guaranteed to consume fresh data, i.e. when
> KVM is in control of the code sequence. Concretely, the caches can only
> be used from task context (synchronous) or when handling a PMI VM-Exit
> (asynchronous, but only in specific windows where the caches are in a
> known, stable state).
>
> Generally speaking, there are very few flows where reading register state
> from an asynchronous context is correct or even necessary. So, rather
> than trying to figure out a generic solution, simply disallow using the
> caches outside of task context by default, and deal with any future
> exceptions on a case-by-case basis _if_ they arise.
>
> Signed-off-by: Sean Christopherson <seanjc@xxxxxxxxxx>
> ---
>  arch/x86/kvm/kvm_cache_regs.h | 17 +++++++++++++++++
>  1 file changed, 17 insertions(+)
>
> diff --git a/arch/x86/kvm/kvm_cache_regs.h b/arch/x86/kvm/kvm_cache_regs.h
> index b1eb46e26b2e..36a8786db291 100644
> --- a/arch/x86/kvm/kvm_cache_regs.h
> +++ b/arch/x86/kvm/kvm_cache_regs.h
> @@ -43,6 +43,18 @@ BUILD_KVM_GPR_ACCESSORS(r14, R14)
>  BUILD_KVM_GPR_ACCESSORS(r15, R15)
>  #endif
>
> +/*
> + * Using the register cache from interrupt context is generally not allowed, as
> + * caching a register and marking it available/dirty can't be done atomically,
> + * i.e. accesses from interrupt context may clobber state or read stale data if
> + * the vCPU task is in the process of updating the cache. The exception is if
> + * KVM is handling a PMI IRQ/NMI VM-Exit, as that bound code sequence doesn't
> + * touch the cache, it runs after the cache is reset (post VM-Exit), and PMIs
> + * need to access several registers that are cacheable.
> + */
> +#define kvm_assert_register_caching_allowed(vcpu)	\
> +	lockdep_assert_once(in_task() || kvm_arch_pmi_in_guest(vcpu))
> +
>  /*
>   * avail  dirty
>   * 0	  0	  register in VMCS/VMCB
> @@ -53,24 +65,28 @@ BUILD_KVM_GPR_ACCESSORS(r15, R15)
>  static inline bool kvm_register_is_available(struct kvm_vcpu *vcpu,
>  					     enum kvm_reg reg)
>  {
> +	kvm_assert_register_caching_allowed(vcpu);
>  	return test_bit(reg, (unsigned long *)&vcpu->arch.regs_avail);
>  }
>
>  static inline bool kvm_register_is_dirty(struct kvm_vcpu *vcpu,
>  					 enum kvm_reg reg)
>  {
> +	kvm_assert_register_caching_allowed(vcpu);
>  	return test_bit(reg, (unsigned long *)&vcpu->arch.regs_dirty);
>  }
>
>  static inline void kvm_register_mark_available(struct kvm_vcpu *vcpu,
>  					       enum kvm_reg reg)
>  {
> +	kvm_assert_register_caching_allowed(vcpu);
>  	__set_bit(reg, (unsigned long *)&vcpu->arch.regs_avail);
>  }
>
>  static inline void kvm_register_mark_dirty(struct kvm_vcpu *vcpu,
>  					   enum kvm_reg reg)
>  {
> +	kvm_assert_register_caching_allowed(vcpu);
>  	__set_bit(reg, (unsigned long *)&vcpu->arch.regs_avail);
>  	__set_bit(reg, (unsigned long *)&vcpu->arch.regs_dirty);
>  }
> @@ -84,6 +100,7 @@ static inline void kvm_register_mark_dirty(struct kvm_vcpu *vcpu,
>  static __always_inline bool kvm_register_test_and_mark_available(struct kvm_vcpu *vcpu,
>  								 enum kvm_reg reg)
>  {
> +	kvm_assert_register_caching_allowed(vcpu);
>  	return arch___test_and_set_bit(reg, (unsigned long *)&vcpu->arch.regs_avail);
>  }
>

Using lockdep for purposes that aren't 100% lockdep's is a bit odd, but these
asserts do guard against real races, so I guess this is a fair use of the
assert.

Reviewed-by: Maxim Levitsky <mlevitsk@xxxxxxxxxx>

Best regards,
	Maxim Levitsky