On Wed, 2024-10-09 at 10:50 -0700, Sean Christopherson wrote:
> When lockdep is enabled, assert that KVM accesses the register caches if
> and only if cache fills are guaranteed to consume fresh data, i.e. when
> KVM is in control of the code sequence. Concretely, the caches can only
> be used from task context (synchronous) or when handling a PMI VM-Exit
> (asynchronous, but only in specific windows where the caches are in a
> known, stable state).
>
> Generally speaking, there are very few flows where reading register state
> from an asynchronous context is correct or even necessary. So, rather
> than trying to figure out a generic solution, simply disallow using the
> caches outside of task context by default, and deal with any future
> exceptions on a case-by-case basis _if_ they arise.
>
> Signed-off-by: Sean Christopherson <seanjc@xxxxxxxxxx>
> ---
>  arch/x86/kvm/kvm_cache_regs.h | 17 +++++++++++++++++
>  1 file changed, 17 insertions(+)
>
> diff --git a/arch/x86/kvm/kvm_cache_regs.h b/arch/x86/kvm/kvm_cache_regs.h
> index b1eb46e26b2e..36a8786db291 100644
> --- a/arch/x86/kvm/kvm_cache_regs.h
> +++ b/arch/x86/kvm/kvm_cache_regs.h
> @@ -43,6 +43,18 @@ BUILD_KVM_GPR_ACCESSORS(r14, R14)
>  BUILD_KVM_GPR_ACCESSORS(r15, R15)
>  #endif
>
> +/*
> + * Using the register cache from interrupt context is generally not allowed, as
> + * caching a register and marking it available/dirty can't be done atomically,
> + * i.e. accesses from interrupt context may clobber state or read stale data if
> + * the vCPU task is in the process of updating the cache. The exception is if
> + * KVM is handling a PMI IRQ/NMI VM-Exit, as that bound code sequence doesn't
> + * touch the cache, it runs after the cache is reset (post VM-Exit), and PMIs
> + * need to access several registers that are cacheable.
> + */
> +#define kvm_assert_register_caching_allowed(vcpu)	\
> +	lockdep_assert_once(in_task() || kvm_arch_pmi_in_guest(vcpu))
> +
>  /*
>   * avail  dirty
>   * 0	  0	  register in VMCS/VMCB
> @@ -53,24 +65,28 @@ BUILD_KVM_GPR_ACCESSORS(r15, R15)
>  static inline bool kvm_register_is_available(struct kvm_vcpu *vcpu,
>  					     enum kvm_reg reg)
>  {
> +	kvm_assert_register_caching_allowed(vcpu);
>  	return test_bit(reg, (unsigned long *)&vcpu->arch.regs_avail);
>  }
>
>  static inline bool kvm_register_is_dirty(struct kvm_vcpu *vcpu,
>  					 enum kvm_reg reg)
>  {
> +	kvm_assert_register_caching_allowed(vcpu);
>  	return test_bit(reg, (unsigned long *)&vcpu->arch.regs_dirty);
>  }
>
>  static inline void kvm_register_mark_available(struct kvm_vcpu *vcpu,
>  					       enum kvm_reg reg)
>  {
> +	kvm_assert_register_caching_allowed(vcpu);
>  	__set_bit(reg, (unsigned long *)&vcpu->arch.regs_avail);
>  }
>
>  static inline void kvm_register_mark_dirty(struct kvm_vcpu *vcpu,
>  					   enum kvm_reg reg)
>  {
> +	kvm_assert_register_caching_allowed(vcpu);
>  	__set_bit(reg, (unsigned long *)&vcpu->arch.regs_avail);
>  	__set_bit(reg, (unsigned long *)&vcpu->arch.regs_dirty);
>  }
> @@ -84,6 +100,7 @@ static inline void kvm_register_mark_dirty(struct kvm_vcpu *vcpu,
>  static __always_inline bool kvm_register_test_and_mark_available(struct kvm_vcpu *vcpu,
>  								 enum kvm_reg reg)
>  {
> +	kvm_assert_register_caching_allowed(vcpu);
>  	return arch___test_and_set_bit(reg, (unsigned long *)&vcpu->arch.regs_avail);
>  }
>

Using lockdep for purposes that aren't 100% lockdep's is a bit odd, but these
asserts do guard against real races, so I guess this is a fair use of the
assert.

Reviewed-by: Maxim Levitsky <mlevitsk@xxxxxxxxxx>

Best regards,
	Maxim Levitsky