On Wed, Nov 20, 2019 at 11:28:43AM -0800, Sean Christopherson wrote:
> On Wed, Nov 20, 2019 at 02:04:38PM -0500, Derek Yerger wrote:
> >
> > > Debug patch attached.  Hopefully it finds something, it took me an
> > > embarrassing number of attempts to get correct, I kept screwing up checking
> > > a bit number versus checking a bit mask...
> > > <0001-thread_info-Add-a-debug-hook-to-detect-FPU-changes-w.patch>
> >
> > Should this still be tested despite Wanpeng Li’s comments that the issue may
> > have been fixed in a 5.3 release candidate?
>
> Yes.
>
> The actual bug fix, commit e751732486eb3 (KVM: X86: Fix fpu state crash in
> kvm guest), is present in v5.2.7.
>
> Unless there's a subtlety I'm missing, commit d9a710e5fc4941 (KVM: X86:
> Dynamically allocate user_fpu) is purely an optimization and should not
> have a functional impact.

---

Any chance the below change fixes your issue?  It's a bug fix for AVX
corruption during signal delivery[*].  It doesn't seem like the same thing
you are seeing, but it's worth trying.

[*] https://lkml.kernel.org/r/20191127124243.u74osvlkhcmsskng@xxxxxxxxxxxxx/

 arch/x86/include/asm/fpu/internal.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/fpu/internal.h b/arch/x86/include/asm/fpu/internal.h
index 4c95c365058aa..44c48e34d7994 100644
--- a/arch/x86/include/asm/fpu/internal.h
+++ b/arch/x86/include/asm/fpu/internal.h
@@ -509,7 +509,7 @@ static inline void __fpu_invalidate_fpregs_state(struct fpu *fpu)

 static inline int fpregs_state_valid(struct fpu *fpu, unsigned int cpu)
 {
-	return fpu == this_cpu_read_stable(fpu_fpregs_owner_ctx) && cpu == fpu->last_cpu;
+	return fpu == this_cpu_read(fpu_fpregs_owner_ctx) && cpu == fpu->last_cpu;
 }
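
For context on why the read variant can matter: this_cpu_read_stable()
allows the compiler to reuse an earlier read of the per-CPU variable, while
this_cpu_read() forces a fresh load on every access.  The userspace sketch
below only models the effect of such a reused read on the validity check.
It is not kernel code, and it assumes the stale value is what causes the
corruption.  owner_ctx, state_valid() and the two helper functions are
hypothetical names, and struct fpu is cut down to the one field the check
uses.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, minimal stand-in for the kernel's struct fpu. */
struct fpu {
	unsigned int last_cpu;
};

/* Hypothetical stand-in for the per-CPU variable fpu_fpregs_owner_ctx. */
static struct fpu *owner_ctx;

/*
 * Same comparison as fpregs_state_valid(), with the owner passed in
 * explicitly so the caller picks a fresh or a previously cached value.
 */
static bool state_valid(const struct fpu *fpu, const struct fpu *owner,
			unsigned int cpu)
{
	return fpu == owner && cpu == fpu->last_cpu;
}

/*
 * Models the stale-read case: the owner pointer is sampled once, the
 * registers are then invalidated, and the check uses the old sample.
 */
static bool stale_read_reports_valid(void)
{
	struct fpu task = { .last_cpu = 0 };
	const struct fpu *cached;

	owner_ctx = &task;	/* registers hold this task's state */
	cached = owner_ctx;	/* models a read the compiler reused */
	owner_ctx = NULL;	/* registers invalidated in between */

	return state_valid(&task, cached, 0);
}

/* Same sequence, but the owner is re-read at the point of the check. */
static bool fresh_read_reports_valid(void)
{
	struct fpu task = { .last_cpu = 0 };

	owner_ctx = &task;
	owner_ctx = NULL;

	return state_valid(&task, owner_ctx, 0);
}
```

In this model the cached pointer still compares equal after the owner has
been cleared, so the check wrongly reports the register state as valid;
re-reading the owner gives the expected "not valid" answer.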