Re: PROBLEM: Regression of MMU causing guest VM application errors

On Wed, 20 Nov 2019 at 04:03, Sean Christopherson
<sean.j.christopherson@xxxxxxxxx> wrote:
>
> On Wed, Oct 30, 2019 at 11:44:09PM -0400, Derek Yerger wrote:
> >
> > On 10/24/19 1:32 PM, Sean Christopherson wrote:
> > >On Thu, Oct 24, 2019 at 11:18:59AM -0400, Derek Yerger wrote:
> > >>On 10/22/19 4:28 PM, Sean Christopherson wrote:
> > >>>On Thu, Oct 17, 2019 at 07:57:35PM -0400, Derek Yerger wrote:
> > >>>Heh, should've checked from the get go...  It's definitely not the memslot
> > >>>issue, because the memslot bug is in 5.1.16 as well.  :-)
> > >>I didn't pick up on that, nice catch. The memslot thread was the closest
> > >>thing I could find to an educated guess.
> > >>>>I'm stuck on 5.1.x for now, maybe I'll give up and get a dedicated windows
> > >>>>machine /s
> > >>>What hardware are you running on?  I was thinking this was AMD specific,
> > >>>but then realized you said "AMD Radeon 540 GPU" and not "AMD CPU".
> > >>Intel(R) Core(TM) i7-6700K CPU @ 4.00GHz
> > >>
> > >>07:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI]
> > >>Lexa PRO [Radeon 540/540X/550/550X / RX 540X/550/550X] (rev c7)
> > >>         Subsystem: Gigabyte Technology Co., Ltd Device 22fe
> > >>         Kernel driver in use: vfio-pci
> > >>         Kernel modules: amdgpu
> > >>(plus related audio device)
> > >>
> > >>I can't think of any other data points that would be helpful to solving
> > >>system instability in a guest OS.
> > >Can you bisect starting from v5.2?  Identifying which commit in the kernel
> > >introduced the regression would help immensely.
> > On the host, I have to install NVIDIA GPU drivers with each new kernel
> > build. During the process I discovered that I can't reproduce the issue on
> > any kernel if I skip the *host* GPU drivers and start libvirtd in single
> > mode.
> >
> > I noticed the following in the host kernel log around the time the guest
> > encountered BSOD on 5.2.7:
> >
> > [  337.841491] WARNING: CPU: 6 PID: 7548 at arch/x86/kvm/x86.c:7963
> > kvm_arch_vcpu_ioctl_run+0x19b1/0x1b00 [kvm]
>
> Rats, I overlooked this first time round.  In the future, if you get a
> WARN splat, try to make it very obvious in the bug report, they're almost
> always a smoking gun.
>
> That WARN that fired is:
>
>         /* The preempt notifier should have taken care of the FPU already.  */
>         WARN_ON_ONCE(test_thread_flag(TIF_NEED_FPU_LOAD));
>
> which was added as part of a bug fix by commit:
>
>         240c35a3783a ("kvm: x86: Use task structs fpu field for user")
>
> the buggy commit that was fixed is
>
>         5f409e20b794 ("x86/fpu: Defer FPU state load until return to userspace")
>
> which was part of an FPU rewrite that went into 5.2[*].  So yep, big
> smoking gun :-)

Since 5.3-rc2, we have three commits that fix it:

commit ec269475cba7bc (Revert "kvm: x86: Use task structs fpu field for user")
commit e751732486eb3 (KVM: X86: Fix fpu state crash in kvm guest)
commit d9a710e5fc4941 (KVM: X86: Dynamically allocate user_fpu)

    Wanpeng


