Re: [PATCH RESEND] KVM: X86: Fix load bad host fpu state

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



2017-12-13 0:16 GMT+08:00 Paolo Bonzini <pbonzini@xxxxxxxxxx>:
> On 12/12/2017 06:40, Wanpeng Li wrote:
>> 2017-12-12 11:36 GMT+08:00 Peter Xu <peterx@xxxxxxxxxx>:
>>> On Tue, Dec 12, 2017 at 05:51:26AM +0800, Wanpeng Li wrote:
>>>> 2017-12-12 4:48 GMT+08:00 David Hildenbrand <david@xxxxxxxxxx>:
>>>>> On 10.12.2017 22:44, Wanpeng Li wrote:
>>>>>> From: Wanpeng Li <wanpeng.li@xxxxxxxxxxx>
>>>>>>
>>>>>> ------------[ cut here ]------------
>>>>>>  Bad FPU state detected at kvm_put_guest_fpu+0xd8/0x2d0 [kvm], reinitializing FPU registers.
>>>>>>  WARNING: CPU: 1 PID: 4594 at arch/x86/mm/extable.c:103 ex_handler_fprestore+0x88/0x90
>>>>>>  CPU: 1 PID: 4594 Comm: qemu-system-x86 Tainted: G    B      OE    4.15.0-rc2+ #10
>>>>>>  RIP: 0010:ex_handler_fprestore+0x88/0x90
>>>>>>  Call Trace:
>>>>>>   fixup_exception+0x4e/0x60
>>>>>>   do_general_protection+0xff/0x270
>>>>>>   general_protection+0x22/0x30
>>>>>>  RIP: 0010:kvm_put_guest_fpu+0xd8/0x2d0 [kvm]
>>>>>>  RSP: 0018:ffff8803d5627810 EFLAGS: 00010246
>>>>>>   kvm_vcpu_reset+0x3b4/0x3c0 [kvm]
>>>>>>   kvm_apic_accept_events+0x1c0/0x240 [kvm]
>>>>>>   kvm_arch_vcpu_ioctl_run+0x1658/0x2fb0 [kvm]
>>>>>>   kvm_vcpu_ioctl+0x479/0x880 [kvm]
>>>>>>   do_vfs_ioctl+0x142/0x9a0
>>>>>>   SyS_ioctl+0x74/0x80
>>>>>>   do_syscall_64+0x15f/0x600
>>>>>>
>>>>>> This can be reproduced by running any testcase in kvm-unit-tests since
>>>>>> the qemu userspace FPU context is not initialized, which results in the
>>>>>> init path from kvm_apic_accept_events() will load/put qemu userspace
>>>>>> FPU context w/o initialized. In addition, w/o this splatting we still
>>>>>> should initialize vcpu->arch.user_fpu instead of current->thread.fpu.
>>>>>> This patch fixes it by initializing qemu user FPU context if it is
>>>>>> uninitialized before KVM_RUN.
>>>>>>
>>>>>> Cc: Paolo Bonzini <pbonzini@xxxxxxxxxx>
>>>>>> Cc: Radim Krčmář <rkrcmar@xxxxxxxxxx>
>>>>>> Cc: Rik van Riel <riel@xxxxxxxxxx>
>>>>>> Cc: stable@xxxxxxxxxxxxxxx
>>>>>> Fixes: f775b13eedee (x86,kvm: move qemu/guest FPU switching out to vcpu_run)
>>>>>> Signed-off-by: Wanpeng Li <wanpeng.li@xxxxxxxxxxx>
>>>>>> ---
>>>>>>  arch/x86/kvm/x86.c | 7 +++++--
>>>>>>  1 file changed, 5 insertions(+), 2 deletions(-)
>>>>>>
>>>>>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>>>>>> index a92b22f..063a643 100644
>>>>>> --- a/arch/x86/kvm/x86.c
>>>>>> +++ b/arch/x86/kvm/x86.c
>>>>>> @@ -7273,10 +7273,13 @@ static int complete_emulated_mmio(struct kvm_vcpu *vcpu)
>>>>>>
>>>>>>  int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
>>>>>>  {
>>>>>> -     struct fpu *fpu = &current->thread.fpu;
>>>>>> +     struct fpu *fpu = &vcpu->arch.user_fpu;
>>>>>>       int r;
>>>>>>
>>>>>> -     fpu__initialize(fpu);
>>>>>> +     if (!fpu->initialized) {
>>>>>> +             fpstate_init(&fpu->state);
>>>>>> +             fpu->initialized = 1;
>>>>>> +     }
>>>>>
>>>>> Is there a chance of keeping using fpu__initialize() ? Duplicating the
>>>>> code is ugly.
>>>>
>>>> There is a warning in fpu__initialize() which results in just
>>>> current->thread.fpu can take advantage of.
>>>>
>>>>>
>>>>> E.g. can't we simply initialize that in kvm_load_guest_fpu?
>>>>
>>>> We still miss to initialize qemu user FPU context for the above calltrace.
>>>
>>> IMHO we should not really init the user FPU since we should always
>>> load FPU then put FPU.  The problem now is that for vcpus that with
>>> vcpu_id>1 we'll first put the FPU before loading it. So, instead how
>>> about we make sure we load the FPU first even for non-bootstrap vcpus?
>>> And we can actually drop fpu__initialize() call, like:
>>
>> It will introduce extra overhead for all the cases which can't enter
>> into vcpu_run(), I think move
>> fpstate_init(&vcpu->arch.user_fpu.state); to fx_init() is better.
>
> Those cases with a sleeping AP are so rare that they don't matter.  They
> will occur only a few times per boot.  Peter's solution is right, I've
> queued it.

I add a trace_printk here to test Peter's solution:

        if (unlikely(vcpu->arch.mp_state == KVM_MP_STATE_UNINITIALIZED)) {
                if (kvm_run->immediate_exit) {
                        r = -EINTR;
                        goto out;
                }
                kvm_vcpu_block(vcpu);
                kvm_apic_accept_events(vcpu);
                kvm_clear_request(KVM_REQ_UNHALT, vcpu);
                r = -EAGAIN;
                if (signal_pending(current)) {
                        r = -EINTR;
                        vcpu->run->exit_reason = KVM_EXIT_INTR;
                        ++vcpu->stat.signal_exits;
                }
                trace_printk("load/put make no sense\n");
                goto out;
        }

I can observe 92339 times "load/put make no sense" in the log during a
32 vCPUs guest booting on a 32 pCPUs Xeon Skylake server. The print
frequency is as below:

           <...>-207694 [016] .... 1021785.120346:
kvm_arch_vcpu_ioctl_run: load/put make no sense
           <...>-207704 [031] .... 1021785.120347:
kvm_arch_vcpu_ioctl_run: load/put make no sense
           <...>-207710 [005] .... 1021785.120349:
kvm_arch_vcpu_ioctl_run: load/put make no sense
           <...>-207687 [002] .... 1021785.120351:
kvm_arch_vcpu_ioctl_run: load/put make no sense
           <...>-207709 [018] .... 1021785.120353:
kvm_arch_vcpu_ioctl_run: load/put make no sense
           <...>-207687 [002] .... 1021785.120354:
kvm_arch_vcpu_ioctl_run: load/put make no sense
           <...>-207692 [004] .... 1021785.120354:
kvm_arch_vcpu_ioctl_run: load/put make no sense
           <...>-207698 [015] .... 1021785.120358:
kvm_arch_vcpu_ioctl_run: load/put make no sense
           <...>-207702 [029] .... 1021785.120358:
kvm_arch_vcpu_ioctl_run: load/put make no sense
           <...>-207715 [026] .... 1021785.120361:
kvm_arch_vcpu_ioctl_run: load/put make no sense
           <...>-207707 [013] .... 1021785.120362:
kvm_arch_vcpu_ioctl_run: load/put make no sense
           <...>-207712 [022] .... 1021785.120365:
kvm_arch_vcpu_ioctl_run: load/put make no sense
           <...>-207711 [006] .... 1021785.120365:
kvm_arch_vcpu_ioctl_run: load/put make no sense
           <...>-207695 [000] .... 1021785.120367:
kvm_arch_vcpu_ioctl_run: load/put make no sense
           <...>-207689 [027] .... 1021785.120367:
kvm_arch_vcpu_ioctl_run: load/put make no sense
           <...>-207703 [011] .... 1021785.120368:
kvm_arch_vcpu_ioctl_run: load/put make no sense
           <...>-207708 [007] .... 1021785.120370:
kvm_arch_vcpu_ioctl_run: load/put make no sense
           <...>-207692 [004] .... 1021785.120371:
kvm_arch_vcpu_ioctl_run: load/put make no sense
           <...>-207708 [007] .... 1021785.120372:
kvm_arch_vcpu_ioctl_run: load/put make no sense
           <...>-207708 [007] .... 1021785.120373:
kvm_arch_vcpu_ioctl_run: load/put make no sense
           <...>-207698 [015] .... 1021785.120374:
kvm_arch_vcpu_ioctl_run: load/put make no sense
           <...>-207713 [012] .... 1021785.120375:
kvm_arch_vcpu_ioctl_run: load/put make no sense
           <...>-207690 [028] .... 1021785.120376:
kvm_arch_vcpu_ioctl_run: load/put make no sense

I will send out the fx_init() solution, you can pick it if it makes sense. :)

Regards,
Wanpeng Li




[Index of Archives]     [Linux Kernel]     [Kernel Development Newbies]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite Hiking]     [Linux Kernel]     [Linux SCSI]