RE: An emulation failure occurs if I hotplug vcpus immediately after the VM start

Hi Igor,

First of all, thanks for your response. :)

> -----Original Message-----
> From: Igor Mammedov [mailto:imammedo@xxxxxxxxxx]
> Sent: Friday, June 01, 2018 6:23 PM
> 
> On Fri, 1 Jun 2018 08:17:12 +0000
> xuyandong <xuyandong2@xxxxxxxxxx> wrote:
> 
> > Hi there,
> >
> > I am doing some test on qemu vcpu hotplug and I run into some trouble.
> > An emulation failure occurs and qemu prints the following msg:
> >
> > KVM internal error. Suberror: 1
> > emulation failure
> > EAX=00000000 EBX=00000000 ECX=00000000 EDX=00000600
> > ESI=00000000 EDI=00000000 EBP=00000000 ESP=0000fff8
> > EIP=0000ff53 EFL=00010082 [--S----] CPL=0 II=0 A20=1 SMM=0 HLT=0
> > ES =0000 00000000 0000ffff 00009300
> > CS =f000 000f0000 0000ffff 00009b00
> > SS =0000 00000000 0000ffff 00009300
> > DS =0000 00000000 0000ffff 00009300
> > FS =0000 00000000 0000ffff 00009300
> > GS =0000 00000000 0000ffff 00009300
> > LDT=0000 00000000 0000ffff 00008200
> > TR =0000 00000000 0000ffff 00008b00
> > GDT=     00000000 0000ffff
> > IDT=     00000000 0000ffff
> > CR0=60000010 CR2=00000000 CR3=00000000 CR4=00000000
> > DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000 DR3=0000000000000000
> > DR6=00000000ffff0ff0 DR7=0000000000000400
> > EFER=0000000000000000
> > Code=31 d2 eb 04 66 83 ca ff 66 89 d0 66 5b 66 c3 66 89 d0 66 c3 <cf> 66 68 21 8a 00 00 e9 08 d7 66 56 66 53 66 83 ec 0c 66 89 c3 66 e8 ce 7b ff ff 66 89 c6
> >
> > I notice that the guest is still running SeaBIOS in real mode when the vcpu
> > has just been plugged.
> > This emulation failure can be steadily reproduced if I hotplug a vcpu
> > during the VM launch process.
> > After some digging, I find that this KVM internal error shows up because
> > KVM cannot emulate some MMIO (gpa 0xfff53).
> >
> > So I am confused,
> > (1) Does QEMU support vcpu hotplug even if the guest is running SeaBIOS?
> There is no code that forbids it, and I would expect it not to trigger an
> error and to be a NOP.
> 
> > (2) The gpa (0xfff53) is an address in the BIOS ROM section, so why does
> > KVM incorrectly treat it as an MMIO address?
> A KVM trace and the BIOS debug log might give more information to guess
> where to look. Even better would be to debug SeaBIOS and find out what
> exactly goes wrong, if you could do it.

This issue can't be reproduced when we enable the SeaBIOS debug log or KVM tracing. :(

After a few days of debugging, we found that this problem occurs every time
the memory region is cleared (memory_size is 0) and the VFIO device is hot-plugged.

The key function is kvm_set_user_memory_region(); I added some logging to it:

gonglei********: mem.slot: 3, mem.guest_phys_addr=0xc0000, mem.userspace_addr=0x7fc751e00000, mem.flags=0, memory_size=0x20000
gonglei********: mem.slot: 3, mem.guest_phys_addr=0xc0000, mem.userspace_addr=0x7fc751e00000, mem.flags=0, memory_size=0x0
gonglei********: mem.slot: 3, mem.guest_phys_addr=0xc0000, mem.userspace_addr=0x7fc343ec0000, mem.flags=0, memory_size=0x10000
gonglei********: mem.slot: 3, mem.guest_phys_addr=0xc0000, mem.userspace_addr=0x7fc343ec0000, mem.flags=0, memory_size=0x0
gonglei********: mem.slot: 3, mem.guest_phys_addr=0xc0000, mem.userspace_addr=0x7fc343ec0000, mem.flags=0, memory_size=0xbff40000
gonglei********: mem.slot: 3, mem.guest_phys_addr=0xc0000, mem.userspace_addr=0x7fc343ec0000, mem.flags=0, memory_size=0x0
gonglei********: mem.slot: 3, mem.guest_phys_addr=0xc0000, mem.userspace_addr=0x7fc343ec0000, mem.flags=0, memory_size=0xbff40000
gonglei********: mem.slot: 3, mem.guest_phys_addr=0xc0000, mem.userspace_addr=0x7fc343ec0000, mem.flags=0, memory_size=0x0
gonglei********: mem.slot: 3, mem.guest_phys_addr=0xc0000, mem.userspace_addr=0x7fc343ec0000, mem.flags=0, memory_size=0x9000

When the memory region is cleared, KVM marks the slot as invalid
(its flags are set to KVM_MEMSLOT_INVALID).

If SeaBIOS accesses this memory and causes a page fault, KVM looks up the gfn
(via __gfn_to_pfn_memslot), gets an invalid value because the slot is invalid,
and finally returns a failure.

The call chain in KVM is as follows:

kvm_mmu_page_fault
	tdp_page_fault
		try_async_pf
			__gfn_to_pfn_memslot
		__direct_map // return true;
	x86_emulate_instruction
		handle_emulation_failure

The call chain in QEMU is as follows:

Breakpoint 1, kvm_set_user_memory_region (kml=0x564aa1e2c890, slot=0x564aa1e2d230) at /mnt/sdb/gonglei/qemu/kvm-all.c:261
(gdb) bt
#0  kvm_set_user_memory_region (kml=0x564aa1e2c890, slot=0x564aa1e2d230) at /mnt/sdb/gonglei/qemu/kvm-all.c:261
#1  0x0000564a9e7e3096 in kvm_set_phys_mem (kml=0x564aa1e2c890, section=0x7febeb296500, add=false) at /mnt/sdb/gonglei/qemu/kvm-all.c:887
#2  0x0000564a9e7e34c7 in kvm_region_del (listener=0x564aa1e2c890, section=0x7febeb296500) at /mnt/sdb/gonglei/qemu/kvm-all.c:999
#3  0x0000564a9e7ea884 in address_space_update_topology_pass (as=0x564a9f2b2640 <address_space_memory>, old_view=0x7febdc3449c0, new_view=0x7febdc3443c0, adding=false)
    at /mnt/sdb/gonglei/qemu/memory.c:849
#4  0x0000564a9e7eac49 in address_space_update_topology (as=0x564a9f2b2640 <address_space_memory>) at /mnt/sdb/gonglei/qemu/memory.c:890
#5  0x0000564a9e7ead40 in memory_region_commit () at /mnt/sdb/gonglei/qemu/memory.c:933
#6  0x0000564a9e7eae26 in memory_region_transaction_commit () at /mnt/sdb/gonglei/qemu/memory.c:950
#7  0x0000564a9ea53e05 in i440fx_update_memory_mappings (d=0x564aa2089280) at hw/pci-host/piix.c:155
#8  0x0000564a9ea53ea3 in i440fx_write_config (dev=0x564aa2089280, address=88, val=286330880, len=4) at hw/pci-host/piix.c:168
#9  0x0000564a9ea63ad1 in pci_host_config_write_common (pci_dev=0x564aa2089280, addr=88, limit=256, val=286330880, len=4) at hw/pci/pci_host.c:66
#10 0x0000564a9ea63bf8 in pci_data_write (s=0x564aa2038d70, addr=2147483736, val=286330880, len=4) at hw/pci/pci_host.c:100
#11 0x0000564a9ea63d1a in pci_host_data_write (opaque=0x564aa2036ae0, addr=0, val=286330880, len=4) at hw/pci/pci_host.c:153
#12 0x0000564a9e7e9384 in memory_region_write_accessor (mr=0x564aa2036ee0, addr=0, value=0x7febeb2967c8, size=4, shift=0, mask=4294967295, attrs=...)
    at /mnt/sdb/gonglei/qemu/memory.c:527
#13 0x0000564a9e7e958f in access_with_adjusted_size (addr=0, value=0x7febeb2967c8, size=4, access_size_min=1, access_size_max=4, access=
    0x564a9e7e92a3 <memory_region_write_accessor>, mr=0x564aa2036ee0, attrs=...) at /mnt/sdb/gonglei/qemu/memory.c:593
#14 0x0000564a9e7ebd00 in memory_region_dispatch_write (mr=0x564aa2036ee0, addr=0, data=286330880, size=4, attrs=...) at /mnt/sdb/gonglei/qemu/memory.c:1338
#15 0x0000564a9e798cda in address_space_write_continue (as=0x564a9f2b2520 <address_space_io>, addr=3324, attrs=..., buf=0x7febf8170000 "", len=4, addr1=0, l=4, 
    mr=0x564aa2036ee0) at /mnt/sdb/gonglei/qemu/exec.c:3167
#16 0x0000564a9e798e99 in address_space_write (as=0x564a9f2b2520 <address_space_io>, addr=3324, attrs=..., buf=0x7febf8170000 "", len=4)
    at /mnt/sdb/gonglei/qemu/exec.c:3224
#17 0x0000564a9e7991c7 in address_space_rw (as=0x564a9f2b2520 <address_space_io>, addr=3324, attrs=..., buf=0x7febf8170000 "", len=4, is_write=true)
    at /mnt/sdb/gonglei/qemu/exec.c:3326
#18 0x0000564a9e7e55f6 in kvm_handle_io (port=3324, attrs=..., data=0x7febf8170000, direction=1, size=4, count=1) at /mnt/sdb/gonglei/qemu/kvm-all.c:1995
#19 0x0000564a9e7e5c35 in kvm_cpu_exec (cpu=0x564aa1e63580) at /mnt/sdb/gonglei/qemu/kvm-all.c:2189
#20 0x0000564a9e7ccc1a in qemu_kvm_cpu_thread_fn (arg=0x564aa1e63580) at /mnt/sdb/gonglei/qemu/cpus.c:1078
#21 0x0000564a9ec4afcb in qemu_thread_start (args=0x564aa1e8b490) at util/qemu-thread-posix.c:496
#22 0x00007febf4efedc5 in start_thread () from /lib64/libpthread.so.0
#23 0x00007febf1ce079d in clone () from /lib64/libc.so.6

So, my questions are:

1) Why don't we hold kvm->slots_lock during page fault processing?

2) How do we ensure that vcpus will not access the corresponding
  region while a memory slot is being deleted?


Thanks,
-Gonglei


