Re: [Qemu-devel] E5-2620v2 - emulation stop error

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, Mar 11, 2015 at 03:53:07PM +0000, Dr. David Alan Gilbert wrote:
> * Kevin O'Connor (kevin@xxxxxxxxxxxx) wrote:
> > On Wed, Mar 11, 2015 at 01:45:57PM +0000, Dr. David Alan Gilbert wrote:
> > > * Bandan Das (bsd@xxxxxxxxxx) wrote:
> > > > "Dr. David Alan Gilbert" <dgilbert@xxxxxxxxxx> writes:
> > > > > while true; do (sleep 5; echo -e '\001cq\n')|/opt/qemu-try-world3/bin/qemu-system-x86_64 -machine pc-i440fx-2.0,accel=kvm -m 1024 -smp 128 -nographic -device sga 2>&1 | tee /tmp/qemu.op; grep "internal error" /tmp/qemu.op -q && break; done

That is a truly impressive command line, BTW.

> > > > > [root@virtlab413 qemu-world3]# git bisect bad
> > > > > 21f5826a04d38e19488f917e1eef22751490c769 is the first bad commit
> > > > 
> > > > I can reproduce this on E5-2620 v2 with  David's "while true" test.
> > > > (The emulation failure I mean, not the suberror 2 that Andrey is seeing)
> > > > The commit that seems to have introduced this is -
> > > > 
> > > > commit 0673b7870063a3affbad9046fb6d385a4e734c19
> > > > Author: Kevin O'Connor <kevin@xxxxxxxxxxxx>
> > > > Date:   Sat May 24 10:49:50 2014 -0400
> > > > 
> > > >     smp: Replace QEMU SMP init assembler code with C; run only in 32bit mode.
> > [...]
> > > Turning on debug logging
> > > ( -chardev file,id=log,path=/tmp/debugcon.$$ -device isa-debugcon,chardev=log,iobase=0x402 )
> > > 
> > > SeaBIOS (version rel-1.8.0-0-g4c59f5d-20150219_092859-nilsson.home.kraxel.org)
> > [...]
> > > Found 1 cpu(s) max supported 128 cpu(s)
> > 
> > Something is very odd here.  When I run the above command (on an older
> > AMD machine) I get:
> > 
> > Found 128 cpu(s) max supported 128 cpu(s)
> > 
> > That first value (1 vs 128) comes from QEMU (via cmos index 0x5f).
> > That is, during smp init, SeaBIOS expects QEMU to tell it how many
> > cpus are active, and SeaBIOS waits until that many CPUs check in from
> > its SIPI request before proceeding.
> > 
> > I wonder if QEMU reported only 1 active cpu via that cmos register,
> > but more were actually active.  If that was the case, it could
> > certainly explain the failure - as multiple cpus could be running
> > without the sipi trapoline in place.
> > 
> > What does the log look like on a non-failure case?
> 
> I had to drop down from 128 to get a working run with debug; here
> are two runs with -smp 20   the first one worked, the second one
> failed.
[...]
> =========== Working ===========
> 
> SeaBIOS (version rel-1.8.0-0-g4c59f5d-20150219_092859-nilsson.home.kraxel.org)
[...]
> Found 20 cpu(s) max supported 20 cpu(s)
[...]
> =========== Broken ===========
> 
> SeaBIOS (version rel-1.8.0-0-g4c59f5d-20150219_092859-nilsson.home.kraxel.org)
[...]
> Found 1 cpu(s) max supported 20 cpu(s)

So, I couldn't get this to fail on my older AMD machine at all with
the default SeaBIOS code.  But, when I change the code with the patch
below, it failed right away.

KVM internal error. Suberror: 1
emulation failure
EAX=00000000 EBX=00000000 ECX=00000000 EDX=000fd2b8
ESI=00000000 EDI=00000000 EBP=00000000 ESP=00000000
EIP=000fd2c1 EFL=00000007 [-----PC] CPL=0 II=0 A20=1 SMM=0 HLT=0
ES =0010 00000000 ffffffff 00c09300 DPL=0 DS   [-WA]
CS =0008 00000000 ffffffff 00c09b00 DPL=0 CS32 [-RA]
SS =0010 00000000 ffffffff 00c09300 DPL=0 DS   [-WA]
DS =0010 00000000 ffffffff 00c09300 DPL=0 DS   [-WA]
FS =0010 00000000 ffffffff 00c09300 DPL=0 DS   [-WA]
GS =0010 00000000 ffffffff 00c09300 DPL=0 DS   [-WA]
LDT=0000 00000000 0000ffff 00008200 DPL=0 LDT
TR =0000 00000000 0000ffff 00008300 DPL=0 TSS16-busy
GDT=     000f6a50 00000037
IDT=     000f6a8e 00000000
CR0=60000011 CR2=00000000 CR3=00000000 CR4=00000000
DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000 DR3=0000000000000000 
DR6=00000000ffff0ff0 DR7=0000000000000400
EFER=0000000000000000
Code=66 ba b8 d2 0f 00 e9 a2 fe f3 90 f0 0f ba 2d 04 ff fb 3f 00 <72> f3 8b 25 00 ff fb 3f e8 d2 65 ff ff c7 05 04 ff fb 3f 00 00 00 00 f4 eb fd fa fc 66 b8

And the failed debug output looks like:

SeaBIOS (version rel-1.8.0-7-gd23eba6-dirty-20150311_121819-morn.localdomain)
[...]
cmos_smp_count0=20
[...]
cmos_smp_count=1
cmos_smp_count2=1/20
Found 1 cpu(s) max supported 20 cpu(s)

I'm going to check the assembly for a compiler error, but is it
possible QEMU is returning incorrect data in cmos index 0x5f?

David, any chance you can recompile seabios and double check your
output?

-Kevin


--- a/src/fw/smp.c
+++ b/src/fw/smp.c
@@ -128,6 +128,7 @@ smp_setup(void)
 
     // Wait for other CPUs to process the SIPI.
     u8 cmos_smp_count = rtc_read(CMOS_BIOS_SMP_COUNT) + 1;
+    dprintf(1, "cmos_smp_count=%d\n", cmos_smp_count);
     while (cmos_smp_count != CountCPUs)
         asm volatile(
             // Release lock and allow other processors to use the stack.
@@ -140,6 +141,8 @@ smp_setup(void)
             : "+m" (SMPLock), "+m" (SMPStack)
             : : "cc", "memory");
     yield();
+    dprintf(1, "cmos_smp_count2=%d/%d\n", cmos_smp_count
+            , rtc_read(CMOS_BIOS_SMP_COUNT) + 1);
 
     // Restore memory.
     *(u64*)BUILD_AP_BOOT_ADDR = old;
diff --git a/src/post.c b/src/post.c
index 9ea5620..dc11c72 100644
--- a/src/post.c
+++ b/src/post.c
@@ -170,6 +170,7 @@ platform_hardware_setup(void)
     clock_setup();
 
     // Platform specific setup
+    dprintf(1, "cmos_smp_count0=%d\n", rtc_read(CMOS_BIOS_SMP_COUNT) + 1);
     qemu_platform_setup();
     coreboot_platform_setup();
 }
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html




[Index of Archives]     [KVM ARM]     [KVM ia64]     [KVM ppc]     [Virtualization Tools]     [Spice Development]     [Libvirt]     [Libvirt Users]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite Questions]     [Linux Kernel]     [Linux SCSI]     [XFree86]
  Powered by Linux