"Kevin O'Connor" <kevin@xxxxxxxxxxxx> writes: > On Wed, Mar 11, 2015 at 01:09:42PM -0400, Bandan Das wrote: >> "Kevin O'Connor" <kevin@xxxxxxxxxxxx> writes: >> ... >> > >> > Something is very odd here. When I run the above command (on an older >> > AMD machine) I get: >> > >> > Found 128 cpu(s) max supported 128 cpu(s) >> > >> > That first value (1 vs 128) comes from QEMU (via cmos index 0x5f). >> > That is, during smp init, SeaBIOS expects QEMU to tell it how many >> > cpus are active, and SeaBIOS waits until that many CPUs check in from >> > its SIPI request before proceeding. >> > >> > I wonder if QEMU reported only 1 active cpu via that cmos register, >> > but more were actually active. If that was the case, it could >> >> I was daring enough to try this and I don't see the crash :) >> >> diff --git a/src/fw/smp.c b/src/fw/smp.c >> index a466ea6..a346d46 100644 >> --- a/src/fw/smp.c >> +++ b/src/fw/smp.c >> @@ -49,6 +49,7 @@ int apic_id_is_present(u8 apic_id) >> void VISIBLE32FLAT >> handle_smp(void) >> { >> + dprintf(DEBUG_HDL_smp, "Calling handle_smp\n"); >> if (!CONFIG_QEMU) >> return; >> >> @@ -128,6 +129,8 @@ smp_setup(void) >> >> // Wait for other CPUs to process the SIPI. >> u8 cmos_smp_count = rtc_read(CMOS_BIOS_SMP_COUNT) + 1; >> + while (cmos_smp_count == 1) >> + cmos_smp_count = rtc_read(CMOS_BIOS_SMP_COUNT) + 1; > > That would loop forever if you had "-smp 1". Sorry, I should have been more clear. What I meant is if I run with "-smp 128" (from Dave's original reproducer), sticking this while loop avoids the crash. So, the rtc_read eventually returns the right number (127), as the above while loop keeps polling. Bandan >> while (cmos_smp_count != CountCPUs) >> asm volatile( >> // Release lock and allow other processors to use the stack. >> >> So, the while loop results in a race somehow ? > > No, the problem is that loop doesn't run at all, and as a result the > other cpus end up running random code. SeaBIOS sets up an entry > vector for multiple cpus, wakes up the cpus, then tears down the entry > vector. If it tears down the entry vector before all the cpus have > run through it, then the other cpus can end up running random code. > > -Kevin -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html