On 2011-07-21 14:45, Gleb Natapov wrote:
> On Thu, Jul 21, 2011 at 02:51:18PM +0300, Gleb Natapov wrote:
>>>> Jan can you look at this please?
>>>
>>> I can't promise to do debugging myself.
>>>
>>> Also, as I never succeeded in getting anything working with CPU hotplug,
>>> even back in the days it was supposed to work, I'm a bit clueless /wrt
>>> to the right test cases.
>>>
>> CPU hotplug for Linux is supposed to be easy (with allow_hotplug patch
>> applied). But we have two bugs currently. One is that ACPI interrupt
>> is not sent when cpu is onlined (at least this appears to be the case).
>> I will look at that one. Another is that after new cpu is detected it
>> can't be onlined.
>>
>> After fixing the first bug the test should look like this:
>> 1. start vm with -smp 1,maxcpus=2
>> 2. wait for it to boot
>> 3. do "cpu 1 online" in monitor.
>> 4. do "echo 1 > /sys/devices/system/cpu/cpu1/online"
>>
>> Step 4 should succeed. It fails now.
>>
> The first one was easy to solve. See patch below. Step 3 should be
> "cpu_set 1 online".
>
> ---
>
> Trigger sci interrupt after cpu hotplug/unplug event.
>
> Signed-off-by: Gleb Natapov <gleb@xxxxxxxxxx>
> diff --git a/hw/acpi_piix4.c b/hw/acpi_piix4.c
> index c30a050..40f3fcd 100644
> --- a/hw/acpi_piix4.c
> +++ b/hw/acpi_piix4.c
> @@ -92,7 +92,8 @@ static void pm_update_sci(PIIX4PMState *s)
>                     ACPI_BITMASK_POWER_BUTTON_ENABLE |
>                     ACPI_BITMASK_GLOBAL_LOCK_ENABLE |
>                     ACPI_BITMASK_TIMER_ENABLE)) != 0) ||
> -        (((s->gpe.sts[0] & s->gpe.en[0]) & PIIX4_PCI_HOTPLUG_STATUS) != 0);
> +        (((s->gpe.sts[0] & s->gpe.en[0]) &
> +          (PIIX4_PCI_HOTPLUG_STATUS | PIIX4_CPU_HOTPLUG_STATUS)) != 0);
>
>      qemu_set_irq(s->irq, sci_level);
>      /* schedule a timer interruption if needed */
> --
>                       Gleb.

I had a closer look and identified two further issues, one generic, one
CPU-hotplug-specific:

 - (qdev) devices that are hotplugged do not receive any reset.
   This applies not only to the APIC in the case of CPU hotplugging; it
   is also broken for NICs, storage controllers, etc. when doing PCI
   hot-add, as I just checked via gdb.

 - CPU hotplugging was always (or at least for a fairly long time),
   well, fragile as it failed to make CPU thread creation and CPU
   initialization atomic against APIC addition and other initialization
   steps. IOW, we need to create CPUs stopped, finish all init work,
   sync their states completely to the kernel
   (cpu_synchronize_post_init), and then kick them off. Actually, I'm
   considering stopping all CPUs during that short phase to make things
   simpler and future-proof (when we reduce qemu_global_mutex
   dependencies).

Still, something else must be different for hotplugged CPUs, as they fail
to come up properly every 2 or 3 system resets or online transitions of
the Linux guest. Will try to understand that once time permits.

Jan

--
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux
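
For illustration, the effect of the quoted patch on the GPE part of the SCI
level can be modeled in isolation. This is only a minimal sketch, not QEMU
code: gpe_sci_before() and gpe_sci_after() are hypothetical helpers, the PM1
status/enable term of pm_update_sci() is left out, and the bit values of the
two defines are assumptions that should be checked against hw/acpi_piix4.c.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed GPE0 bit values, for illustration only. */
#define PIIX4_PCI_HOTPLUG_STATUS 2
#define PIIX4_CPU_HOTPLUG_STATUS 4

/*
 * Hypothetical helper: GPE contribution to the SCI level before the
 * patch. Only a pending and enabled PCI hotplug event raises SCI.
 */
static bool gpe_sci_before(uint8_t sts, uint8_t en)
{
    return ((sts & en) & PIIX4_PCI_HOTPLUG_STATUS) != 0;
}

/*
 * Hypothetical helper: GPE contribution after the patch. A pending and
 * enabled CPU hotplug event now raises SCI as well.
 */
static bool gpe_sci_after(uint8_t sts, uint8_t en)
{
    return ((sts & en) &
            (PIIX4_PCI_HOTPLUG_STATUS | PIIX4_CPU_HOTPLUG_STATUS)) != 0;
}
```

With only the CPU hotplug bit set in both status and enable, the first helper
returns false and the second returns true, which matches the symptom that the
guest was never notified of the new CPU.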