On Fri, Jul 22, 2011 at 12:56:58PM +0200, Jan Kiszka wrote:
> On 2011-07-21 14:45, Gleb Natapov wrote:
> > On Thu, Jul 21, 2011 at 02:51:18PM +0300, Gleb Natapov wrote:
> >>>> Jan can you look at this please?
> >>>
> >>> I can't promise to do debugging myself.
> >>>
> >>> Also, as I never succeeded in getting anything working with CPU hotplug,
> >>> even back in the days it was supposed to work, I'm a bit clueless /wrt
> >>> to the right test cases.
> >>>
> >> CPU hotplug for Linux suppose to be easy (with allow_hotplug patch
> >> applied). But we have two bugs currently. One is that ACPI interrupt
> >> is not send when cpu is onlined (at least this appears to be the case).
> >> I will look at that one. Another is that after new cpu is detected it
> >> can't be onlined.
> >>
> >> After fixing the first bug the test should look like this:
> >> 1. start vm with -smp 1,macpus=2
> >> 2. wait for it to boot
> >> 3. do "cpu 1 online" in monitor.
> >> 4. do "echo 1 > /sys/devices/system/cpu/cpu1/online"
> >>
> >> If step 4 should succeed. It fails now.
> >>
> > The first one was easy to solve. See patch below. Step 3 should be
> > "cpu_set 1 online".
> >
> > ---
> >
> > Trigger sci interrupt after cpu hotplug/unplug event.
> >
> > Signed-off-by: Gleb Natapov <gleb@xxxxxxxxxx>
> > diff --git a/hw/acpi_piix4.c b/hw/acpi_piix4.c
> > index c30a050..40f3fcd 100644
> > --- a/hw/acpi_piix4.c
> > +++ b/hw/acpi_piix4.c
> > @@ -92,7 +92,8 @@ static void pm_update_sci(PIIX4PMState *s)
> >                     ACPI_BITMASK_POWER_BUTTON_ENABLE |
> >                     ACPI_BITMASK_GLOBAL_LOCK_ENABLE |
> >                     ACPI_BITMASK_TIMER_ENABLE)) != 0) ||
> > -        (((s->gpe.sts[0] & s->gpe.en[0]) & PIIX4_PCI_HOTPLUG_STATUS) != 0);
> > +        (((s->gpe.sts[0] & s->gpe.en[0]) &
> > +          (PIIX4_PCI_HOTPLUG_STATUS | PIIX4_CPU_HOTPLUG_STATUS)) != 0);
> >
> >      qemu_set_irq(s->irq, sci_level);
> >      /* schedule a timer interruption if needed */
> > --
> > Gleb.
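
For reference, the effect of the quoted patch on the GPE term of sci_level
can be sketched in isolation. This is only an illustration, not the code in
hw/acpi_piix4.c: struct gpe_regs is a hypothetical stand-in for the GPE
fields of PIIX4PMState, the two function names are made up for the
comparison, and the bit values given to the two status macros are
assumptions that should be checked against the tree.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit values, for illustration only; verify in hw/acpi_piix4.c. */
#define PIIX4_PCI_HOTPLUG_STATUS 2
#define PIIX4_CPU_HOTPLUG_STATUS 4

/* Hypothetical stand-in for the GPE status/enable bytes of PIIX4PMState. */
struct gpe_regs {
    uint8_t sts[1];
    uint8_t en[1];
};

/* GPE term of sci_level before the patch: only PCI hotplug counts. */
static bool gpe_sci_level_old(const struct gpe_regs *g)
{
    return ((g->sts[0] & g->en[0]) & PIIX4_PCI_HOTPLUG_STATUS) != 0;
}

/* GPE term after the patch: PCI or CPU hotplug raises the level. */
static bool gpe_sci_level_new(const struct gpe_regs *g)
{
    return ((g->sts[0] & g->en[0]) &
            (PIIX4_PCI_HOTPLUG_STATUS | PIIX4_CPU_HOTPLUG_STATUS)) != 0;
}

/* Self-check: a CPU hotplug event that is both pending and enabled. */
static void check_cpu_hotplug_raises_sci(void)
{
    struct gpe_regs g = {
        .sts = { PIIX4_CPU_HOTPLUG_STATUS },
        .en  = { PIIX4_CPU_HOTPLUG_STATUS },
    };

    assert(!gpe_sci_level_old(&g)); /* old mask ignores the CPU bit */
    assert(gpe_sci_level_new(&g));  /* new mask sees it */
}
```

With only the CPU hotplug bit pending and enabled, the old expression
evaluates to false, so no SCI is raised and the guest never notices the new
CPU. The new expression evaluates to true. An event whose enable bit is
clear stays masked in both versions.
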
>
> I had a closer look and identified two further issues, one generic, one
> CPU-hotplug-specific:
>  - (qdev) devices that are hotplugged do not receive any reset. That
>    does not only apply to the APIC in case of CPU hotplugging, it is
>    also broken for NICs, storage controllers, etc. when doing PCI
>    hot-add as I just checked via gdb.
>  - CPU hotplugging was always (or at least for a fairly long time),
>    well, fragile as it failed to make CPU thread creation and CPU
>    initialization atomic against APIC addition and other initialization
>    steps. IOW, we need to create CPUs stopped, finish all init work,
>    sync their states completely to the kernel
>    (cpu_synchronize_post_init), and then kick them of. Actually I'm
Syncing the state to the kernel should be done by the vcpu thread, so it
cannot be stopped while the sync is done. Maybe I misunderstood what
you mean here.

>    considering to stop all CPUs during that short phase to make things
>    simpler and future-proof (when we reduce qemu_global_mutex
>    dependencies).
>
> Still, something else must be different for hotplugged CPUs as they fail
> to come up properly every 2 or 3 system resets or online transitions of
> the Linux guest. Will try to understand that once time permits.
>
> Jan
>
> --
> Siemens AG, Corporate Technology, CT T DE IT 1
> Corporate Competence Center Embedded Linux

--
Gleb.