On 2011-07-21 13:06, Vasilis Liaskovitis wrote:
> Hi,
>
> On Wed, Jul 20, 2011 at 10:35 AM, Gleb Natapov <gleb@xxxxxxxxxx> wrote:
>> On Tue, Jul 19, 2011 at 07:40:55PM +0200, Vasilis Liaskovitis wrote:
>>> Hello,
>>>
>>> I have encountered a problem trying to hotplug a CPU in my x86_64 guest setup.
>>>
>> You do everything right. It's qemu who is buggy. Since qemu need a patch
>> for cpu hotplug to not crash it nobody tests it, so code bit rots.
>
> thanks for your reply.
>
> As I mentioned in the original email, onlining a hotplugged-cpu with
> qemu-kvm/master results in:
>
>>> echo 1 > /sys/devices/system/cpu/cpu1/online
>>> bash: echo: write error: Input/output error
>>>
>>> in the guest, dmesg reports:
>>>
>>> [ 2325.376355] Booting Node 0 Processor 1 APIC 0x1
>>> [ 2325.376357] smpboot cpu 1: start_ip = 9a000
>>> [ 2330.821306] CPU1: Not responding.
>
> I tried to git-bisect between qemu-kvm-0.13.0 (last known version
> where cpu hotplug works correctly
> for me) and qemu-kvm/master.
>
> More precisely: To enable cpu-hotplug at each bisect stage, I apply
> this patch derived from:
> http://lists.gnu.org/archive/html/qemu-devel/2010-08/msg00850.html
>
> diff --git a/hw/qdev.c b/hw/qdev.c
> index 1aa1ea0..aed48ce 100644
> --- a/hw/qdev.c
> +++ b/hw/qdev.c
> @@ -327,6 +327,7 @@ BusState *sysbus_get_default(void)
>      if (!main_system_bus) {
>          main_system_bus = qbus_create(&system_bus_info, NULL,
>                                        "main-system-bus");
> +        main_system_bus->allow_hotplug = 1;
>      }
>      return main_system_bus;
>  }
>
> and test cpu hotplug functionality.
> The commit that appears to break CPU hotplug is:
>
> commit f4de8c1451f2265148ff4d895a27e21c0a8788aa
> Author: Jan Kiszka <jan.kiszka@xxxxxxxxxxx>
> Date: Mon Feb 21 12:28:07 2011 +0100
> qemu-kvm: Mark VCPU state dirty on creation
>
> Is it possible that kvm_vcpu_dirty should not be set to 1 for a CPU
> that's being hot-plugged?
> I.e. when kvm_cpu_exec() is called for the first time during
> initialization of a hotplugged-CPU,
> we shouldn't try to restore state with kvm_arch_put_registers().

We should, because user space defines the CPU state on creation or
after reset, and that state has to be transferred to the kernel. If you
get further by skipping this, we likely create the wrong state in user
space while the kernel happens to have a working one. So the state
needs fixing, not the write-back (IOW, your workaround likely papers
over the real issue).

So much for a high-level analysis without digging in this dirt on my
own. :)

Jan

--
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux