Hi Marc and Christoffer,

Below are the steps I took and the complete crash dump.

1. Start the host with all CPUs in Hyp mode.
2. Start the guest OS.
3. Offline and hotplug all of the secondary CPUs.
4. Verify that the guest OS is still alive and start one more guest OS.
5. Halt the first guest OS.
6. Quit the qemu process. The crash happens now.

[  123.700000] Unable to handle kernel NULL pointer dereference at virtual address 00000000
[  123.700000] pgd = c0003000
[  123.700000] [00000000] *pgd=80000080004003, *pmd=00000000
[  123.710000] Internal error: Oops: 207 [#1] PREEMPT SMP ARM
[  123.710000] CPU: 1    Not tainted  (3.8.0-rc7-00196-g063f56c-dirty #269)
[  123.720000] PC is at unmap_range+0x9c/0x2f4
[  123.720000] LR is at kvm_free_stage2_pgd+0x30/0x4c
[  123.730000] pc : [<c00145b0>]    lr : [<c0014c2c>]    psr: 80000013
[  123.730000] sp : eeb53e60  ip : 00000000  fp : ee80c000
[  123.740000] r10: ee40e808  r9 : 00000000  r8 : 00000000
[  123.750000] r7 : ae1db003  r6 : c0000000  r5 : ee80c000  r4 : 00000000
[  123.750000] r3 : 00000000  r2 : ae1db003  r1 : 00000000  r0 : 00000000
[  123.760000] Flags: Nzcv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment user
[  123.770000] Control: 30c5387d  Table: ae9a5400  DAC: 55555555
[  123.770000] Process qemu-system-arm (pid: 2678, stack limit = 0xeeb52238)
[  123.780000] Stack: (0xeeb53e60 to 0xeeb54000)
[  123.780000] 3e60: eeb53e84 00040003 c051b5c8 c051b5c0 00000001 000c0000 c0000000 00000000
[  123.790000] 3e80: c0afc7e0 00000000 00000100 ee40e800 00000000 ee9a5e00 00000002 ee40e808
[  123.800000] 3ea0: 00000001 c0014c2c 00000000 00000100 ee40e800 ee8df500 eef63c78 c00129c8
[  123.810000] 3ec0: ee40e800 ee8df500 eef63c78 c000eb6c ee706780 eec4a330 eef63c78 00000000
[  123.820000] 3ee0: 00000008 ef2c5310 ee706788 c000f068 c000f058 c00bebcc 00000000 00000000
[  123.820000] 3f00: ef336854 ef2ad000 ef336580 c0513644 c0019b28 eeb52000 00000000 c0044810
[  123.830000] 3f20: ef336580 26212621 ee8df500 ef336580 ef336864 ee8df500 ee8df548 c0030cb8
[  123.840000] 3f40: 00000001 ef336580 eeb52000 00000000 eeb53f64 26212621 ef3670c0 ef367218
[  123.850000] 3f60: 00000001 ee6ab600 00000000 eeb52000 ee5f94c4 c0019b28 eeb52000 00000000
[  123.860000] 3f80: 00000001 c003132c 00000000 000703c2 b6d56760 b6d56760 000000f8 c00313a4
[  123.860000] 3fa0: 00000000 c0019980 000703c2 b6d56760 00000000 000703ae b6c3f4c0 00000000
[  123.870000] 3fc0: 000703c2 b6d56760 b6d56760 000000f8 00251804 00000001 be9773f9 00000001
[  123.880000] 3fe0: 000000f8 be97734c b6ce7ce3 b6c8f1e6 600f0030 00000000 ffffffff ffffffff
[  123.890000] [<c00145b0>] (unmap_range+0x9c/0x2f4) from [<c0014c2c>] (kvm_free_stage2_pgd+0x30/0x4c)
[  123.900000] [<c0014c2c>] (kvm_free_stage2_pgd+0x30/0x4c) from [<c00129c8>] (kvm_arch_destroy_vm+0xc/0x38)
[  123.910000] [<c00129c8>] (kvm_arch_destroy_vm+0xc/0x38) from [<c000eb6c>] (kvm_put_kvm+0xec/0x150)
[  123.920000] [<c000eb6c>] (kvm_put_kvm+0xec/0x150) from [<c000f068>] (kvm_vcpu_release+0x10/0x18)
[  123.930000] [<c000f068>] (kvm_vcpu_release+0x10/0x18) from [<c00bebcc>] (__fput+0x88/0x1dc)
[  123.930000] [<c00bebcc>] (__fput+0x88/0x1dc) from [<c0044810>] (task_work_run+0xac/0xe8)
[  123.940000] [<c0044810>] (task_work_run+0xac/0xe8) from [<c0030cb8>] (do_exit+0x22c/0x82c)
[  123.950000] [<c0030cb8>] (do_exit+0x22c/0x82c) from [<c003132c>] (do_group_exit+0x48/0xb0)
[  123.960000] [<c003132c>] (do_group_exit+0x48/0xb0) from [<c00313a4>] (__wake_up_parent+0x0/0x18)
[  123.970000] Code: e1927003 0afffff0 e7e80658 e3a0c000 (e1cc20d0)
[  123.970000] ---[ end trace 8f0d0eaefb305781 ]---
[  123.980000] Fixing recursive fault but reboot is needed!

Thanks,
Giridhar

On 04/19/2013 12:08 AM, Christoffer Dall wrote:
> On Thu, Apr 18, 2013 at 7:40 AM, Marc Zyngier <marc.zyngier@xxxxxxx> wrote:
>> On 18/04/13 15:16, Giridhar Maruthy wrote:
>> Hi Giridhar,
>>> Thanks a lot for pointing me at the series. I did apply the series
>>> and got cpu hotplug to work successfully.
>> Ah, good to know. Thanks for testing.
>>> However, I have the following doubts.
>>>
>>> 1.
>>> Though the guest does not crash, when exiting qemu, I get the
>>> following crash dump. I have not yet looked into the details.
> I haven't been able to reproduce this. Can you tell us the exact steps
> you take to reproduce?
>>> [  547.870000] [<c00145b0>] (unmap_range+0x9c/0x2f4) from [<c0014c2c>] (kvm_free_stage2_pgd+0x30/0x4c)
>>> [  547.880000] [<c0014c2c>] (kvm_free_stage2_pgd+0x30/0x4c) from [<c00129c8>] (kvm_arch_destroy_vm+0xc/0x38)
>>> [  547.890000] [<c00129c8>] (kvm_arch_destroy_vm+0xc/0x38) from [<c000eb6c>] (kvm_put_kvm+0xec/0x150)
>>> [  547.900000] [<c000eb6c>] (kvm_put_kvm+0xec/0x150) from [<c000f068>] (kvm_vcpu_release+0x10/0x18)
>>> [  547.910000] [<c000f068>] (kvm_vcpu_release+0x10/0x18) from [<c00bebcc>] (__fput+0x88/0x1dc)
>>> [  547.920000] [<c00bebcc>] (__fput+0x88/0x1dc) from [<c0044810>] (task_work_run+0xac/0xe8)
>>> [  547.920000] [<c0044810>] (task_work_run+0xac/0xe8) from [<c0030cb8>] (do_exit+0x22c/0x82c)
>>> [  547.930000] [<c0030cb8>] (do_exit+0x22c/0x82c) from [<c003132c>] (do_group_exit+0x48/0xb0)
>>> [  547.940000] [<c003132c>] (do_group_exit+0x48/0xb0) from [<c003b618>] (get_signal_to_deliver+0x278/0x504)
>>> [  547.950000] [<c003b618>] (get_signal_to_deliver+0x278/0x504) from [<c001c8e4>] (do_signal+0x74/0x460)
>>> [  547.960000] [<c001c8e4>] (do_signal+0x74/0x460) from [<c001d150>] (do_work_pending+0x64/0xac)
>>> [  547.970000] [<c001d150>] (do_work_pending+0x64/0xac) from [<c00199c0>] (work_pending+0xc/0x20)
>>> [  547.980000] Code: e1927003 0afffff0 e7e80658 e3a0c000 (e1cc20d0)
>>> [  547.980000] ---[ end trace 05d3020cd57fa289 ]---
>>> [  547.990000] Fixing recursive fault but reboot is needed!
>> It probably means we're having issues with the Stage-2 page refcounts.
>> Can you share the whole dump (I think there's a few additional lines
>> before what you quoted)?
>>> 2. I applied the kvm-arm-fixes branch from Christoffer's tree
>>> (github.com/virtualopensystems/linux-kvm-arm) and then applied the v4
>>> series of "ARM: KVM: Revamping the HYP init code for fun and profit".
>>> I ran into some merge conflicts, so I manually edited and applied the
>>> patches. Should I be including any more dependent patches?
>> You'd be better off using the following branch:
>>
>> git://github.com/columbia/linux-kvm-arm.git kvm-arm-for-next
>>
>> as it should contain all you need. I haven't tested it yet, though.
> So I just tried this on vexpress TC2, and when I hotplug cpu1, I get
> the crash below. Is this actually supposed to work at this point?
>
> Kernel panic - not syncing: unexpected prefetch abort in Hyp mode at: 0x803c1880
> unexpected data abort in Hyp mode at: 0x0
> [<800208f4>] (unwind_backtrace+0x0/0xf8) from [<803bb360>] (panic+0x90/0x1e4)
> [<803bb360>] (panic+0x90/0x1e4) from [<80012b48>] (cpu_init_hyp_mode+0x10/0x6c)
> [<80012b48>] (cpu_init_hyp_mode+0x10/0x6c) from [<80012bc8>] (hyp_init_cpu_notify+0x24/0x2c)
> [<80012bc8>] (hyp_init_cpu_notify+0x24/0x2c) from [<8004b900>] (notifier_call_chain+0x44/0x84)
> [<8004b900>] (notifier_call_chain+0x44/0x84) from [<8002ebf8>] (__cpu_notify+0x28/0x44)
> [<8002ebf8>] (__cpu_notify+0x28/0x44) from [<803b8d20>] (secondary_start_kernel+0xd4/0x11c)
> [<803b8d20>] (secondary_start_kernel+0xd4/0x11c) from [<803b6dec>] (vexpress_cpu_die+0xc/0xa0)
> CPU0: stopping
> [<800208f4>] (unwind_backtrace+0x0/0xf8) from [<8001f078>] (handle_IPI+0xfc/0x130)
> [<8001f078>] (handle_IPI+0xfc/0x130) from [<800085c4>] (gic_handle_irq+0x54/0x5c)
> [<800085c4>] (gic_handle_irq+0x54/0x5c) from [<80019f00>] (__irq_svc+0x40/0x50)
> Exception stack(0x8052bf60 to 0x8052bfa8)
> bf60: 0000001f 805323ec 00000000 00000000 8052a000 80554948 8052a000 80554948
> bf80: 8052a000 412fc0f1 803c4a2c 00000000 00000000 8052bfa8 8001b584 8001b564
> bfa0: 600f0013 ffffffff
> [<80019f00>] (__irq_svc+0x40/0x50) from [<8001b564>] (cpu_idle+0xa0/0xec)
> [<8001b564>] (cpu_idle+0xa0/0xec) from [<804f67ac>] (start_kernel+0x29c/0x2ec)
>
> -Christoffer

_______________________________________________
kvmarm mailing list
kvmarm@xxxxxxxxxxxxxxxxxxxxx
https://lists.cs.columbia.edu/cucslists/listinfo/kvmarm
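[Editor's note: for readers wanting to reproduce the offline-and-hotplug part of the steps above, it is driven through the standard sysfs CPU hotplug interface. The script below is only a sketch of that interface, not the exact commands used in the report; the helper names are hypothetical, and it assumes a Linux host built with CONFIG_HOTPLUG_CPU and root privileges.]

```shell
#!/bin/sh
# Sketch of "offline and hotplug all of the secondary CPUs" (step 3).
# Writing 0 or 1 to /sys/devices/system/cpu/cpuN/online takes CPU N
# down or brings it back up.

SYSFS_CPU=/sys/devices/system/cpu

# Hypothetical helper: path to the hotplug control file for CPU $1.
online_file() {
    echo "$SYSFS_CPU/cpu$1/online"
}

# Offline, then re-online, every secondary CPU. cpu0 usually has no
# "online" file, so the glob naturally skips it.
cycle_secondaries() {
    for f in "$SYSFS_CPU"/cpu[1-9]*/online; do
        [ -e "$f" ] || continue   # no secondary CPUs exposed
        echo 0 > "$f"             # offline the CPU
        echo 1 > "$f"             # hotplug it back in
    done
}
```

Running `cycle_secondaries` as root between starting and halting the guests mirrors the reported sequence; whether it triggers the same Oops will of course depend on the kernel under test.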