Hi Phil,

Most of what you said went right over my head. It sounds like processor
related information may be useful though:

MachineA works, MachineB does not.

[root@MachineA f10copy]# egrep 'cpu fam|model|flags' /proc/cpuinfo
cpu family : 15
model : 33
model name : Dual Core AMD Opteron(tm) Processor 275
flags : fpu tsc msr pae mce cx8 apic mtrr mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt lm 3dnowext 3dnow pni lahf_lm cmp_legacy
...

[root@MachineB ~]# egrep 'cpu fam|model|flags' /proc/cpuinfo
cpu family : 15
model : 4
model name : Intel(R) Xeon(TM) CPU 2.80GHz
flags : fpu tsc msr pae mce cx8 apic mtrr mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm syscall lm constant_tsc pni monitor ds_cpl est cid cx16 xtpr lahf_lm
...

I was unable to simulate the crash on MachineA, but I created the same
vm with the following command:

xm create -p -c f10copy.config

And ran xenctx against the freshly created domain:

/usr/lib64/xen/bin/xenctx -s /tmp/System.map-2.6.27.5-117.fc10.x86_64 39
rip: ffffffff810093aa _stext+0x3aa
rsp: ffffffff81573ea0
rax: 00000000 rbx: ffffffff81572000 rcx: ffffffff810093aa rdx: 00000000
rsi: 00000000 rdi: 00000001 rbp: ffffffff81573eb8
r8: 00000000 r9: ffff8800020b2348 r10: 00000001 r11: 00000246
r12: 6db6db6db6db6db7 r13: ffffffff815d2660 r14: ffffffff815d4cc0 r15: 00000016
cs: 0000e033 ds: 00000000 fs: 00000000 gs: 00000000

Stack:
ffffffff81766a78 0000000000000003 ffffffff8100a709 ffffffff81573ed8
ffffffff8100ba06 ffffffff81573ed8 ffffffff816d60e8 ffffffff81573ef8
ffffffff8100f279 ffffffff815d4cc0 0000000000000000 ffffffff81573f08
ffffffff8131ed7d ffffffff81573f48 ffffffff8159dd46 ffffffff81573f48

Code:
cc cc cc cc cc cc cc cc cc cc cc 51 41 53 b8 1d 00 00 00 0f 05 <41> 5b 59 c3 cc cc cc cc cc cc cc

Call Trace:
[<ffffffff810093aa>] _stext+0x3aa <--
[<ffffffff8100a709>] xen_safe_halt+0x10
[<ffffffff8100ba06>] xen_idle+0x55
[<ffffffff8100f279>] cpu_idle+0xb2
[<ffffffff8131ed7d>] rest_init+0x61
[<ffffffff8159dd46>] start_kernel+0x39f
[<ffffffff8159d2ba>] x86_64_start_reservations+0xa5
[<ffffffff815a3e64>] xen_start_kernel+0x7e1

-----Original Message-----
From: Virtualization [mailto:virtualization@xxxxxxxxxxxxxxxx]
Sent: Tuesday, January 20, 2009 3:19 PM
To: Jon Swanson
Cc: fedora-xen@xxxxxxxxxx
Subject: RE: f10 x86_64 xen VM guests fail to boot on f8 host

Hi list,

From the Intel(r) Virtualization Technology Specification for the IA-32
Intel(r) Architecture (2005):

"2.9.2 Information for VM Exits Due to Vectored Events

Event-specific information is provided for VM exits due to the following
vectored events: exceptions (including those generated by the
instructions INT3, INTO, BOUND, and UD2); external interrupts that occur
while the "acknowledge interrupt on exit" VM-exit control is 1; and
non-maskable interrupts (NMIs). This information is provided in the
following fields:"
....

The <0f> 0b in the "Code:" section is the UD2 instruction. Checking
through the OpCode map for the Xeon processor, this is an invalid op
code.

In VT processors the software guide indicates that a program can
communicate various events and state information to the underlying
virtualization supervisor by executing a UD2 (and some other ops like
it).

I think that in a non-VT cpu it's actually a "real" invalid op code. The
stuff (hardware) which flips over to the supervisor with all the needed
info from the virtual machine isn't there.

KVM uses this, from the patches I've seen Googling around for UD2 (if I
understand correctly).

So why a UD2 in the code? It's highly unlikely that it's just some
random bytes that happen to be a UD2.

Possibly the kernel thinks it's in fully virt mode at some point? The
image notes do seem to indicate this.

Cheers
Phill.

On Tue, 2009-01-20 at 14:01 +0900, Jon Swanson wrote:
> Hi Mark, thank you very much for your help.
>
> I took an f10host which boots on MachineA and copied it to MachineB,
> modified the config to include on_crash=preserve, and booted it with
> xm create.
> ----------------------------------------------------------------------
> xenctx output:
> /usr/lib64/xen/bin/xenctx -s System.map-2.6.27.5-117.fc10.x86_64 46
> rip: ffffffff8100b8a2 set_page_prot+0x6d
> rsp: ffffffff81573f08
> rax: ffffffea rbx: 000016e1 rcx: 00000055 rdx: 00000000
> rsi: 800000014ffc6061 rdi: ffffffff816e1000 rbp: ffffffff81573f68
> r8: 0000000f r9: ffffffff817eb450 r10: ffffffff817eb650 r11: 00000010
> r12: ffffffff816e1000 r13: 800000014ffc6061 r14: 8000000000000161 r15: 00000016
> cs: 0000e033 ds: 00000000 fs: 00000000 gs: 00000000
>
> Stack:
> 0000000000000055 0000000000000010 ffffffff8100b8a2 000000010000e030
> 0000000000010082 ffffffff81573f48 000000000000e02b ffffffff8100b89e
> 0000000000000200 ffffffff816e4000 0000000000000800 0000000000002c00
> ffffffff81573ff8 ffffffff815a3c60 0000000000002c00 0000000000000000
>
> Code:
> 7b 4a 1d 00 4c 89 e7 4c 89 ee 31 d2 e8 22 d9 ff ff 85 c0 74 04 <0f> 0b eb fe 5b 41 5c 41 5d 41 5e
>
> Call Trace:
> [<ffffffff8100b8a2>] set_page_prot+0x6d <--
> [<ffffffff8100b8a2>] set_page_prot+0x6d
> [<ffffffff8100b89e>] set_page_prot+0x69
> [<ffffffff815a3c60>] xen_start_kernel+0x5dd
>
>
> ----------------------------------------------------------------------
> Dmesg also has something which may make sense to someone wiser than
> myself. Specifically:
> (XEN) traps.c:405:d46 Unhandled invalid opcode fault/trap [#6] in domain 46 on VCPU 0 [ec=0000]
>
>
> xm dmesg
> ....
> (XEN) ffffffff82035000 ffffffff82036000 ffffffff82037000 ffffffff82038000
> (XEN) mm.c:1362:d46 Bad L1 flags 800000
> (XEN) traps.c:405:d46 Unhandled invalid opcode fault/trap [#6] in domain 46 on VCPU 0 [ec=0000]
> (XEN) domain_crash_sync called from entry.S
> (XEN) Domain 46 (vcpu#0) crashed on cpu#2:
> (XEN) ----[ Xen-3.1.4 x86_64 debug=n Not tainted ]----
> (XEN) CPU: 2
> (XEN) RIP: e033:[<ffffffff8100b8a2>]
> (XEN) RFLAGS: 0000000000000282 CONTEXT: guest
> (XEN) rax: 00000000ffffffea rbx: 00000000000016e1 rcx: 0000000000000055
> (XEN) rdx: 0000000000000000 rsi: 800000014ffc6061 rdi: ffffffff816e1000
> (XEN) rbp: ffffffff81573f68 rsp: ffffffff81573f08 r8: 000000000000000f
> (XEN) r9: ffffffff817eb450 r10: ffffffff817eb650 r11: 0000000000000010
> (XEN) r12: ffffffff816e1000 r13: 800000014ffc6061 r14: 8000000000000161
> (XEN) r15: 0000000000000016 cr0: 000000008005003b cr4: 00000000000006f0
> (XEN) cr3: 0000000144f18000 cr2: 0000000000000000
> (XEN) ds: 0000 es: 0000 fs: 0000 gs: 0000 ss: e02b cs: e033
> (XEN) Guest stack trace from rsp=ffffffff81573f08:
> (XEN) 0000000000000055 0000000000000010 ffffffff8100b8a2 000000010000e030
> (XEN) 0000000000010082 ffffffff81573f48 000000000000e02b ffffffff8100b89e
> (XEN) 0000000000000200 ffffffff816e4000 0000000000000800 0000000000002c00
> (XEN) ffffffff81573ff8 ffffffff815a3c60 0000000000002c00 0000000000000000
> (XEN) 0000000000000000 0000000000000000 0000000000000000 ffffffff8208b000
> (XEN) 0000000000010000 0000000000000000 0000000000000000 0000000000000000
> (XEN) 0000000000000000 0000000000000000 0000000000000000 0000000000000000
> (XEN) 0000000000000000 0000000000000000 0000000000000000 ffffffff82008000
> (XEN) ffffffff82009000 ffffffff8200a000 ffffffff8200b000 ffffffff8200c000
> (XEN) ffffffff8200d000 ffffffff8200e000 ffffffff8200f000 ffffffff82010000
> (XEN) ffffffff82011000 ffffffff82012000 ffffffff82013000 ffffffff82014000
> (XEN) ffffffff82015000 ffffffff82016000 ffffffff82017000 ffffffff82018000
> (XEN) ffffffff82019000 ffffffff8201a000 ffffffff8201b000 ffffffff8201c000
> (XEN) ffffffff8201d000 ffffffff8201e000 ffffffff8201f000 ffffffff82020000
> (XEN) ffffffff82021000 ffffffff82022000 ffffffff82023000 ffffffff82024000
> (XEN) ffffffff82025000 ffffffff82026000 ffffffff82027000 ffffffff82028000
> (XEN) ffffffff82029000 ffffffff8202a000 ffffffff8202b000 ffffffff8202c000
> (XEN) ffffffff8202d000 ffffffff8202e000 ffffffff8202f000 ffffffff82030000
> (XEN) ffffffff82031000 ffffffff82032000 ffffffff82033000 ffffffff82034000
> (XEN) ffffffff82035000 ffffffff82036000 ffffffff82037000 ffffffff82038000
> ...
>
> ----------------------------------------------------------------------
>
>
> -----Original Message-----
> From: Mark McLoughlin [mailto:markmc@xxxxxxxxxx]
> Sent: Friday, January 16, 2009 7:34 PM
> To: Jon Swanson
> Cc: fedora-xen@xxxxxxxxxx
> Subject: Re: f10 x86_64 xen VM guests fail to boot on f8 host
>
> Hi Jon,
>
> On Fri, 2009-01-16 at 18:02 +0900, Jon Swanson wrote:
> > This is a cross post of the same subject on the Fedora Forums. If
> > this is bad practice let me know and I'll never do it again.
>
> Mailing lists can often be a better way to get help from developers,
> so posting here is no problem.
>
> Also, fedora-virt@xxxxxxxxxx might be a better place to post questions
> these days - it's not clear whether the fedora-xen list has a future.
>
> > Additional log info is available at
> > http://forums.fedoraforum.org/showthread.php?p=1149972&posted=1#post1149972
> >
> > I have two machines running fresh installs of f8 with the xen.
> > Kernel and all software versions are the same on both.
>
> You've seen this then, right?
>
> http://fedoraproject.org/wiki/Bugs/F10Common#Installing_Fedora_10_DomU_on_Fedora_8_Dom0_Fails
>
> > /var/log/xen/xend.log relevant output:
> >
> > [2009-01-16 14:45:32 4120] DEBUG (DevController:150) Waiting for devices vtpm.
> > [2009-01-16 14:45:32 4120] INFO (XendDomain:1130) Domain f10testB (21) unpaused.
> > [2009-01-16 14:45:32 4120] WARNING (XendDomainInfo:1203) Domain has crashed: name=f10testB id=21.
> > [2009-01-16 14:45:32 4120] DEBUG (XendDomainInfo:1802) XendDomainInfo.destroy: domid=21
> > [2009-01-16 14:45:32 4120] DEBUG (XendDomainInfo:1821) XendDomainInfo.destroyDomain(21)
> > I've also tried moving a functional guest from MachineA to MachineB to
> > boot it there, with the same results. Guest will not boot on MachineB.
> >
> > f8 64bit guests will boot on MachineB with no problems.
> > f10 32bit guests will boot on MachineB with no problems.
> >
> > Only 64bit machines seem to be borked.
>
> Okay, sounds like it might "just" be a F10 kernel bug.
>
> Try doing this to get a stack trace:
>
> 1) Set "on_crash=preserve" in your domain config
>
> 2) Copy the guest kernel's System.map to the host
>
> 3) Once the guest has crashed, run:
>
> /usr/lib/xen/bin/xenctx -s System.map <domid>
>
> Cheers,
> Mark.
>
>
>
> --
> Fedora-xen mailing list
> Fedora-xen@xxxxxxxxxx
> https://www.redhat.com/mailman/listinfo/fedora-xen
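
For anyone comparing the two hosts, the flag lists from the /proc/cpuinfo
output at the top of this thread can be diffed mechanically. What follows
is only a sketch in Python, under a few assumptions: the two strings are
copied from the mail as posted, both were cut off with "..." there, so the
result covers only the flags that were actually shown, and the helper name
flag_diff is made up for illustration.

```python
# Flag lists copied from the /proc/cpuinfo output quoted in this thread.
# Both were truncated with "..." in the original mail, so this compares
# only the flags that were actually posted.
MACHINE_A = (
    "fpu tsc msr pae mce cx8 apic mtrr mca cmov pat pse36 clflush "
    "mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt lm 3dnowext "
    "3dnow pni lahf_lm cmp_legacy"
)
MACHINE_B = (
    "fpu tsc msr pae mce cx8 apic mtrr mca cmov pat pse36 clflush "
    "dts acpi mmx fxsr sse sse2 ss ht tm syscall lm constant_tsc "
    "pni monitor ds_cpl est cid cx16 xtpr lahf_lm"
)


def flag_diff(a, b):
    """Hypothetical helper: return (flags only in a, flags only in b), sorted."""
    set_a, set_b = set(a.split()), set(b.split())
    return sorted(set_a - set_b), sorted(set_b - set_a)


only_a, only_b = flag_diff(MACHINE_A, MACHINE_B)
print("only on MachineA:", " ".join(only_a))
print("only on MachineB:", " ".join(only_b))
```

With the lists as posted, nx shows up only on the MachineA side. The value
in rsi in the crash dump (800000014ffc6061) has bit 63 set, which is the NX
bit in an x86-64 page table entry, so this may be worth checking against
the "Bad L1 flags 800000" line in xm dmesg, though nothing in this thread
confirms that connection.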