Hi Paolo

I am a little confused here. You said "Still, indeed it's OVMF's fault." and "Still an EDK2 problem."?

An EDKII BIOS should always create a 1:1 virtual-to-physical address mapping, but I am not clear about the OS waking vector. For "EPT_VIOLATION rip 0xffffffff81000110", does that happen in the EDKII BIOS or in the OS waking vector?

All in all, I am interested to know one thing first: does OVMF crash in the BIOS before jumping to the OS waking vector, or does it crash inside the OS waking vector?

Thank you
Yao Jiewen

-----Original Message-----
From: Paolo Bonzini [mailto:pbonzini@xxxxxxxxxx]
Sent: Friday, December 06, 2013 9:31 PM
Cc: edk2-devel@xxxxxxxxxxxxxxxxxxxxx; KVM devel mailing list
Subject: Re: [edk2] apparent KVM problem with LRET in TianoCore S3 resume trampoline

On 06/12/2013 13:03, Paolo Bonzini wrote:
> The page tables are, ahem, crap:
>
> 000c000: 6750 fe01 0000 0000 0000 0000 0000 0000 gP..............
> 000c010: 0000 0000 0000 0000 0000 0000 0000 0000 ................
> 000c020: 0000 0000 0000 0000 0000 0000 0000 0000 ................
> 000c030: 0000 0000 0000 0000 0000 0000 0000 0000 ................
> 000c040: 0000 0000 0000 0000 0000 0000 0000 0000 ................
> 000c050: 0000 0000 0000 0000 0000 0000 0000 0000 ................
> 000c060: 0000 0000 0000 0000 0000 0000 0000 0000 ................
> 000c070: 0000 0000 0000 0000 0000 0000 0000 0000 ................
> 000c080: 0000 0000 0000 0000 0000 0000 0000 0000 ................
> 000c090: 0000 0000 0000 0000 0000 0000 0000 0000 ................
> 000c0a0: 0000 0000 0000 0000 0000 0000 0000 0000 ................
> 000c0b0: 0000 0000 0000 0000 0000 0000 0000 0000 ................
> 000c0c0: 0000 0000 0000 0000 0000 0000 0000 0000 ................
> 000c0d0: 0000 0000 0000 0000 0000 0000 0000 0000 ................
> 000c0e0: 0000 0000 0000 0000 0000 0000 0000 0000 ................
> 000c0f0: 0000 0000 0000 0000 0000 0000 0000 0000 ................
>
> This is 0x9c000. Ring any bells?
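
As an aside, the single non-zero entry in the quoted dump can be decoded by hand. The following is a minimal sketch in Python, assuming the standard x86-64 page-table entry layout (little-endian 64-bit entries, physical address in bits 12..51, flag bits in the low byte); the helper name decode_entry is made up for illustration and is not part of any tool used in this thread.

```python
import struct

# First 8 bytes at 0x9c000 as shown in the dump (xxd groups: 6750 fe01 0000 0000).
raw = bytes.fromhex("6750fe0100000000")

# x86 paging-structure entries are little-endian 64-bit values.
(entry,) = struct.unpack("<Q", raw)

# Low flag bits of an x86-64 paging-structure entry. Bit 6 is only
# meaningful as "dirty" in a leaf entry; it is listed here for completeness.
FLAG_NAMES = {0: "present", 1: "writable", 2: "user", 5: "accessed", 6: "dirty"}


def decode_entry(value):
    """Hypothetical helper: split an entry into (physical address, flag names)."""
    phys = value & 0x000FFFFFFFFFF000  # bits 12..51
    flags = [name for bit, name in FLAG_NAMES.items() if value & (1 << bit)]
    return phys, flags


phys, flags = decode_entry(entry)
print(hex(entry), hex(phys), flags)
```

Under those assumptions the entry points at 0x1fe5000, which is the address of the second kvm_page_fault in the trace that follows, so the top-level table looks sane.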

Uh-oh, actually it's fine and it's my turn to say I didn't look far enough. That far jump is not where it's failing. It's quite close, but it's at a much more interesting place. Still, indeed it's OVMF's fault.

I tried tracing again, this time without unrestricted_guest. I wanted to see emulation around the time we enter long mode, but it went a little past that place. We get interesting results anyway because the EPT tables are rebuilt on changes to CR0.PG:

qemu-system-x86-6785 [003] 67184.361164: kvm_exit: reason EPT_VIOLATION rip 0x9aeca info 81 0
qemu-system-x86-6785 [003] 67184.361165: kvm_page_fault: address 9c000 error_code 81 level 4
qemu-system-x86-6785 [003] 67184.361165: kvm_entry: vcpu 0
qemu-system-x86-6785 [003] 67184.361166: kvm_exit: reason EPT_VIOLATION rip 0x9aeca info 81 0
qemu-system-x86-6785 [003] 67184.361166: kvm_page_fault: address 1fe5000 error_code 81
qemu-system-x86-6785 [003] 67184.361168: kvm_mmu_get_page: new sp gfn 1e00 0/1 q0 direct --- !pge !nxe root 0 sync level 3
qemu-system-x86-6785 [003] 67184.361169: kvm_entry: vcpu 0
qemu-system-x86-6785 [003] 67184.361169: kvm_exit: reason EPT_VIOLATION rip 0x9aeca info 81 0
qemu-system-x86-6785 [003] 67184.361169: kvm_page_fault: address 1fe6000 error_code 81 level 2
qemu-system-x86-6785 [003] 67184.361170: kvm_entry: vcpu 0
qemu-system-x86-6785 [003] 67184.361171: kvm_exit: reason EPT_VIOLATION rip 0x9aeca info 81 0
qemu-system-x86-6785 [003] 67184.361171: kvm_page_fault: address 1fe74d0 error_code 81 level 1

(note 0x4D0 means the 0x4D*2=0x9A-th entry, i.e. virtual address 0x9A000)

Another, simpler way to get this information would be to attach gdb to the running machine. On one hand, setting breakpoints is easy (remember they are virtual addresses, and always use hardware breakpoints with "hb" so that you do not touch memory). On the other hand, it's complicated to use gdb across mode switches, and we're quite lucky that tracing got us what we need so fast!
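
The arithmetic in the 0x4D0 note can be checked with a few lines of Python. This is only a sketch: it assumes 8-byte entries and 4 KiB pages, and the shortcut of ignoring the upper levels works here only because the three earlier faults in the walk (9c000, 1fe5000, 1fe6000) all hit entry 0 of their tables. The function name is hypothetical.

```python
ENTRY_SIZE = 8      # bytes per 64-bit page-table entry
PAGE_SIZE = 0x1000  # 4 KiB pages


def last_level_entry_to_virtual(fault_addr):
    """Hypothetical helper: map the address of a last-level entry to
    (entry index, virtual address), assuming all upper-level indices are 0."""
    offset = fault_addr & (PAGE_SIZE - 1)  # offset of the entry inside its table
    index = offset // ENTRY_SIZE           # 0x4D0 / 8 == 0x4D * 2
    return index, index * PAGE_SIZE


index, vaddr = last_level_entry_to_virtual(0x1FE74D0)
print(hex(index), hex(vaddr))
```

The resulting page, 0x9A000, is the one containing the faulting rip 0x9aeca, so the walk is for the code that is currently executing.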

qemu-system-x86-6785 [003] 67184.361171: kvm_entry: vcpu 0
qemu-system-x86-6785 [003] 67184.361172: kvm_exit: reason EPT_VIOLATION rip 0xffffffff81000110 info 81 0
qemu-system-x86-6785 [003] 67184.361172: kvm_page_fault: address 1c0fff0 error_code 81 level 4
qemu-system-x86-6785 [003] 67184.361173: kvm_mmu_get_page: new sp gfn 1c00 0/1 q0 direct --- !pge !nxe root 0 sync
qemu-system-x86-6785 [003] 67184.361174: kvm_entry: vcpu 0
qemu-system-x86-6785 [003] 67184.361174: kvm_exit: reason EPT_VIOLATION rip 0xffffffff81000110 info 81 0
qemu-system-x86-6785 [003] 67184.361174: kvm_page_fault: address 1c10040 error_code 81 level 3

We should be here:

0xffffffff81000110: mov $0x1c0c000,%rax
0xffffffff81000117: mov $0xa0,%ecx
0xffffffff8100011c: mov %rcx,%cr4
0xffffffff8100011f: add 0xc12eea(%rip),%rax # 0xffffffff81c13010
0xffffffff81000126: mov %rax,%cr3
0xffffffff81000129: mov $0xffffffff81000132,%rax
0xffffffff81000130: jmpq *%rax

(grabbed from "dump-guest-memory -p" and gdb's disass command, right after suspending the system)

qemu-system-x86-6785 [003] 67184.361175: kvm_entry: vcpu 0
qemu-system-x86-6785 [003] 67184.361176: kvm_exit: reason EPT_VIOLATION rip 0xffffffff81000113 info 181 0
qemu-system-x86-6785 [003] 67184.361176: kvm_page_fault: address 48 error_code 181

This rip is bogus! Let's grab another "dump-guest-memory -p", this time after shutdown; remember I'm using -no-shutdown -no-reboot:

0xffffffff81000110: mov -0x18(%rbp),%eax
0xffffffff81000113: mov 0x48(%rax),%rax
0xffffffff81000117: mov -0x30(%rbp),%rsi
0xffffffff8100011b: lea -0x48(%rbp),%rdi
0xffffffff8100011f: mov -0x18(%rbp),%rcx
0xffffffff81000123: lea -0x40(%rbp),%rdx
0xffffffff81000127: mov %rdx,0x28(%rsp)
0xffffffff8100012c: lea -0x38(%rbp),%rdx

Uh oh. Something is corrupting virtual address 0xffffffff81000110, which corresponds to physical address 0x1000110.
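
The virtual-to-physical step above can be reproduced as follows. This is a sketch under two assumptions not stated in the thread: the guest is an x86-64 Linux kernel whose text mapping starts at 0xffffffff80000000 (__START_KERNEL_map), and the kernel is not relocated (physical base 0). The function name is made up for illustration.

```python
# Start of the x86-64 Linux kernel text mapping (__START_KERNEL_map).
KERNEL_TEXT_BASE = 0xFFFFFFFF80000000


def kernel_text_to_phys(vaddr, phys_base=0):
    """Hypothetical helper: translate a kernel-text virtual address to a
    physical address, assuming a fixed linear mapping and the given phys_base."""
    if vaddr < KERNEL_TEXT_BASE:
        raise ValueError("address is below the kernel text mapping")
    return vaddr - KERNEL_TEXT_BASE + phys_base


print(hex(kernel_text_to_phys(0xFFFFFFFF81000110)))
```

With those assumptions the corrupted instruction bytes live at guest-physical 0x1000110, which is the address worth watching from the firmware side.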

qemu-system-x86-6785 [003] 67184.361177: kvm_entry: vcpu 0
qemu-system-x86-6785 [003] 67184.361177: kvm_exit: reason EPT_VIOLATION rip 0xffffffff81000127 info 182 0

This rip is also bogus, so it is no surprise that it triple faults soon:

qemu-system-x86-6785 [003] 67184.361177: kvm_page_fault: address 9e048 error_code 182
qemu-system-x86-6785 [003] 67184.361178: kvm_entry: vcpu 0
qemu-system-x86-6785 [003] 67184.361179: kvm_exit: reason TRIPLE_FAULT rip 0x0 info 0 0

Still an EDK2 problem. Perhaps you can dump the first few bytes of 0x1000110..0x100011f every time a PEIM is loaded?

Paolo
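
For readers following the "info" values in the trace (81, 181, 182), here is a small Python sketch that decodes them. It assumes the first info field of kvm_exit is the VMX exit qualification for an EPT violation as described in the Intel SDM (bits 0 to 2: read, write, instruction fetch; bit 7: guest linear address valid; bit 8: access is to the final translation rather than to a paging-structure entry). The helper name is hypothetical.

```python
ACCESS_BITS = ((0, "read"), (1, "write"), (2, "fetch"))


def decode_ept_qualification(qual):
    """Hypothetical helper: return (access types, kind of access) for an
    EPT-violation exit qualification, under the bit layout assumed above."""
    access = [name for bit, name in ACCESS_BITS if qual & (1 << bit)]
    if not qual & (1 << 7):
        kind = "no linear address"
    elif qual & (1 << 8):
        kind = "final translation"  # the data or code access itself
    else:
        kind = "page walk"          # access to a guest paging-structure entry
    return access, kind


for qual in (0x81, 0x181, 0x182):
    print(hex(qual), decode_ept_qualification(qual))
```

Read this way, the 0x81 exits are guest page-table walks being faulted in, while 0x181 at address 48 and 0x182 at address 9e048 are the data read and write performed by the bogus instructions at 0xffffffff81000113 and 0xffffffff81000127.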