Re: [edk2] apparent KVM problem with LRET in TianoCore S3 resume trampoline

Paolo Bonzini <pbonzini@xxxxxxxxxx> · Fri, 06 Dec 2013 14:31:03 +0100

On 06/12/2013 13:03, Paolo Bonzini wrote:
> The page tables are, ahem, crap:
> 
> 000c000: 6750 fe01 0000 0000 0000 0000 0000 0000  gP..............
> 000c010: 0000 0000 0000 0000 0000 0000 0000 0000  ................
> 000c020: 0000 0000 0000 0000 0000 0000 0000 0000  ................
> 000c030: 0000 0000 0000 0000 0000 0000 0000 0000  ................
> 000c040: 0000 0000 0000 0000 0000 0000 0000 0000  ................
> 000c050: 0000 0000 0000 0000 0000 0000 0000 0000  ................
> 000c060: 0000 0000 0000 0000 0000 0000 0000 0000  ................
> 000c070: 0000 0000 0000 0000 0000 0000 0000 0000  ................
> 000c080: 0000 0000 0000 0000 0000 0000 0000 0000  ................
> 000c090: 0000 0000 0000 0000 0000 0000 0000 0000  ................
> 000c0a0: 0000 0000 0000 0000 0000 0000 0000 0000  ................
> 000c0b0: 0000 0000 0000 0000 0000 0000 0000 0000  ................
> 000c0c0: 0000 0000 0000 0000 0000 0000 0000 0000  ................
> 000c0d0: 0000 0000 0000 0000 0000 0000 0000 0000  ................
> 000c0e0: 0000 0000 0000 0000 0000 0000 0000 0000  ................
> 000c0f0: 0000 0000 0000 0000 0000 0000 0000 0000  ................
> 
> This is 0x9c000.  Does it ring any bells?

Uh-oh, actually it's fine and it's my turn to say I didn't look far
enough.
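
To see why the quoted dump is fine, the first entry can be decoded by hand.
The following is only a sketch: it assumes the dump shows bytes in memory
order and that 0x9c000 holds the top-level (PML4) table, and the helper name
decode_entry is made up for illustration.

```python
import struct

# First 8 bytes of the quoted dump at 0x9c000, in memory order
# ("6750 fe01 0000 0000" as printed by the hex dump).
raw = bytes.fromhex("6750fe0100000000")

# x86 paging-structure entries are 64-bit little-endian values.
(entry,) = struct.unpack("<Q", raw)

ADDR_MASK = 0x000FFFFFFFFFF000  # bits 51:12 hold the next-level table address
# Flag bits that are defined for a PML4 entry (bit 6 is ignored at this level).
FLAG_NAMES = {0: "P", 1: "R/W", 2: "U/S", 3: "PWT", 4: "PCD", 5: "A"}


def decode_entry(value):
    """Split a paging-structure entry into (next table address, flag names)."""
    flags = [name for bit, name in FLAG_NAMES.items() if value & (1 << bit)]
    return value & ADDR_MASK, flags


next_table, flags = decode_entry(entry)
print(hex(entry), hex(next_table), flags)
```

Under those assumptions the entry is present and points at 0x1fe5000, which
is exactly the address of the second EPT violation in the trace.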

That far jump is not where it's failing.  The failure is quite close to it,
but at a much more interesting place.  Still, it is indeed OVMF's fault.

I tried tracing again, this time without unrestricted_guest.  I wanted to
see emulation around the time we enter long mode, but it went a little past
that place.

We get interesting results anyway, because the EPT tables are rebuilt
whenever CR0.PG changes:

 qemu-system-x86-6785  [003] 67184.361164: kvm_exit:             reason EPT_VIOLATION rip 0x9aeca info 81 0
 qemu-system-x86-6785  [003] 67184.361165: kvm_page_fault:       address 9c000 error_code 81

	level 4

 qemu-system-x86-6785  [003] 67184.361165: kvm_entry:            vcpu 0
 qemu-system-x86-6785  [003] 67184.361166: kvm_exit:             reason EPT_VIOLATION rip 0x9aeca info 81 0
 qemu-system-x86-6785  [003] 67184.361166: kvm_page_fault:       address 1fe5000 error_code 81
 qemu-system-x86-6785  [003] 67184.361168: kvm_mmu_get_page:     new sp gfn 1e00 0/1 q0 direct --- !pge !nxe root 0 sync

	level 3

 qemu-system-x86-6785  [003] 67184.361169: kvm_entry:            vcpu 0
 qemu-system-x86-6785  [003] 67184.361169: kvm_exit:             reason EPT_VIOLATION rip 0x9aeca info 81 0
 qemu-system-x86-6785  [003] 67184.361169: kvm_page_fault:       address 1fe6000 error_code 81

	level 2

 qemu-system-x86-6785  [003] 67184.361170: kvm_entry:            vcpu 0
 qemu-system-x86-6785  [003] 67184.361171: kvm_exit:             reason EPT_VIOLATION rip 0x9aeca info 81 0
 qemu-system-x86-6785  [003] 67184.361171: kvm_page_fault:       address 1fe74d0 error_code 81

	level 1 (note that offset 0x4D0 is entry 0x4D0/8 = 0x9A, i.e. virtual address 0x9A000)
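
The four faults above follow the guest's own page walk for rip 0x9aeca.  As a
rough sketch (assuming 4-level paging with 4 KiB pages; the table bases are
the faulting addresses from the trace rounded down to a page, and the name
walk_offsets is hypothetical), the offsets can be recomputed like this:

```python
def walk_offsets(vaddr):
    """Return the byte offset used in each table, from level 4 down to level 1."""
    offsets = {}
    for level in (4, 3, 2, 1):
        shift = 12 + 9 * (level - 1)      # 39, 30, 21, 12
        index = (vaddr >> shift) & 0x1FF  # 9 index bits per level
        offsets[level] = index * 8        # each entry is 8 bytes
    return offsets


# Table base per level, taken from the kvm_page_fault addresses in the trace.
TABLE_BASE = {4: 0x9C000, 3: 0x1FE5000, 2: 0x1FE6000, 1: 0x1FE7000}

offsets = walk_offsets(0x9AECA)
entry_addrs = {lvl: TABLE_BASE[lvl] + off for lvl, off in offsets.items()}
for lvl in (4, 3, 2, 1):
    print(f"level {lvl}: entry at {entry_addrs[lvl]:#x}")
```

Only the level 1 index is nonzero (0x9A), which is why the last fault lands
at 0x1fe7000 + 0x4D0.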

Another, simpler way to get this information would be to attach gdb to the running
machine.  On one hand, setting breakpoints is easy (remember that they are virtual
addresses, and always use hardware breakpoints with "hb" so that you do not touch
guest memory).  On the other hand, gdb is awkward to use across mode switches, so
we're quite lucky that tracing got us what we need so quickly!

 qemu-system-x86-6785  [003] 67184.361171: kvm_entry:            vcpu 0
 qemu-system-x86-6785  [003] 67184.361172: kvm_exit:             reason EPT_VIOLATION rip 0xffffffff81000110 info 81 0
 qemu-system-x86-6785  [003] 67184.361172: kvm_page_fault:       address 1c0fff0 error_code 81

	level 4

 qemu-system-x86-6785  [003] 67184.361173: kvm_mmu_get_page:     new sp gfn 1c00 0/1 q0 direct --- !pge !nxe root 0 sync
 qemu-system-x86-6785  [003] 67184.361174: kvm_entry:            vcpu 0
 qemu-system-x86-6785  [003] 67184.361174: kvm_exit:             reason EPT_VIOLATION rip 0xffffffff81000110 info 81 0
 qemu-system-x86-6785  [003] 67184.361174: kvm_page_fault:       address 1c10040 error_code 81

	level 3.  We should be here:

	   0xffffffff81000110:	mov    $0x1c0c000,%rax
	   0xffffffff81000117:	mov    $0xa0,%ecx
	   0xffffffff8100011c:	mov    %rcx,%cr4
	   0xffffffff8100011f:	add    0xc12eea(%rip),%rax        # 0xffffffff81c13010
	   0xffffffff81000126:	mov    %rax,%cr3
	   0xffffffff81000129:	mov    $0xffffffff81000132,%rax
	   0xffffffff81000130:	jmpq   *%rax

	(grabbed from "dump-guest-memory -p" and gdb's disass command,
	right after suspending the system)

 qemu-system-x86-6785  [003] 67184.361175: kvm_entry:            vcpu 0
 qemu-system-x86-6785  [003] 67184.361176: kvm_exit:             reason EPT_VIOLATION rip 0xffffffff81000113 info 181 0
 qemu-system-x86-6785  [003] 67184.361176: kvm_page_fault:       address 48 error_code 181

	this rip is bogus!  Let's grab another "dump-guest-memory -p", this
	time after shutdown; remember I'm using -no-shutdown -no-reboot:

	   0xffffffff81000110:	mov    -0x18(%rbp),%eax
	   0xffffffff81000113:	mov    0x48(%rax),%rax
	   0xffffffff81000117:	mov    -0x30(%rbp),%rsi
	   0xffffffff8100011b:	lea    -0x48(%rbp),%rdi
	   0xffffffff8100011f:	mov    -0x18(%rbp),%rcx
	   0xffffffff81000123:	lea    -0x40(%rbp),%rdx
	   0xffffffff81000127:	mov    %rdx,0x28(%rsp)
	   0xffffffff8100012c:	lea    -0x38(%rbp),%rdx

	Uh oh.  Something is corrupting virtual address 0xffffffff81000110,
	which corresponds to physical address 0x1000110.
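
	The virtual-to-physical step can be checked with a few lines.  This is a
	sketch that assumes the usual x86-64 Linux layout, where kernel text is
	mapped at 0xffffffff80000000, and a non-relocated kernel (phys_base of 0);
	the function name is made up for illustration.

```python
# Assumed base of the x86-64 Linux kernel text mapping (__START_KERNEL_map).
START_KERNEL_MAP = 0xFFFFFFFF80000000


def kernel_text_virt_to_phys(vaddr, phys_base=0):
    """Translate a kernel-text virtual address to a guest-physical address.

    phys_base is the relocation offset; 0 is an assumption that holds only
    if the kernel was not relocated at boot.
    """
    if vaddr < START_KERNEL_MAP:
        raise ValueError("address is not inside the kernel text mapping")
    return vaddr - START_KERNEL_MAP + phys_base


paddr = kernel_text_virt_to_phys(0xFFFFFFFF81000110)
print(hex(paddr))
```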

 qemu-system-x86-6785  [003] 67184.361177: kvm_entry:            vcpu 0
 qemu-system-x86-6785  [003] 67184.361177: kvm_exit:             reason EPT_VIOLATION rip 0xffffffff81000127 info 182 0

	This rip is also bogus, so it's no surprise that it triple faults soon after.

 qemu-system-x86-6785  [003] 67184.361177: kvm_page_fault:       address 9e048 error_code 182
 qemu-system-x86-6785  [003] 67184.361178: kvm_entry:            vcpu 0
 qemu-system-x86-6785  [003] 67184.361179: kvm_exit:             reason TRIPLE_FAULT rip 0x0 info 0 0
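
The "info" values in the trace also tell the story.  Below is a sketch of how
to read them.  It assumes the first "info" field is the VMX exit
qualification printed in hex, and it decodes only the bits needed here, as
the Intel SDM defines them for EPT violations (bit 0 read, bit 1 write,
bit 2 fetch, bit 7 linear address valid, bit 8 access to the final
translation rather than to a paging-structure entry).  The function name is
hypothetical.

```python
def decode_ept_qual(qual):
    """Return (access types, what the faulting guest-physical access was for)."""
    names = ((0, "read"), (1, "write"), (2, "fetch"))
    access = [name for bit, name in names if qual & (1 << bit)]
    if not qual & (1 << 7):
        target = "unknown (no linear address)"
    elif qual & (1 << 8):
        target = "final translation"
    else:
        target = "paging-structure entry"
    return access, target


for qual in (0x81, 0x181, 0x182):
    print(f"{qual:#x}: {decode_ept_qual(qual)}")
```

Read this way, the 0x81 exits are page-walk reads of the guest page tables,
while 0x181 and 0x182 are an ordinary data read (address 0x48) and a data
write (address 0x9e048) made by the bogus instructions.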

So this is still an EDK2 problem.  Perhaps you could dump the bytes at
0x1000110..0x100011f every time a PEIM is loaded?

Paolo