On 20/12/18 23:21, Sean Christopherson wrote: > ____kvm_handle_fault_on_reboot() provides a generic exception fixup > handler that is used to cleanly handle faults on VMX/SVM instructions > during reboot (or at least try to). If there isn't a reboot in > progress, ____kvm_handle_fault_on_reboot() treats any exception as > fatal to KVM and invokes kvm_spurious_fault(), which in turn generates > a BUG() to get a stack trace and die. > > When it was originally added by commit 4ecac3fd6dc2 ("KVM: Handle > virtualization instruction #UD faults during reboot"), the "call" to > kvm_spurious_fault() was handcoded as PUSH+JMP, where the PUSH'd value > is the RIP of the faulting instructing. > > The PUSH+JMP trickery is necessary because the exception fixup handler > code lies outside of its associated function, e.g. right after the > function. An actual CALL from the .fixup code would show a slightly > bogus stack trace, e.g. an extra "random" function would be inserted > into the trace, as the return RIP on the stack would point to no known > function (and the unwinder will likely try to guess who owns the RIP). > > Unfortunately, the JMP was replaced with a CALL when the macro was > reworked to not spin indefinitely during reboot (commit b7c4145ba2eb > "KVM: Don't spin on virt instruction faults during reboot"). This > causes the aforementioned behavior where a bogus function is inserted > into the stack trace, e.g. my builds like to blame free_kvm_area(). > > Revert the CALL back to a JMP. The changelog for commit b7c4145ba2eb > ("KVM: Don't spin on virt instruction faults during reboot") contains > nothing that indicates the switch to CALL was deliberate. This is > backed up by the fact that the PUSH <insn RIP> was left intact. > > Note that an alternative to the PUSH+JMP magic would be to JMP back > to the "real" code and CALL from there, but that would require adding > a JMP in the non-faulting path to avoid calling kvm_spurious_fault() > and would add no value, i.e. the stack trace would be the same. > > Using CALL: > > ------------[ cut here ]------------ > kernel BUG at /home/sean/go/src/kernel.org/linux/arch/x86/kvm/x86.c:356! > invalid opcode: 0000 [#1] SMP > CPU: 4 PID: 1057 Comm: qemu-system-x86 Not tainted 4.20.0-rc6+ #75 > Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015 > RIP: 0010:kvm_spurious_fault+0x5/0x10 [kvm] > Code: <0f> 0b 66 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 41 55 49 89 fd 41 > RSP: 0018:ffffc900004bbcc8 EFLAGS: 00010046 > RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffffffffffffff > RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000 > RBP: ffff888273fd8000 R08: 00000000000003e8 R09: 0000000000000000 > R10: 0000000000000000 R11: 0000000000000784 R12: ffffc90000371fb0 > R13: 0000000000000000 R14: 000000026d763cf4 R15: ffff888273fd8000 > FS: 00007f3d69691700(0000) GS:ffff888277800000(0000) knlGS:0000000000000000 > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > CR2: 000055f89bc56fe0 CR3: 0000000271a5a001 CR4: 0000000000362ee0 > Call Trace: > free_kvm_area+0x1044/0x43ea [kvm_intel] > ? vmx_vcpu_run+0x156/0x630 [kvm_intel] > ? kvm_arch_vcpu_ioctl_run+0x447/0x1a40 [kvm] > ? kvm_vcpu_ioctl+0x368/0x5c0 [kvm] > ? kvm_vcpu_ioctl+0x368/0x5c0 [kvm] > ? __set_task_blocked+0x38/0x90 > ? __set_current_blocked+0x50/0x60 > ? __fpu__restore_sig+0x97/0x490 > ? do_vfs_ioctl+0xa1/0x620 > ? __x64_sys_futex+0x89/0x180 > ? ksys_ioctl+0x66/0x70 > ? __x64_sys_ioctl+0x16/0x20 > ? do_syscall_64+0x4f/0x100 > ? entry_SYSCALL_64_after_hwframe+0x44/0xa9 > Modules linked in: vhost_net vhost tap kvm_intel kvm irqbypass bridge stp llc > ---[ end trace 9775b14b123b1713 ]--- > > Using JMP: > > ------------[ cut here ]------------ > kernel BUG at /home/sean/go/src/kernel.org/linux/arch/x86/kvm/x86.c:356! > invalid opcode: 0000 [#1] SMP > CPU: 6 PID: 1067 Comm: qemu-system-x86 Not tainted 4.20.0-rc6+ #75 > Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015 > RIP: 0010:kvm_spurious_fault+0x5/0x10 [kvm] > Code: <0f> 0b 66 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 41 55 49 89 fd 41 > RSP: 0018:ffffc90000497cd0 EFLAGS: 00010046 > RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffffffffffffff > RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000 > RBP: ffff88827058bd40 R08: 00000000000003e8 R09: 0000000000000000 > R10: 0000000000000000 R11: 0000000000000784 R12: ffffc90000369fb0 > R13: 0000000000000000 R14: 00000003c8fc6642 R15: ffff88827058bd40 > FS: 00007f3d7219e700(0000) GS:ffff888277900000(0000) knlGS:0000000000000000 > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > CR2: 00007f3d64001000 CR3: 0000000271c6b004 CR4: 0000000000362ee0 > Call Trace: > vmx_vcpu_run+0x156/0x630 [kvm_intel] > ? kvm_arch_vcpu_ioctl_run+0x447/0x1a40 [kvm] > ? kvm_vcpu_ioctl+0x368/0x5c0 [kvm] > ? kvm_vcpu_ioctl+0x368/0x5c0 [kvm] > ? __set_task_blocked+0x38/0x90 > ? __set_current_blocked+0x50/0x60 > ? __fpu__restore_sig+0x97/0x490 > ? do_vfs_ioctl+0xa1/0x620 > ? __x64_sys_futex+0x89/0x180 > ? ksys_ioctl+0x66/0x70 > ? __x64_sys_ioctl+0x16/0x20 > ? do_syscall_64+0x4f/0x100 > ? entry_SYSCALL_64_after_hwframe+0x44/0xa9 > Modules linked in: vhost_net vhost tap kvm_intel kvm irqbypass bridge stp llc > ---[ end trace f9daedb85ab3ddba ]--- > > Fixes: b7c4145ba2eb ("KVM: Don't spin on virt instruction faults during reboot") > Cc: stable@xxxxxxxxxxxxxxx > Signed-off-by: Sean Christopherson <sean.j.christopherson@xxxxxxxxx> > --- > arch/x86/include/asm/kvm_host.h | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h > index bca77c25a19a..311763be2e8a 100644 > --- a/arch/x86/include/asm/kvm_host.h > +++ b/arch/x86/include/asm/kvm_host.h > @@ -1493,7 +1493,7 @@ asmlinkage void kvm_spurious_fault(void); > "cmpb $0, kvm_rebooting \n\t" \ > "jne 668b \n\t" \ > __ASM_SIZE(push) " $666b \n\t" \ > - "call kvm_spurious_fault \n\t" \ > + "jmp kvm_spurious_fault \n\t" \ > ".popsection \n\t" \ > _ASM_EXTABLE(666b, 667b) > > Queued, thanks. Paolo