When using SNP, accessing an encrypted guest page from the host triggers
an RMP fault. The page fault handling code currently handles this by
looking up the corresponding RMP entry. If the same operation happens
when using nested virtualization, the L0 hypervisor sees a #NPF, but the
CPU does not provide the address of the fault if the CPU was running at
L1 at the time of the fault. This happens on Hyper-V when using nested
SNP guests. Hyper-V has no choice but to use a placeholder address (0)
when injecting the page fault into L1. We need to handle this, and the
only sane thing to do is to send a SIGBUS to the task.

One path where this happens is when the SNP guest issues a
KVM_HC_CLOCK_PAIRING hypercall, which leads to KVM calling
kvm_write_guest() on a guest-supplied address. This results in the
following backtrace:

[ 191.862660] exc_page_fault+0x71/0x170
[ 191.862664] asm_exc_page_fault+0x2c/0x40
[ 191.862666] RIP: 0010:copy_user_enhanced_fast_string+0xa/0x40
...
[ 191.862677] ? __kvm_write_guest_page+0x6e/0xa0 [kvm]
[ 191.862700] kvm_write_guest_page+0x52/0xc0 [kvm]
[ 191.862788] kvm_write_guest+0x44/0x80 [kvm]
[ 191.862807] kvm_emulate_hypercall+0x1ca/0x5a0 [kvm]
[ 191.862830] ? kvm_emulate_monitor+0x40/0x40 [kvm]
[ 191.862849] svm_invoke_exit_handler+0x74/0x180 [kvm_amd]
[ 191.862854] sev_handle_vmgexit+0xf42/0x17f0 [kvm_amd]
[ 191.862858] ? __this_cpu_preempt_check+0x13/0x20
[ 191.862860] ? sev_post_map_gfn+0xf0/0xf0 [kvm_amd]
[ 191.862863] svm_invoke_exit_handler+0x74/0x180 [kvm_amd]
[ 191.862866] svm_handle_exit+0xb5/0x2b0 [kvm_amd]
[ 191.862869] kvm_arch_vcpu_ioctl_run+0x12a8/0x1aa0 [kvm]
[ 191.862891] kvm_vcpu_ioctl+0x24f/0x6d0 [kvm]
[ 191.862910] ? kvm_vm_ioctl_irq_line+0x27/0x40 [kvm]
[ 191.862929] ? _copy_to_user+0x25/0x30
[ 191.862932] ? kvm_vm_ioctl+0x291/0xea0 [kvm]
[ 191.862951] ? kvm_vm_ioctl+0x291/0xea0 [kvm]
[ 191.862970] ? __fget_light+0xc5/0x100
[ 191.862972] __x64_sys_ioctl+0x91/0xc0
[ 191.862975] do_syscall_64+0x5c/0x80
[ 191.862976] ? exit_to_user_mode_prepare+0x53/0x240
[ 191.862978] ? syscall_exit_to_user_mode+0x17/0x40
[ 191.862980] ? do_syscall_64+0x69/0x80
[ 191.862981] ? do_syscall_64+0x69/0x80
[ 191.862982] ? syscall_exit_to_user_mode+0x17/0x40
[ 191.862983] ? do_syscall_64+0x69/0x80
[ 191.862984] ? syscall_exit_to_user_mode+0x17/0x40
[ 191.862985] ? do_syscall_64+0x69/0x80
[ 191.862986] ? do_syscall_64+0x69/0x80
[ 191.862987] entry_SYSCALL_64_after_hwframe+0x46/0xb0

Without this fix, the handler returns without doing anything and the
result is a soft lockup of the CPU.

Signed-off-by: Jeremi Piotrowski <jpiotrowski@xxxxxxxxxxxxxxxxxxx>
---
 arch/x86/mm/fault.c | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c
index f2b16dcfbd9a..8706fd34f3a9 100644
--- a/arch/x86/mm/fault.c
+++ b/arch/x86/mm/fault.c
@@ -34,6 +34,7 @@
 #include <asm/vdso.h>			/* fixup_vdso_exception()	*/
 #include <asm/irq_stack.h>
 #include <asm/sev.h>			/* snp_lookup_rmpentry()	*/
+#include <asm/hypervisor.h>		/* hypervisor_is_type()		*/
 
 #define CREATE_TRACE_POINTS
 #include <asm/trace/exceptions.h>
@@ -1282,6 +1283,18 @@ static int handle_user_rmp_page_fault(struct pt_regs *regs, unsigned long error_
 	pte_t *pte;
 	u64 pfn;
 
+	/*
+	 * When an rmp fault occurs while not inside the SNP guest, the L0
+	 * hypervisor sees a NPF and does not have access to the address that
+	 * caused the fault to forward to L1 hypervisor. Hyper-V places a 0 in
+	 * the PF as a placeholder. SIGBUS the task since there's nothing
+	 * better that we can do.
+	 */
+	if (!address && hypervisor_is_type(X86_HYPER_MS_HYPERV)) {
+		do_sigbus(regs, error_code, address, VM_FAULT_SIGBUS);
+		return 1;
+	}
+
 	pgd = __va(read_cr3_pa());
 	pgd += pgd_index(address);
 
--
2.25.1
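
For reviewers who want to reason about the new guard in isolation, the
decision it makes can be modeled in plain userspace C as below. This is
only an illustrative sketch, not kernel code: classify_rmp_fault(), enum
rmp_fault_action, rmp_fault_selfcheck() and the on_hyperv flag are
hypothetical names, with on_hyperv standing in for the result of
hypervisor_is_type(X86_HYPER_MS_HYPERV).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the two outcomes; these names are not kernel API. */
enum rmp_fault_action {
	RMP_FAULT_SIGBUS,		/* give up and signal the task */
	RMP_FAULT_WALK_PAGE_TABLES,	/* continue with the existing lookup */
};

/*
 * Assumption: on_hyperv mirrors hypervisor_is_type(X86_HYPER_MS_HYPERV).
 * A zero address is treated as the Hyper-V placeholder only in that case.
 */
enum rmp_fault_action classify_rmp_fault(unsigned long address, bool on_hyperv)
{
	if (!address && on_hyperv)
		return RMP_FAULT_SIGBUS;

	return RMP_FAULT_WALK_PAGE_TABLES;
}

/* Small self-check covering the three interesting input combinations. */
void rmp_fault_selfcheck(void)
{
	assert(classify_rmp_fault(0, true) == RMP_FAULT_SIGBUS);
	assert(classify_rmp_fault(0, false) == RMP_FAULT_WALK_PAGE_TABLES);
	assert(classify_rmp_fault(0x1000UL, true) == RMP_FAULT_WALK_PAGE_TABLES);
}
```

The sketch shows why the check is gated on the hypervisor type: a zero
address only counts as a placeholder when running under Hyper-V, and all
other faults still take the existing page table walk.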