Hi, thanks for the reviews. However, upon thinking some more about this I'd like to withdraw version v2 of the patch. See below: On Fri, Oct 19, 2018 at 08:40:08AM +0800, Wanpeng Li wrote: > > As the current hardware value of %dr6 is already in vcpu->arch.dr6 > > ... This part is not true. L1 can start debugging itself and stop again or simply write to %dr6. This will result in an arbitrary value in vcpu->arch.dr6 and vcpu->arch.switch_db_regs will be zero. In this situation we don't want to clear %dr6 as it cannot be changed by the guest. However, we cannot detect this situation by looking at vcpu->arch.dr6. Instead track a guest entry that potentially loads a guest value into %dr6 in switch_db_regs explicitly. This has the downside that the check before kvm_x86_ops->run(vcpu) is a bit more ugly. So below is v3 of the patch. Tested on the same nested setup as v2. regards Christian >From d8153e6581570c7be2e5e65ae2e2f700ce1ea3b0 Mon Sep 17 00:00:00 2001 From: Christian Ehrhardt <lk@xxxxxx> Date: Mon, 15 Oct 2018 20:05:48 +0200 Subject: [PATCH 1/2] KVM: x86: Only clear %dr6 if really neccessary Change efdab992: "KVM: x86: fix escape of guest dr6 to the host" has negative effects if Linux runs as the L1 guest in a nested VM setup: Each call to kvm_arch_vcpu_put causes an additional L1->L0 exit due to the %dr6 write even if neither L1 nor L2 uses debug registers. Thus track if we must clear %dr6 in switch_db_regs. Update the comment and reformat it while there. Signed-off-by: Christian Ehrhardt <lk@xxxxxxx> --- arch/x86/include/asm/kvm_host.h | 1 + arch/x86/kvm/x86.c | 16 +++++++++++----- 2 files changed, 12 insertions(+), 5 deletions(-) diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h index 09b2e3e2cf1b..773133dd9258 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -440,6 +440,7 @@ enum { KVM_DEBUGREG_BP_ENABLED = 1, KVM_DEBUGREG_WONT_EXIT = 2, KVM_DEBUGREG_RELOAD = 4, + KVM_DEBUGREG_GUEST_LOADED = 8, }; struct kvm_mtrr_range { diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index ca717737347e..e66dbbb1e2c2 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -3185,11 +3185,15 @@ void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu) kvm_x86_ops->vcpu_put(vcpu); vcpu->arch.last_host_tsc = rdtsc(); /* - * If userspace has set any breakpoints or watchpoints, dr6 is restored - * on every vmexit, but if not, we might have a stale dr6 from the - * guest. do_debug expects dr6 to be cleared after it runs, do the same. + * If userspace has set any breakpoints or watchpoints, dr6 is + * restored on every vmexit, but if not, we might have a stale + * dr6 from the guest. Thus reset the hardware value of dr6 to + * its reset value in this case. */ - set_debugreg(0, 6); + if (unlikely(vcpu->arch.switch_db_regs & KVM_DEBUGREG_GUEST_LOADED)) { + set_debugreg(0, 6); + vcpu->arch.switch_db_regs &= ~KVM_DEBUGREG_GUEST_LOADED; + } } static int kvm_vcpu_ioctl_get_lapic(struct kvm_vcpu *vcpu, @@ -7592,7 +7596,7 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu) wait_lapic_expire(vcpu); guest_enter_irqoff(); - if (unlikely(vcpu->arch.switch_db_regs)) { + if (unlikely(vcpu->arch.switch_db_regs & ~KVM_DEBUGREG_GUEST_LOADED)) { set_debugreg(0, 7); set_debugreg(vcpu->arch.eff_db[0], 0); set_debugreg(vcpu->arch.eff_db[1], 1); @@ -7600,6 +7604,7 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu) set_debugreg(vcpu->arch.eff_db[3], 3); set_debugreg(vcpu->arch.dr6, 6); vcpu->arch.switch_db_regs &= ~KVM_DEBUGREG_RELOAD; + vcpu->arch.switch_db_regs |= KVM_DEBUGREG_GUEST_LOADED; } kvm_x86_ops->run(vcpu); @@ -7617,6 +7622,7 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu) kvm_update_dr6(vcpu); kvm_update_dr7(vcpu); vcpu->arch.switch_db_regs &= ~KVM_DEBUGREG_RELOAD; + vcpu->arch.switch_db_regs |= KVM_DEBUGREG_GUEST_LOADED; } /* -- 2.17.1