> On 11 Nov 2019, at 17:02, Paolo Bonzini <pbonzini@xxxxxxxxxx> wrote: > > On 11/11/19 13:30, Liran Alon wrote: >> When L1 don't use TPR-Shadow to run L2, L0 configures vmcs02 without >> TPR-Shadow and install intercepts on CR8 access (load and store). >> >> If L1 do not intercept L2 CR8 access, L0 intercepts on those accesses >> will emulate load/store on L1's LAPIC TPR. If in this case L2 lowers >> TPR such that there is now an injectable interrupt to L1, >> apic_update_ppr() will request a KVM_REQ_EVENT which will trigger a call >> to update_cr8_intercept() to update TPR-Threshold to highest pending IRR >> priority. >> >> However, this update to TPR-Threshold is done while active vmcs is >> vmcs02 instead of vmcs01. Thus, when later at some point L0 will >> emulate an exit from L2 to L1, L1 will still run with high >> TPR-Threshold. This will result in every VMEntry to L1 to immediately >> exit on TPR_BELOW_THRESHOLD and continue to do so infinitely until >> some condition will cause KVM_REQ_EVENT to be set. >> (Note that TPR_BELOW_THRESHOLD exit handler do not set KVM_REQ_EVENT >> until apic_update_ppr() will notice a new injectable interrupt for PPR) >> >> To fix this issue, change update_cr8_intercept() such that if L2 lowers >> L1's TPR in a way that requires to lower L1's TPR-Threshold, save update >> to TPR-Threshold and apply it to vmcs01 when L0 emulates an exit from >> L2 to L1. > > Can you explain why the write shouldn't be done to vmcs02 as well? > > Paolo Because when L1 don’t use TPR-Shadow, L0 configures vmcs02 without TPR-Shadow. Thus, writing to vmcs02->tpr_threshold doesn’t have any effect. If l1 do use TPR-Shadow, then VMX’s update_cr8_intercept() doesn’t write to vmcs at all, because it means L1 defines a vTPR for L2 and thus doesn’t provide it direct access to L1 TPR. -Liran > >> - vmcs_write32(TPR_THRESHOLD, tpr_threshold); >> + >> + if (is_guest_mode(vcpu)) >> + to_vmx(vcpu)->nested.l1_tpr_threshold = tpr_threshold; >> + else >> + vmcs_write32(TPR_THRESHOLD, tpr_threshold); >> } >> >> void vmx_set_virtual_apic_mode(struct kvm_vcpu *vcpu) >> diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h >> index bee16687dc0b..43331dfafffe 100644 >> --- a/arch/x86/kvm/vmx/vmx.h >> +++ b/arch/x86/kvm/vmx/vmx.h >> @@ -167,6 +167,9 @@ struct nested_vmx { >> u64 vmcs01_debugctl; >> u64 vmcs01_guest_bndcfgs; >> >> + /* to migrate it to L1 if L2 writes to L1's CR8 directly */ >> + int l1_tpr_threshold; >> + >> u16 vpid02; >> u16 last_vpid; >> >> >