On Thu, Nov 17, 2022, Maxim Levitsky wrote: > Clean up the nested_sync_int_ctl_from_vmcb02: > > 1. The comment about preservation of V_IRQ is wrong: when the L2 doesn't > use virtual interrupt masking, then the field just doesn't exist in > vmcb12 thus it should not be touched at all. > Since it is unused in this case, touching it doesn't matter that much, > so the bug is theoretical. > > 2. When the L2 doesn't use virtual interrupt masking, then in the *theory* > if KVM uses the feature, it should copy the changes to V_IRQ* bits from > vmcb02 to vmcb01. > > In practise, KVM only uses it for detection of the interrupt window, > and it happens to re-open it on each nested VM exit because > kvm_set_rflags happens to raise the KVM_REQ_EVENT. > Do this explicitly. > > 3. Add comment on why we don't need to copy V_GIF from vmcb02 to vmcb01 > when nested guest doesn't use nested V_GIF (and thus L1's GIF is in > vmcb02 while nested), even though it can in theory affect L1's GIF. > > 4. Add support code to also copy some bits of int_ctl from > vmcb02 to vmcb01. > Currently there are none. Unless it's impossible for whatever reason, this patch should be split into multiple patches. IIUC, there are at least 2 different functional changes being made, they just happen to not have any actual impact on things. > No (visible) functional change is intended. > > Signed-off-by: Maxim Levitsky <mlevitsk@xxxxxxxxxx> > --- > arch/x86/kvm/svm/nested.c | 47 ++++++++++++++++++++++++++------------- > 1 file changed, 32 insertions(+), 15 deletions(-) > > diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c > index 54eb152e2b60b6..1f2b8492c8782f 100644 > --- a/arch/x86/kvm/svm/nested.c > +++ b/arch/x86/kvm/svm/nested.c > @@ -410,28 +410,45 @@ void nested_copy_vmcb_save_to_cache(struct vcpu_svm *svm, > static void nested_sync_int_ctl_from_vmcb02(struct vcpu_svm *svm, > struct vmcb *vmcb12) > { > - u32 mask; > + struct vmcb *vmcb02 = svm->nested.vmcb02.ptr; > + struct vmcb *vmcb01 = svm->vmcb01.ptr; > + > + /* bitmask of bits of int_ctl that we copy from vmcb02 to vmcb12*/ > + u32 l2_to_l1_mask = 0; > + /* bitmask of bits of int_ctl that we copy from vmcb02 to vmcb01*/ > + u32 l2_to_l0_mask = 0; > > - /* Only a few fields of int_ctl are written by the processor. */ Can this comment be kept in some form? I found it super useful when reading this code just now. > - mask = V_IRQ_MASK | V_TPR_MASK; > - if (!(svm->nested.ctl.int_ctl & V_INTR_MASKING_MASK) && > - svm_is_intercept(svm, INTERCEPT_VINTR)) { > + if (svm->nested.ctl.int_ctl & V_INTR_MASKING_MASK) > + l2_to_l1_mask |= V_IRQ_MASK | V_TPR_MASK; > + else { > /* > - * In order to request an interrupt window, L0 is usurping > - * svm->vmcb->control.int_ctl and possibly setting V_IRQ > - * even if it was clear in L1's VMCB. Restoring it would be > - * wrong. However, in this case V_IRQ will remain true until > - * interrupt_window_interception calls svm_clear_vintr and > - * restores int_ctl. We can just leave it aside. > + * If IRQ window was opened while in L2, it must be reopened > + * after the VM exit > + * > + * vTPR value doesn't need to be copied from vmcb02 to vmcb01 > + * because it is synced from/to apic registers on each VM exit > */ > - mask &= ~V_IRQ_MASK; > + if (vmcb02->control.int_ctl & V_IRQ_MASK) > + kvm_make_request(KVM_REQ_EVENT, &svm->vcpu); > } > > if (nested_vgif_enabled(svm)) > - mask |= V_GIF_MASK; > + l2_to_l1_mask |= V_GIF_MASK; > + else > + /* There is no need to sync V_GIF from vmcb02 to vmcb01 > + * because GIF is cleared on VMexit, thus even though > + * nested guest can control host's GIF, on VM exit > + * its set value is lost > + */ > + ; The "else ... ;" is unnecessary, just throw the block comment above the nested vGIF if-statment, e.g. if I'm understanding everything, this? /* * If nested vGIF is not enabled, L2 has access to L1's "real" GIF. In * this case, there's no need to sync V_GIF from vmcb02 to vmcb01 * because GIF is cleared on VM-Exit, thus any changes made by L2 are * overwritten on VM-Exit to L1. */ if (nested_vgif_enabled(svm)) l2_to_l1_mask |= V_GIF_MASK; > + > + vmcb12->control.int_ctl = > + (svm->nested.ctl.int_ctl & ~l2_to_l1_mask) | > + (vmcb02->control.int_ctl & l2_to_l1_mask); > > - vmcb12->control.int_ctl &= ~mask; > - vmcb12->control.int_ctl |= svm->vmcb->control.int_ctl & mask; > + vmcb01->control.int_ctl = > + (vmcb01->control.int_ctl & ~l2_to_l0_mask) | > + (vmcb02->control.int_ctl & l2_to_l0_mask); No need for wrapping immediately after the "=", these all fit under the soft limit: vmcb12->control.int_ctl = (svm->nested.ctl.int_ctl & ~l2_to_l1_mask) | (vmcb02->control.int_ctl & l2_to_l1_mask); vmcb01->control.int_ctl = (vmcb01->control.int_ctl & ~l2_to_l0_mask) | (vmcb02->control.int_ctl & l2_to_l0_mask);