Re: [PATCH 2/3] Fix nested VMX TSC emulation

Comments inline.  Sorry for top-posting.  Gmail is not my normal mode
of LKML processing, but hey.

On Tue, Aug 2, 2011 at 5:54 AM, Nadav Har'El <nyh@xxxxxxxxxx> wrote:
> This patch fixes two corner cases in nested (L2) handling of TSC-related
> issues:
>
> 1. Somewhat surprisingly, according to the Intel spec, if L1 allows WRMSR to
> the TSC MSR without an exit, then this should set L1's TSC value itself - not
> offset by vmcs12.TSC_OFFSET (like was wrongly done in the previous code).
>
> 2. Allow L1 to disable the TSC_OFFSETING control, and then correctly ignore
> the vmcs12.TSC_OFFSET.
>
> Signed-off-by: Nadav Har'El <nyh@xxxxxxxxxx>
> ---
>  arch/x86/kvm/vmx.c |   31 +++++++++++++++++++++----------
>  1 file changed, 21 insertions(+), 10 deletions(-)
>
> --- .before/arch/x86/kvm/vmx.c  2011-08-02 15:51:02.000000000 +0300
> +++ .after/arch/x86/kvm/vmx.c   2011-08-02 15:51:02.000000000 +0300
> @@ -1777,15 +1777,23 @@ static void vmx_set_tsc_khz(struct kvm_v
>  */
>  static void vmx_write_tsc_offset(struct kvm_vcpu *vcpu, u64 offset)
>  {
> -       vmcs_write64(TSC_OFFSET, offset);
> -       if (is_guest_mode(vcpu))
> +       if (is_guest_mode(vcpu)) {
>                /*
> -                * We're here if L1 chose not to trap the TSC MSR. Since
> -                * prepare_vmcs12() does not copy tsc_offset, we need to also
> -                * set the vmcs12 field here.
> +                * We're here if L1 chose not to trap WRMSR to TSC. According
> +                * to the spec, this should set L1's TSC; The offset that L1
> +                * set for L2 remains unchanged, and still needs to be added
> +                * to the newly set TSC to get L2's TSC.
>                 */
> -               get_vmcs12(vcpu)->tsc_offset = offset -
> -                       to_vmx(vcpu)->nested.vmcs01_tsc_offset;
> +               struct vmcs12 *vmcs12;
> +               to_vmx(vcpu)->nested.vmcs01_tsc_offset = offset;
> +               /* recalculate vmcs02.TSC_OFFSET: */
> +               vmcs12 = get_vmcs12(vcpu);
> +               vmcs_write64(TSC_OFFSET, offset +
> +                       (nested_cpu_has(vmcs12, CPU_BASED_USE_TSC_OFFSETING) ?
> +                        vmcs12->tsc_offset : 0));
> +       } else {
> +               vmcs_write64(TSC_OFFSET, offset);
> +       }
>  }

This part looks good.
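
The arithmetic in this hunk can be sketched as a standalone model in
plain C. This is only an illustration under stated assumptions:
struct nested_tsc_model and the model_* functions are made-up names,
not kernel code, and hw_tsc_offset merely stands in for the TSC_OFFSET
field of whichever VMCS is currently loaded.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the nested state; not the kernel's structs. */
struct nested_tsc_model {
	uint64_t vmcs01_tsc_offset;   /* offset L0 applies on behalf of L1 */
	uint64_t vmcs12_tsc_offset;   /* offset L1 requested for L2 */
	bool l1_uses_tsc_offsetting;  /* models CPU_BASED_USE_TSC_OFFSETING in vmcs12 */
	bool guest_mode;              /* true while L2 is running */
	uint64_t hw_tsc_offset;       /* models TSC_OFFSET in the loaded VMCS */
};

/* Offset the hardware should use while L2 runs. */
static uint64_t model_vmcs02_offset(const struct nested_tsc_model *m)
{
	return m->vmcs01_tsc_offset +
	       (m->l1_uses_tsc_offsetting ? m->vmcs12_tsc_offset : 0);
}

/* Models a write of a new L1 offset, e.g. after an untrapped WRMSR to TSC. */
static void model_write_tsc_offset(struct nested_tsc_model *m, uint64_t offset)
{
	assert(m != NULL);
	if (m->guest_mode) {
		/* The write targets L1's TSC; L1's offset for L2 is untouched. */
		m->vmcs01_tsc_offset = offset;
		m->hw_tsc_offset = model_vmcs02_offset(m);
	} else {
		m->hw_tsc_offset = offset;
	}
}
```

With an L1 offset of 100 and a vmcs12 offset of 40, a write of 500
while L2 runs leaves the vmcs12 offset at 40 and puts 540 into the
hardware field. If L1 has TSC offsetting disabled, the hardware field
simply receives the new L1 offset.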

>  static void vmx_adjust_tsc_offset(struct kvm_vcpu *vcpu, s64 adjustment)
> @@ -6529,8 +6537,11 @@ static void prepare_vmcs02(struct kvm_vc
>
>        set_cr4_guest_host_mask(vmx);
>
> -       vmcs_write64(TSC_OFFSET,
> -               vmx->nested.vmcs01_tsc_offset + vmcs12->tsc_offset);
> +       if (vmcs12->cpu_based_vm_exec_control & CPU_BASED_USE_TSC_OFFSETING)
> +               vmcs_write64(TSC_OFFSET,
> +                       vmx->nested.vmcs01_tsc_offset + vmcs12->tsc_offset);
> +       else
> +               vmcs_write64(TSC_OFFSET, vmx->nested.vmcs01_tsc_offset);

I need more context here... where do you apply the adjustment?

The adjustment should be accumulated into vmcs01_tsc_offset only. It
must also be written into the hardware VMCS so that the running guest
sees it, but that hardware value should not be preserved when the
nested guest exits.

>
>        if (enable_vpid) {
>                /*
> @@ -6937,7 +6948,7 @@ static void nested_vmx_vmexit(struct kvm
>
>        load_vmcs12_host_state(vcpu, vmcs12);
>
> -       /* Update TSC_OFFSET if vmx_adjust_tsc_offset() was used while L2 ran */
> +       /* Update TSC_OFFSET if TSC was changed while L2 ran */
>        vmcs_write64(TSC_OFFSET, vmx->nested.vmcs01_tsc_offset);
>
>        /* This is needed for same reason as it was needed in prepare_vmcs02 */
>

This is correct.  You should always restore the L1 offset when exiting
if it might have changed.  This implies also that you must update
vmx->nested.vmcs01_tsc_offset if you receive a call to
vmx_adjust_tsc_offset while L2 is running, which is why I wanted to
see more context above.
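
The invariant described here can be sketched as a small standalone
model. The names below (struct tsc_adjust_model and the model_*
functions) are hypothetical and chosen for illustration; this is not
the kernel's vmx_adjust_tsc_offset() or nested_vmx_vmexit(), only a
model of the bookkeeping they would need to perform.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the state involved; not the kernel's structs. */
struct tsc_adjust_model {
	uint64_t vmcs01_tsc_offset;  /* L1's offset, tracked in software */
	uint64_t hw_tsc_offset;      /* models TSC_OFFSET in the loaded VMCS */
	bool guest_mode;             /* true while L2 is running */
};

/* Models an adjustment request that may arrive while L1 or L2 runs. */
static void model_adjust_tsc_offset(struct tsc_adjust_model *m,
				    int64_t adjustment)
{
	/* The loaded VMCS always changes so the running guest sees it. */
	m->hw_tsc_offset += (uint64_t)adjustment;

	/* While L2 runs, L1's saved offset must pick up the change too. */
	if (m->guest_mode)
		m->vmcs01_tsc_offset += (uint64_t)adjustment;
}

/* Models the L2 -> L1 exit: the L2 hardware value is discarded. */
static void model_nested_exit(struct tsc_adjust_model *m)
{
	assert(m->guest_mode);
	m->guest_mode = false;
	m->hw_tsc_offset = m->vmcs01_tsc_offset;
}
```

Starting from an L1 offset of 1000 with 1050 in the hardware field
(L2 running with an offset of 50), an adjustment of -200 gives 850 in
hardware and 800 for L1, and the exit then loads 800. Without the
guest_mode branch, the exit would restore the stale value of 1000.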

Zach