On Mon, 2021-05-24 at 18:44 +0000, Sean Christopherson wrote:
> On Mon, May 24, 2021, Maxim Levitsky wrote:
> > On Fri, 2021-05-21 at 11:24 +0100, Ilias Stamatis wrote:
> > > diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> > > index 4b70431c2edd..7c52c697cfe3 100644
> > > --- a/arch/x86/kvm/vmx/vmx.c
> > > +++ b/arch/x86/kvm/vmx/vmx.c
> > > @@ -1392,9 +1392,8 @@ void vmx_vcpu_load_vmcs(struct kvm_vcpu *vcpu, int cpu,
> > >  	}
> > >
> > >  	/* Setup TSC multiplier */
> > > -	if (kvm_has_tsc_control &&
> > > -	    vmx->current_tsc_ratio != vcpu->arch.tsc_scaling_ratio)
> > > -		decache_tsc_multiplier(vmx);
> > > +	if (kvm_has_tsc_control)
> > > +		vmcs_write64(TSC_MULTIPLIER, vcpu->arch.tsc_scaling_ratio);
> >
> > This might have an overhead of writing the TSC scaling ratio even if
> > it is unchanged. I haven't measured how expensive vmread/vmwrites are but
> > at least when nested, the vmreads/vmwrites can be very expensive (if they
> > cause a vmexit).
> >
> > This is why I think the 'vmx->current_tsc_ratio' exists - to have
> > a cached value of TSC scale ratio to avoid either 'vmread'ing
> > or 'vmwrite'ing it without a need.

Right. I thought the overhead might not be that significant since we're
doing lots of vmwrites on vmentry/vmexit anyway, but yeah, why introduce
any kind of extra overhead anyway.

I'm fine with this particular patch getting dropped. It's not directly
related to the series anyway.

>
> Yes, but its existence is a complete hack. vmx->current_tsc_ratio has the same
> scope as vcpu->arch.tsc_scaling_ratio, i.e. vmx == vcpu == vcpu->arch. Unlike
> per-VMCS tracking, it should not be useful, keyword "should".
>
> What I meant by my earlier comment:
>
>   Its use in vmx_vcpu_load_vmcs() is basically "write the VMCS if we forgot to
>   earlier", which is all kinds of wrong.
>
> is that vmx_vcpu_load_vmcs() should never write vmcs.TSC_MULTIPLIER. The correct
> behavior is to set the field at VMCS initialization, and then immediately set it
> whenever the ratio is changed, e.g. on nested transition, from userspace, etc...
> In other words, my unclear feedback was to make it obsolete (and drop it) by
> fixing the underlying mess, not to just drop the optimization hack.

I understood this and replied earlier. The right place for the hw
multiplier field to be updated is inside set_tsc_khz() in common code
when the ratio changes. However, this requires adding another vendor
callback etc. As all this is further refactoring I believe it's better
to leave this series as is - ie only touching code that is directly
related to nested TSC scaling and not try to do everything as part of
the same series. This makes testing easier too. We can still implement
these changes later.

Thanks,
Ilias
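
For illustration, the three options discussed in this thread (the cached
compare in the existing code, the unconditional write in the patch, and
writing the field only where the ratio changes) can be modelled in plain
userspace C. This is only a sketch under stated assumptions: every model_*
name, nr_vmwrites and hw_tsc_multiplier are hypothetical stand-ins and not
KVM code, and the counter exists purely to make redundant writes visible.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the hardware field and a vmwrite counter. */
static uint64_t hw_tsc_multiplier;
static unsigned int nr_vmwrites;

/* Models a vmcs_write64() to the TSC multiplier field (assumption: one field only). */
static void model_vmcs_write64(uint64_t val)
{
	hw_tsc_multiplier = val;
	nr_vmwrites++;
}

struct model_vcpu {
	uint64_t tsc_scaling_ratio;	/* models vcpu->arch.tsc_scaling_ratio */
	uint64_t current_tsc_ratio;	/* models the vmx->current_tsc_ratio cache */
};

/* Existing code: the load path writes only if the cached value is stale. */
static void model_load_cached(struct model_vcpu *v)
{
	if (v->current_tsc_ratio != v->tsc_scaling_ratio) {
		v->current_tsc_ratio = v->tsc_scaling_ratio;
		model_vmcs_write64(v->current_tsc_ratio);
	}
}

/* The patch under review: the load path writes on every call. */
static void model_load_unconditional(struct model_vcpu *v)
{
	model_vmcs_write64(v->tsc_scaling_ratio);
}

/*
 * The suggested end state: the field is written at the point where the
 * ratio changes, so the load path never has to touch it.
 */
static void model_set_ratio(struct model_vcpu *v, uint64_t ratio)
{
	v->tsc_scaling_ratio = ratio;
	model_vmcs_write64(ratio);
}
```

In this model, repeated loads with an unchanged ratio cost nothing with the
cached compare, one write per load with the unconditional variant, and
nothing with write-on-change, which also needs no second copy of the ratio.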