On Tue, Jun 23, 2020 at 09:21:28AM +0800, Xiaoyao Li wrote: > On 6/23/2020 8:51 AM, Sean Christopherson wrote: > >Remove support for context switching between the guest's and host's > >desired UMWAIT_CONTROL. Propagating the guest's value to hardware isn't > >required for correct functionality, e.g. KVM intercepts reads and writes > >to the MSR, and the latency effects of the settings controlled by the > >MSR are not architecturally visible. > > > >As a general rule, KVM should not allow the guest to control power > >management settings unless explicitly enabled by userspace, e.g. see > >KVM_CAP_X86_DISABLE_EXITS. E.g. Intel's SDM explicitly states that C0.2 > >can improve the performance of SMT siblings. A devious guest could > >disable C0.2 so as to improve the performance of their workloads at the > >detriment to workloads running in the host or on other VMs. > > > >Wholesale removal of UMWAIT_CONTROL context switching also fixes a race > >condition where updates from the host may cause KVM to enter the guest > >with the incorrect value. Because updates are are propagated to all > >CPUs via IPI (SMP function callback), the value in hardware may be > >stale with respect to the cached value and KVM could enter the guest > >with the wrong value in hardware. As above, the guest can't observe the > >bad value, but it's a weird and confusing wart in the implementation. > > > >Removal also fixes the unnecessary usage of VMX's atomic load/store MSR > >lists. Using the lists is only necessary for MSRs that are required for > >correct functionality immediately upon VM-Enter/VM-Exit, e.g. EFER on > >old hardware, or for MSRs that need to-the-uop precision, e.g. perf > >related MSRs. For UMWAIT_CONTROL, the effects are only visible in the > >kernel via TPAUSE/delay(), and KVM doesn't do any form of delay in > >vcpu_vmx_run(). > > >Using the atomic lists is undesirable as they are more > >expensive than direct RDMSR/WRMSR. > > Do you mean the extra handling of atomic list facility in kvm? Or just mean > vm-exit/-entry MSR-load/save in VMX hardware is expensive than direct > RDMSR/WRMSR instruction? Both. The KVM handling is the bigger cost, e.g. requires two VMWRITEs to update the list counts, on top of the list processing. The actual ucode cost is also somewhat expensive if adding an MSR to the list causes the load/store lists to be activated, e.g. on top of the memory accesses for the list, VM-Enter ucode needs to do its consistency checks. Expensive is obviously relative, but as far as the lists are concerned it's an easy penalty to avoid.