> On Sep 13, 2024, at 1:28 AM, Chao Gao <chao.gao@xxxxxxxxx> wrote:
>
> On Thu, Sep 12, 2024 at 09:24:40AM -0700, Pawan Gupta wrote:
>> On Thu, Sep 12, 2024 at 03:44:38PM +0000, Jon Kohler wrote:
>>>> It is only worth implementing the long sequence in VMEXIT_ONLY mode if it is
>>>> significantly better than toggling the MSR.
>>>
>>> Thanks for the pointer! I hadn’t seen that second sequence. I’ll do measurements on
>>> three cases and come back with data from an SPR system.
>>> 1. as-is (wrmsr on entry and exit)
>>> 2. Short sequence (as a baseline)
>>> 3. Long sequence
>>

Pawan,

Thanks for the pointer to the long sequence. I've tested it along with
Listing 3 (TSX Abort sequence) using the KUT tscdeadline_immed test. The TSX
abort sequence performs best, except when BHI mitigation is off or when the
host and guest spec_ctrl values match, which avoids WRMSR toggling. Having
the values match the DIS_S value is easier said than done across a fleet
that is already using eIBRS heavily.

Test System:
- Intel Xeon Gold 6442Y, microcode 0x2b0005c0
- Linux 6.6.34 + patches, qemu 8.2
- KVM Unit Tests @ latest (17f6f2fd) with tscdeadline_immed + edits:
  - Toggle spec ctrl before test in main()
  - Use cpu type SapphireRapids-v2

Test string:
TESTNAME=vmexit_tscdeadline_immed TIMEOUT=90s MACHINE= ACCEL= taskset -c 26 ./x86/run x86/vmexit.flat \
  -smp 1 -cpu SapphireRapids-v2,+x2apic,+tsc-deadline -append tscdeadline_immed |grep tscdeadline

Test Results:
1. spectre_bhi=on, host spec_ctrl=1025, guest spec_ctrl=1:
   tscdeadline_immed 3878 (WRMSR toggling)
2. spectre_bhi=on, host spec_ctrl=1025, guest spec_ctrl=1025:
   tscdeadline_immed 3153 (no WRMSR toggling)
3. spectre_bhi=vmexit, BHB long sequence, host/guest spec_ctrl=1:
   tscdeadline_immed 3629 (still better than test 1, penalty only on exit)
4. spectre_bhi=vmexit, TSX abort sequence, host/guest spec_ctrl=1:
   tscdeadline_immed 3294 (best general purpose performance)
5. spectre_bhi=vmexit, TSX abort sequence, host spec_ctrl=1, guest spec_ctrl=1025:
   tscdeadline_immed 4011 (needs optimization)

In short, there is a significant speedup to be had here. As for test 5,
honestly, that is somewhat invalid because it would depend on the VMM user
space exposing BHI_CTRL. QEMU, as an example, does not do that, so even with
the latest qemu and the latest kernel, guests will still use the BHB loop
even on SPR++ today. They could use the TSX abort sequence with this
proposed change if the VMM exposes the RTM feature.

I'm happy to post a V2 patch with my TSX changes, or take any other
suggestions here.

Thanks all,
Jon

>> I wonder if virtual SPEC_CTRL feature introduced in below series can
>> provide speedup, as it can replace the MSR toggling with faster VMCS
>> operations:
>
> "virtual SPEC_CTRL" won't provide speedup. the wrmsr on entry/exit is still
> need if guest's (effective) value and host's value are different.
>
> "virtual SPEC_CTRL" just prevents guests from toggling some bits. It doesn't
> switch the MSR between guest value and host value on entry/exit. so, KVM has
> to do the switching with wrmsr/rdmsr instructions. A new feature, "load
> IA32_SPEC_CTRL" VMX control (refer to Chapter 15 in ISE spec[*]), can help but
> it isn't supported on SPR.
>
> [*]: https://cdrdv2.intel.com/v1/dl/getContent/671368
>
>>
>> https://lore.kernel.org/kvm/20240410143446.797262-1-chao.gao@intel.com/
>>
>> Adding Chao for their opinion.
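
For anyone who wants to compare the five tscdeadline_immed figures against the WRMSR-toggling case, here is a rough sketch of the arithmetic. It assumes the reported figure is a per-iteration cost where lower is better. The dictionary labels and the helper names (relative_change, summarize) are shorthand made up for this illustration and are not part of KVM Unit Tests or the patch.

```python
# Figures copied from the "Test Results" list in this mail.
# Labels are illustrative shorthand, not output of the test itself.
RESULTS = {
    "1: bhi=on, host=1025, guest=1 (WRMSR toggling)": 3878,
    "2: bhi=on, host=1025, guest=1025 (no toggling)": 3153,
    "3: bhi=vmexit, BHB long sequence, host/guest=1": 3629,
    "4: bhi=vmexit, TSX abort, host/guest=1": 3294,
    "5: bhi=vmexit, TSX abort, host=1, guest=1025": 4011,
}

# Test 1 is the as-is behaviour, so use it as the baseline.
BASELINE = "1: bhi=on, host=1025, guest=1 (WRMSR toggling)"


def relative_change(baseline, value):
    """Percent change versus baseline; negative means a lower figure."""
    return (value - baseline) / baseline * 100.0


def summarize(results, baseline_key):
    """Map each label to its percent change versus the baseline entry."""
    baseline = results[baseline_key]
    return {
        label: round(relative_change(baseline, value), 1)
        for label, value in results.items()
    }


if __name__ == "__main__":
    for label, pct in summarize(RESULTS, BASELINE).items():
        print(f"{label:50s} {pct:+6.1f}%")
```

By this arithmetic, test 4 (TSX abort sequence) comes in roughly 15% below test 1, test 3 (BHB long sequence) roughly 6% below, and test 5 roughly 3% above.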