> On Mar 25, 2022, at 8:15 PM, Sean Christopherson <seanjc@xxxxxxxxxx> wrote:
>
> On Wed, Mar 23, 2022, Jon Kohler wrote:
>> kvm_load_{guest|host}_xsave_state handles xsave on vm entry and exit,
>> part of which is managing memory protection key state. The latest
>> arch.pkru is updated with a rdpkru, and if that doesn't match the base
>> host_pkru (which is the case about 70% of the time), we issue a
>> __write_pkru.
>>
>> To improve performance, implement the following optimizations:
>> 1. Reorder if conditions prior to wrpkru in both
>> kvm_load_{guest|host}_xsave_state.
>>
>> Flip the ordering of the || condition so that XFEATURE_MASK_PKRU is
>> checked first, which, when instrumented in our environment, appeared
>> to be always true and is less overall work than kvm_read_cr4_bits.
>
> If it's always true, then it should be checked last, not first. And if

Sean, thanks for the review. This would be a left-handed || short
circuit, so wouldn't we want the always-true condition to be first?

> kvm_read_cr4_bits() is more expensive then we should address that issue,
> not try to micro-optimize this stuff. X86_CR4_PKE can't be guest-owned,
> and so kvm_read_cr4_bits() should be optimized down to:
>
>	return vcpu->arch.cr4 & X86_CR4_PKE;

Ok, good tip, thank you. I'll dig into that a bit more and see what I
can find. To be fair, kvm_read_cr4_bits isn't super expensive, it's
just more expensive than the code I proposed. That said, I'll see what
I can do to simplify this.

> If the compiler isn't smart enough to do that on its own then we should
> rework kvm_read_cr4_bits() as that will benefit multiple code paths.
>
>> For kvm_load_guest_xsave_state, hoist arch.pkru != host_pkru ahead
>> one position. When instrumented, I saw this be true roughly ~70% of
>> the time vs the other conditions, which were almost always true.
>> With this change, we will avoid the 3rd condition check ~30% of the
>> time.
>
> If the guest uses PKRU... If PKRU is used by the host but not the
> guest, the early comparison is pure overhead because it will almost
> always be true (guest will be zero, host will be non-zero),

In a multi-tenant environment with a variety of hosts and
customer-controlled guests, we've seen this be a mixed bag. I'd be ok
moving this back to the way it is currently; I can do that in the next
revision to simplify this.

>> 2. Wrap PKU sections with CONFIG_X86_INTEL_MEMORY_PROTECTION_KEYS,
>> as if the user compiles out this feature, we should not have
>> these branches at all.
>
> Not that it really matters, since static_cpu_has() will patch out all
> the branches, and in practice who cares about a JMP or NOP(s)? But...

The reason I've been pursuing this is that the guest+host xsave adds up
to a bit over ~1% as measured by perf top in an exit-heavy workload.
This is the first of a few patches we've drummed up to get it back
towards zero. I'll send the rest out next week.

> #ifdeffery is the wrong way to handle this. Replace static_cpu_has()
> with cpu_feature_enabled(); that'll boil down to a '0' and omit all the
> code when CONFIG_X86_INTEL_MEMORY_PROTECTION_KEYS=n, without the #ifdef
> ugliness.

Great idea, thanks. I'll tune this up and send a v2 patch.
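For concreteness, roughly what I have in mind for the guest path in v2
(untested sketch only; the final ordering of the pkru/xcr0/CR4 checks is
still up in the air per the discussion above):

	/*
	 * cpu_feature_enabled() compiles to a constant 0 when
	 * CONFIG_X86_INTEL_MEMORY_PROTECTION_KEYS=n, so the whole
	 * block should be dead-code eliminated without any #ifdef.
	 */
	if (cpu_feature_enabled(X86_FEATURE_PKU) &&
	    vcpu->arch.pkru != vcpu->arch.host_pkru &&
	    ((vcpu->arch.xcr0 & XFEATURE_MASK_PKRU) ||
	     kvm_read_cr4_bits(vcpu, X86_CR4_PKE)))
		write_pkru(vcpu->arch.pkru);
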
>
>> Signed-off-by: Jon Kohler <jon@xxxxxxxxxxx>
>> ---
>>  arch/x86/kvm/x86.c | 14 +++++++++-----
>>  1 file changed, 9 insertions(+), 5 deletions(-)
>>
>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>> index 6db3a506b402..2b00123a5d50 100644
>> --- a/arch/x86/kvm/x86.c
>> +++ b/arch/x86/kvm/x86.c
>> @@ -950,11 +950,13 @@ void kvm_load_guest_xsave_state(struct kvm_vcpu *vcpu)
>>  			wrmsrl(MSR_IA32_XSS, vcpu->arch.ia32_xss);
>>  	}
>>
>> +#ifdef CONFIG_X86_INTEL_MEMORY_PROTECTION_KEYS
>>  	if (static_cpu_has(X86_FEATURE_PKU) &&
>> -	    (kvm_read_cr4_bits(vcpu, X86_CR4_PKE) ||
>> -	     (vcpu->arch.xcr0 & XFEATURE_MASK_PKRU)) &&
>> -	    vcpu->arch.pkru != vcpu->arch.host_pkru)
>> +	    vcpu->arch.pkru != vcpu->arch.host_pkru &&
>> +	    ((vcpu->arch.xcr0 & XFEATURE_MASK_PKRU) ||
>> +	     kvm_read_cr4_bits(vcpu, X86_CR4_PKE)))
>>  		write_pkru(vcpu->arch.pkru);
>> +#endif /* CONFIG_X86_INTEL_MEMORY_PROTECTION_KEYS */
>>  }
>>  EXPORT_SYMBOL_GPL(kvm_load_guest_xsave_state);
>>
>> @@ -963,13 +965,15 @@ void kvm_load_host_xsave_state(struct kvm_vcpu *vcpu)
>>  	if (vcpu->arch.guest_state_protected)
>>  		return;
>>
>> +#ifdef CONFIG_X86_INTEL_MEMORY_PROTECTION_KEYS
>>  	if (static_cpu_has(X86_FEATURE_PKU) &&
>> -	    (kvm_read_cr4_bits(vcpu, X86_CR4_PKE) ||
>> -	     (vcpu->arch.xcr0 & XFEATURE_MASK_PKRU))) {
>> +	    ((vcpu->arch.xcr0 & XFEATURE_MASK_PKRU) ||
>> +	     kvm_read_cr4_bits(vcpu, X86_CR4_PKE))) {
>>  		vcpu->arch.pkru = rdpkru();
>>  		if (vcpu->arch.pkru != vcpu->arch.host_pkru)
>>  			write_pkru(vcpu->arch.host_pkru);
>>  	}
>> +#endif /* CONFIG_X86_INTEL_MEMORY_PROTECTION_KEYS */
>>
>>  	if (kvm_read_cr4_bits(vcpu, X86_CR4_OSXSAVE)) {
>>
>> --
>> 2.30.1 (Apple Git-130)
>>
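P.S. Circling back on the kvm_read_cr4_bits() point: if I'm reading
kvm_cache_regs.h right, the helper looks roughly like the below (sketch
from memory, not verbatim from the tree):

	static inline ulong kvm_read_cr4_bits(struct kvm_vcpu *vcpu,
					      ulong mask)
	{
		ulong tmask = mask & KVM_POSSIBLE_CR4_GUEST_BITS;

		/* Only guest-owned bits can be stale in arch.cr4 */
		if ((tmask & vcpu->arch.cr4_guest_owned_bits) &&
		    !kvm_register_is_available(vcpu, VCPU_EXREG_CR4))
			static_call(kvm_x86_cache_reg)(vcpu, VCPU_EXREG_CR4);

		return vcpu->arch.cr4 & mask;
	}

Since X86_CR4_PKE can't be guest-owned (so it shouldn't be in
KVM_POSSIBLE_CR4_GUEST_BITS), tmask should be a compile-time zero for a
constant X86_CR4_PKE mask, the branch should fold away, and we'd be left
with exactly the vcpu->arch.cr4 & X86_CR4_PKE load you described. I'll
double check the generated code before sending v2.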