On Mon, Oct 30, 2023, Xiaoyao Li wrote:
> On 10/25/2023 10:22 PM, Sean Christopherson wrote:
> > On Wed, Oct 25, 2023, Vitaly Kuznetsov wrote:
> > > Xiaoyao Li <xiaoyao.li@xxxxxxxxx> writes:
> > > > diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
> > > > index b8ab9ee5896c..388a3fdd3cad 100644
> > > > --- a/arch/x86/kernel/kvm.c
> > > > +++ b/arch/x86/kernel/kvm.c
> > > > @@ -65,6 +65,7 @@ static int __init parse_no_stealacc(char *arg)
> > > >  early_param("no-steal-acc", parse_no_stealacc);
> > > > +static DEFINE_PER_CPU_READ_MOSTLY(bool, async_pf_enabled);
> > >
> > > Would it make a difference if we replace this with a cpumask? I realize
> > > that we need to access it on all CPUs from hotpaths but this mask will
> > > rarely change so maybe there's no real performance hit?
> >
> > FWIW, I personally prefer per-CPU booleans from a readability perspective. I
> > doubt there is a meaningful performance difference for a bitmap vs. individual
> > booleans, the check is already gated by a static key, i.e. kernels that are NOT
> > running as KVM guests don't care.
>
> I agree with it.
>
> > Actually, if there's performance gains to be had, optimizing kvm_read_and_reset_apf_flags()
> > to read the "enabled" flag if and only if it's necessary is a more likely candidate.
> > Assuming the host isn't being malicious/stupid, then apf_reason.flags will be '0'
> > if PV async #PFs are disabled. The only question is whether or not apf_reason.flags
> > is predictable enough for the CPU.
> >
> > Aha! In practice, the CPU already needs to resolve a branch based on apf_reason.flags,
> > it's just "hidden" up in __kvm_handle_async_pf().
> >
> > If we really want to micro-optimize, provide an __always_inline inner helper so
> > that __kvm_handle_async_pf() doesn't need to make a CALL just to read the flags.
> > Then in the common case where a #PF isn't due to the host swapping out a page,
> > the paravirt happy path doesn't need a taken branch and never reads the enabled
> > variable. E.g. the below generates:
>
> If this is wanted. It can be a separate patch, irrelevant with this series,
> I think.

Yes, it's definitely beyond the scope of this series.
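
For reference, the reordering discussed above (check apf_reason.flags first,
read the "enabled" variable only when flags is non-zero) can be modeled in
userspace roughly as below. This is only a sketch, not kernel code: plain
globals stand in for the per-CPU variables, and the helper names, the read
counter and apf_demo() are all made up for illustration. It assumes GCC or
Clang for the always_inline attribute.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Userspace stand-ins for per-CPU state; all names here are illustrative. */
static uint32_t apf_flags;          /* models apf_reason.flags, written by the host */
static bool async_pf_enabled;       /* models the per-CPU bool from the patch */
static unsigned int enabled_reads;  /* instrumentation: reads of the "enabled" flag */

/* Hypothetical accessor so the demo can count reads of the enabled flag. */
static inline bool read_async_pf_enabled(void)
{
	enabled_reads++;
	return async_pf_enabled;
}

/*
 * Hypothetical inner helper: test flags first and touch "enabled" only when
 * flags is non-zero. Returns 0 if flags is 0 or if async #PF is disabled.
 */
static inline __attribute__((always_inline)) uint32_t read_and_reset_apf_flags_inner(void)
{
	uint32_t flags = apf_flags;

	if (flags == 0)
		return 0;	/* common case: ordinary #PF, "enabled" is never read */

	if (!read_async_pf_enabled())
		return 0;	/* non-zero flags while disabled: ignore, do not reset */

	apf_flags = 0;
	return flags;
}

/* Out-of-line wrapper for callers that do not need the inlined fast path. */
uint32_t read_and_reset_apf_flags(void)
{
	return read_and_reset_apf_flags_inner();
}

/* Exercises the three cases; returns how many times "enabled" was read. */
unsigned int apf_demo(void)
{
	enabled_reads = 0;

	async_pf_enabled = true;
	apf_flags = 0;
	assert(read_and_reset_apf_flags() == 0);	/* fast path */

	apf_flags = 1;
	assert(read_and_reset_apf_flags() == 1);	/* async #PF reported */
	assert(apf_flags == 0);				/* flags were reset */

	async_pf_enabled = false;
	apf_flags = 1;
	assert(read_and_reset_apf_flags() == 0);	/* disabled: flags ignored */

	return enabled_reads;
}
```

Calling apf_demo() from a main() runs the zero-flags, enabled and disabled
cases in turn. The counter makes it visible that the zero-flags case never
reads the enabled variable, which is the whole point of the reordering.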