Re: [PATCH v2 26/49] KVM: x86: Add a macro to init CPUID features that KVM emulates in software

Sean Christopherson <seanjc@xxxxxxxxxx> · Mon, 8 Jul 2024 15:30:57 -0700

On Thu, Jul 04, 2024, Maxim Levitsky wrote:
> On Fri, 2024-05-17 at 10:39 -0700, Sean Christopherson wrote:
> > Now that kvm_cpu_cap_init() is a macro with its own scope, add EMUL_F() to
> > OR-in features that KVM emulates in software, i.e. that don't depend on
> > the feature being available in hardware.  The contained scope
> > of kvm_cpu_cap_init() allows using a local variable to track the set of
> > emulated leaves, which in addition to avoiding confusing and/or
> > unnecessary variables, helps prevent misuse of EMUL_F().
> > 
> > Signed-off-by: Sean Christopherson <seanjc@xxxxxxxxxx>
> > ---
> >  arch/x86/kvm/cpuid.c | 36 +++++++++++++++++++++---------------
> >  1 file changed, 21 insertions(+), 15 deletions(-)
> > 
> > diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
> > index 1064e4d68718..33e3e77de1b7 100644
> > --- a/arch/x86/kvm/cpuid.c
> > +++ b/arch/x86/kvm/cpuid.c
> > @@ -94,6 +94,16 @@ u32 xstate_required_size(u64 xstate_bv, bool compacted)
> >  	F(name);						\
> >  })
> >  
> > +/*
> > + * Emulated Feature - For features that KVM emulates in software irrespective
> > + * of host CPU/kernel support.
> > + */
> > +#define EMUL_F(name)						\
> > +({								\
> > +	kvm_cpu_cap_emulated |= F(name);			\
> > +	F(name);						\
> > +})
> 
> To me it feels more and more that this patch series doesn't go into the right
> direction.
> 
> How about we just abandon the whole concept of masks and instead just have a
> list of statements
> 
> Pretty much the opposite of the patch series I confess:

FWIW, I think it's actually largely the same code under the hood.  The code for
each concept/flavor ends up being very similar, it's mostly just handling the
bitwise-OR in the callers vs. in the helpers.

> #define CAP_PASSTHOUGH		0x01
> #define CAP_EMULATED		0x02
> #define CAP_AMD_ALIASED		0x04 // for AMD aliased features
> #define CAP_SCATTERED		0x08
> #define CAP_X86_64		0x10 // supported only on 64 bit hypervisors
> ...
> 
> 
> /* CPUID_1_ECX*/
> 
> 				/* TMA is not passed though because: xyz*/
> kvm_cpu_cap_init(TMA,           0);
> 
> kvm_cpu_cap_init(SSSE3,         CAP_PASSTHOUGH);
> 				/* CNXT_ID is not passed though because: xyz*/
> kvm_cpu_cap_init(CNXT_ID,       0);
> kvm_cpu_cap_init(RESERVED,      0);
> kvm_cpu_cap_init(FMA,           CAP_PASSTHOUGH);
> ...
> 				/* KVM always emulates TSC_ADJUST */
> kvm_cpu_cap_init(TSC_ADJUST,    CAP_EMULATED | CAP_SCATTERED);
> 
> ...
> 
> /* CPUID_D_1_EAX*/
> 				/* XFD is disabled on 32 bit systems because: xyz*/
> kvm_cpu_cap_init(XFD, 		CAP_PASSTHOUGH | CAP_X86_64)
> 
> 
> 'kvm_cpu_cap_init' can be a macro if needed to have the compile checks.
> 
> There are several advantages to this:
> 
> - more readability, plus if needed each statement can be amended with a comment.
> - No weird hacks in 'F*' macros, which additionally eventually evaluate into a bit,
>   which is confusing.
>   In fact no need to even have them at all.
> - No need to verify that bitmask belongs to a feature word.

Yes, but the downside is that there is no enforcement of features in a word being
bundled together.

> - Merge friendly - each capability has its own line.

That's almost entirely convention though.  Other than inertia, nothing is stopping
us from doing:

	kvm_cpu_cap_init(CPUID_12_EAX,
		SF(SGX1) |
		SF(SGX2) |
		SF(SGX_EDECCSSA)
	);

I don't see a clean way of avoiding the addition of " |" on the last existing
line, but in practice I highly doubt that will ever be a source of meaningful pain.

Same goes for the point about adding comments.  We could do that with either
approach, we just don't do so today.

> Disadvantages:
> 
> - Longer list - IMHO not a problem, since it is very easy to read / search
>   and can have as much comments as needed.
>   For example this is how the kernel lists the CPUID features and this list IMHO
>   is very manageable.

There's one big difference: KVM would need to have a line for every feature that
KVM _doesn't_ support.  For densely populated words, that's not a huge issue,
but it's problematic for sparsely populated words, e.g. CPUID_12_EAX would have
29 reserved/unsupport entries, which IMO ends up being a big net negative for
code readability and ongoing maintenance.

We could avoid that cost (and the danger of a missed bit) by collecting the set
of features that have been initialized for each word, and then masking off the
uninitialized/unsupported at the end.  But then we're back to the bitwise-OR and
mask logic.

And while I agree that having the F*() macros set state _and_ evaulate to a bit
is imperfect, it does have its advantages.  E.g. to avoid evaluating to a value,
we could have F() modify a local variable that is scoped to kvm_cpu_cap_init(),
a las kvm_cpu_cap_emulated.  But then we'd need explicit code and/or comments
to call out that VMM_F() and the like intentionally don't set kvm_cpu_cap_supported,
whereas evualating to a value is a relatively self-documenting "0;".

> - Slower - kvm_set_cpu_caps is called exactly once per KVM module load, thus
>   performance is the last thing I would care about in this function.
> 
> Another note about this patch: It is somewhat confusing that EMUL_F just
> forces a feature in kvm caps, regardless of CPU support, because KVM also has
> KVM_GET_EMULATED_CPUID and it has a different meaning.

Yeah, but IMO that's a problem with KVM_GET_EMULATED_CPUID being poorly defined.

> Users can easily confuse the EMUL_F for something that sets a feature bit in
> the KVM_GET_EMULATED_CPUID.

I'll see if I can find a good spot for a comment to try and convenient