On Fri, 2023-11-03 at 17:07 -0700, Sean Christopherson wrote:
> On Thu, Nov 02, 2023, Maxim Levitsky wrote:
> > On Wed, 2023-11-01 at 08:46 -0700, Sean Christopherson wrote:
> > > On Tue, Oct 31, 2023, Maxim Levitsky wrote:
> > > > On Thu, 2023-09-14 at 02:33 -0400, Yang Weijiang wrote:
> > > > > Use the governed feature framework to track whether X86_FEATURE_SHSTK
> > > > > and X86_FEATURE_IBT features can be used by userspace and guest, i.e.,
> > > > > the features can be used iff both KVM and guest CPUID can support them.
> > > > PS: IMHO the whole 'governed feature framework' is very confusing and
> > > > somewhat poorly documented.
> > > >
> > > > Currently the only partial explanation of it is at 'governed_features',
> > > > which doesn't explain how to use it.
> > >
> > > To be honest, terrible name aside, I thought kvm_governed_feature_check_and_set()
> > > would be fairly self-explanatory, at least relative to all the other CPUID handling
> > > in KVM.
> >
> > What is not self-explanatory is what the governed_features are and how to query them.
>
> ...
> > > > However, thinking again about the whole thing:
> > > > IMHO 'governed features' is another quite confusing term that a KVM
> > > > developer will need to learn and keep in memory.
> > >
> > > I 100% agree, but I explicitly called out the terrible name in the v1 and v2
> > > cover letters[1][2], and the patches were on the list for 6 months before I
> > > applied them. I'm definitely still open to a better name, but I'm also not
> > > exactly chomping at the bit to get behind the bikeshed.
> >
> > Honestly I don't know if I can come up with a better name either. The name is
> > IMHO not the underlying problem, it's the feature itself that is confusing.
>
> ...
>
> > > Yes and no. For "governed features", probably not. But for CPUID as a whole, there
> > > are legitimate cases where userspace needs to enumerate things that aren't officially
> > > "supported" by KVM. E.g. topology, core crystal frequency (CPUID 0x15), defeatures
> > > that KVM hasn't yet learned about, features that don't have virtualization controls
> > > and that KVM hasn't yet learned about, etc. And for things like Xen and Hyper-V paravirt
> > > features, it's very doable to implement features that are enumerated by CPUID fully
> > > in userspace, e.g. using MSR filters.
> > >
> > > But again, it's a moot point because KVM has (mostly) allowed userspace to fully
> > > control guest CPUID for a very long time.
> > > >
> > > > Such a feature, which is advertised as supported but not really working, is a
> > > > recipe for hard-to-find guest bugs IMHO.
> > > >
> > > > IMHO it would be much better to just check this condition and do
> > > > kvm_vm_bugged() or something when a feature is enabled in the guest
> > > > CPUID but KVM can't support it, and then just use guest CPUID in
> > > > 'guest_can_use()'.
> >
> > OK, I won't argue that much over this; however, I still think that there are
> > better ways to deal with it.
> >
> > If we put optimizations aside (all of this can surely be optimized so as to
> > have very little overhead):
> >
> > How about we have 2 CPUIDs: a guest-visible CPUID, which KVM will never use directly
> > other than during initialization, and an effective CPUID, which is roughly
> > what governed features are, but will include all features and will be initialized
> > roughly like governed features are initialized:
> >
> > effective_cpuid = guest_cpuid & kvm_supported_cpuid
> >
> > Except for some forced overrides like for XSAVES and such.
> >
> > Then we won't need to maintain a list of governed features, and guest_can_use()
> > for all features will just return the effective CPUID leafs.
> >
> > In other words, I want KVM to turn all known CPUID features into governed features,
> > and then remove all mentions of governed features except 'guest_can_use()',
> > which is a good API.
> >
> > Such a proposal will use a bit more memory but will make it easier for future
> > KVM developers to understand the code, and there will be less chance of introducing bugs.
>
> Hmm, two _full_ CPUID arrays would be a mess and completely unnecessary. E.g.
> we'd have to sort out Hyper-V and KVM PV, which both have their own caches. And
> a duplicate entry for things like F/M/S would be ridiculous.
>
> But maintaining a per-vCPU version of the CPU caps is definitely doable, i.e. a
> vCPU equivalent to kvm_cpu_caps and the per-CPU capabilities. There are currently
> 25 leafs that are tracked by kvm_cpu_caps, so relative to "governed" features,
> the cost will be 96 bytes per vCPU. I agree that 96 bytes is worth eating; we've
> certainly taken on more for a lot, lot less.
>
> It's a lot of churn, and there are some subtle nasties, e.g. MWAIT and other
> CPUID bits that change based on MSRs or CR4, but most of the churn is superficial
> and the result is waaaaay less ugly than governed features, and for the majority
> of features it will Just Work.
>
> I'll get a series posted next week (need to write changelogs and do a _lot_ more
> testing). If you want to take a peek at where I'm headed before then:
>
> https://github.com/sean-jc/linux x86/guest_cpufeatures

This looks very good; looking forward to seeing the patches on the mailing list.

Best regards,
	Maxim Levitsky
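[Editor's note: for anyone following along, below is a rough, self-contained sketch of the
"effective capabilities" scheme the thread converges on: a per-vCPU capability cache computed
as guest CPUID masked by what KVM itself supports, queried through a guest_can_use()-style
helper. It is illustrative only and is not KVM code; every name in it (vcpu_caps, NR_CAP_WORDS,
FEAT_*, vcpu_caps_init) is a placeholder invented for the example.]

/*
 * Sketch only, not KVM's implementation: models "effective = guest CPUID
 * AND'ed with KVM-supported" and a guest_can_use()-style query over it.
 */
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define NR_CAP_WORDS	25		/* one 32-bit word per tracked CPUID leaf */

/* Encode a feature as (word index << 5) | bit, cpufeatures.h style. */
#define FEAT(word, bit)	(((word) << 5) | (bit))
#define FEAT_WORD(f)	((f) >> 5)
#define FEAT_BIT(f)	((f) & 31)

/* Placeholder feature for the example: bit 7 of tracked word 3. */
#define FEAT_EXAMPLE_SHSTK	FEAT(3, 7)

struct vcpu_caps {
	uint32_t caps[NR_CAP_WORDS];	/* effective, per-vCPU capabilities */
};

/* effective = what the guest's CPUID enumerates AND what KVM can virtualize */
static void vcpu_caps_init(struct vcpu_caps *vc,
			   const uint32_t *guest_cpuid,	/* from userspace */
			   const uint32_t *kvm_caps)	/* KVM's own support */
{
	for (int w = 0; w < NR_CAP_WORDS; w++)
		vc->caps[w] = guest_cpuid[w] & kvm_caps[w];
}

static bool guest_can_use(const struct vcpu_caps *vc, unsigned int feat)
{
	return vc->caps[FEAT_WORD(feat)] & (1u << FEAT_BIT(feat));
}

int main(void)
{
	uint32_t guest_cpuid[NR_CAP_WORDS] = { 0 };
	uint32_t kvm_caps[NR_CAP_WORDS] = { 0 };
	struct vcpu_caps vc;

	/* Guest CPUID enumerates the feature, and KVM supports it... */
	guest_cpuid[FEAT_WORD(FEAT_EXAMPLE_SHSTK)] |= 1u << FEAT_BIT(FEAT_EXAMPLE_SHSTK);
	kvm_caps[FEAT_WORD(FEAT_EXAMPLE_SHSTK)]    |= 1u << FEAT_BIT(FEAT_EXAMPLE_SHSTK);

	vcpu_caps_init(&vc, guest_cpuid, kvm_caps);

	/* ...so the effective cache reports it as usable by the guest. */
	printf("guest_can_use(SHSTK) = %d\n", guest_can_use(&vc, FEAT_EXAMPLE_SHSTK));
	return 0;
}

The point of doing the masking once at initialization, as discussed above, is that a runtime
query never has to re-check KVM support: the AND has already folded it into the per-vCPU cache.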