On Fri, 2023-11-03 at 17:07 -0700, Sean Christopherson wrote:
> On Thu, Nov 02, 2023, Maxim Levitsky wrote:
> > On Wed, 2023-11-01 at 08:46 -0700, Sean Christopherson wrote:
> > > On Tue, Oct 31, 2023, Maxim Levitsky wrote:
> > > > On Thu, 2023-09-14 at 02:33 -0400, Yang Weijiang wrote:
> > > > > Use the governed feature framework to track whether X86_FEATURE_SHSTK
> > > > > and X86_FEATURE_IBT features can be used by userspace and guest, i.e.,
> > > > > the features can be used iff both KVM and guest CPUID can support them.
> > > > PS: IMHO the whole 'governed feature framework' is very confusing and
> > > > somewhat poorly documented.
> > > >
> > > > Currently the only partial explanation of it is at 'governed_features',
> > > > which doesn't explain how to use it.
> > >
> > > To be honest, terrible name aside, I thought kvm_governed_feature_check_and_set()
> > > would be fairly self-explanatory, at least relative to all the other CPUID handling
> > > in KVM.
> >
> > What is not self-explanatory is what the governed_features are and how to query them.
>
> ...
> > > > However, thinking again about the whole thing:
> > > > IMHO 'governed features' is another quite confusing term that a KVM
> > > > developer will need to learn and keep in memory.
> > >
> > > I 100% agree, but I explicitly called out the terrible name in the v1 and v2
> > > cover letters[1][2], and the patches were on the list for 6 months before I
> > > applied them. I'm definitely still open to a better name, but I'm also not
> > > exactly chomping at the bit to get behind the bikeshed.
> >
> > Honestly I don't know if I can come up with a better name either. The name is
> > IMHO not the underlying problem, it's the feature itself that is confusing.
>
> ...
>
> > > Yes and no. For "governed features", probably not. But for CPUID as a whole, there
> > > are legitimate cases where userspace needs to enumerate things that aren't officially
> > > "supported" by KVM. E.g. topology, core crystal frequency (CPUID 0x15), defeatures
> > > that KVM hasn't yet learned about, features that don't have virtualization controls
> > > and that KVM hasn't yet learned about, etc. And for things like Xen and Hyper-V paravirt
> > > features, it's very doable to implement features that are enumerated by CPUID fully
> > > in userspace, e.g. using MSR filters.
> > >
> > > But again, it's a moot point because KVM has (mostly) allowed userspace to fully
> > > control guest CPUID for a very long time.
> > > >
> > > > Such a feature, which is advertised as supported but not really working, is a
> > > > recipe for hard-to-find guest bugs IMHO.
> > > >
> > > > IMHO it would be much better to just check this condition and do
> > > > kvm_vm_bugged() or something when a feature is enabled in the guest
> > > > CPUID but KVM can't support it, and then just use guest CPUID in
> > > > 'guest_can_use()'.
> >
> > OK, I won't argue that much over this; however, I still think that there are
> > better ways to deal with it.
> >
> > If we put optimizations aside (all of this can surely be optimized so as to
> > have very little overhead):
> >
> > How about we have 2 CPUIDs: a guest-visible CPUID, which KVM will never use directly
> > other than during initialization, and an effective CPUID, which is roughly
> > what governed features are, but will include all features and will be initialized
> > roughly like governed features are initialized:
> >
> > effective_cpuid = guest_cpuid & kvm_supported_cpuid
> >
> > Except for some forced overrides like for XSAVES and such.
> >
> > Then we won't need to maintain a list of governed features, and guest_can_use()
> > for all features will just return the effective CPUID leafs.
> >
> > In other words, I want KVM to turn all known CPUID features into governed features,
> > and then remove all mentions of governed features except 'guest_can_use()',
> > which is a good API.
> >
> > Such a proposal will use a bit more memory but will make it easier for future
> > KVM developers to understand the code, and there will be less chance of introducing bugs.
>
> Hmm, two _full_ CPUID arrays would be a mess and completely unnecessary. E.g.
> we'd have to sort out Hyper-V and KVM PV, which both have their own caches. And
> a duplicate entry for things like F/M/S would be ridiculous.
>
> But maintaining a per-vCPU version of the CPU caps is definitely doable, i.e. a
> vCPU equivalent to kvm_cpu_caps and the per-CPU capabilities. There are currently
> 25 leafs that are tracked by kvm_cpu_caps, so relative to "governed" features,
> the cost will be 96 bytes per vCPU. I agree that 96 bytes is worth eating; we've
> certainly taken on more for a lot, lot less.
>
> It's a lot of churn, and there are some subtle nasties, e.g. MWAIT and other
> CPUID bits that change based on MSRs or CR4, but most of the churn is superficial
> and the result is waaaaay less ugly than governed features, and for the majority
> of features it will Just Work.
>
> I'll get a series posted next week (need to write changelogs and do a _lot_ more
> testing). If you want to take a peek at where I'm headed before then:
>
> https://github.com/sean-jc/linux x86/guest_cpufeatures

This looks very good; looking forward to seeing the patches on the mailing list.

Best regards,
	Maxim Levitsky
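[Editor's note: for anyone following along, below is a rough, self-contained sketch of the
"effective capabilities" scheme the thread converges on: a per-vCPU capability cache computed
as guest CPUID masked by what KVM itself supports, queried through a guest_can_use()-style
helper. It is illustrative only and is not KVM code; every name in it (vcpu_caps, NR_CAP_WORDS,
FEAT_*, vcpu_caps_init) is a placeholder invented for the example.]

/*
 * Sketch only, not KVM's implementation: models "effective = guest CPUID
 * AND'ed with KVM-supported" and a guest_can_use()-style query over it.
 */
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define NR_CAP_WORDS	25		/* one 32-bit word per tracked CPUID leaf */

/* Encode a feature as (word index << 5) | bit, cpufeatures.h style. */
#define FEAT(word, bit)	(((word) << 5) | (bit))
#define FEAT_WORD(f)	((f) >> 5)
#define FEAT_BIT(f)	((f) & 31)

/* Placeholder feature for the example: bit 7 of tracked word 3. */
#define FEAT_EXAMPLE_SHSTK	FEAT(3, 7)

struct vcpu_caps {
	uint32_t caps[NR_CAP_WORDS];	/* effective, per-vCPU capabilities */
};

/* effective = what the guest's CPUID enumerates AND what KVM can virtualize */
static void vcpu_caps_init(struct vcpu_caps *vc,
			   const uint32_t *guest_cpuid,	/* from userspace */
			   const uint32_t *kvm_caps)	/* KVM's own support */
{
	for (int w = 0; w < NR_CAP_WORDS; w++)
		vc->caps[w] = guest_cpuid[w] & kvm_caps[w];
}

static bool guest_can_use(const struct vcpu_caps *vc, unsigned int feat)
{
	return vc->caps[FEAT_WORD(feat)] & (1u << FEAT_BIT(feat));
}

int main(void)
{
	uint32_t guest_cpuid[NR_CAP_WORDS] = { 0 };
	uint32_t kvm_caps[NR_CAP_WORDS] = { 0 };
	struct vcpu_caps vc;

	/* Guest CPUID enumerates the feature, and KVM supports it... */
	guest_cpuid[FEAT_WORD(FEAT_EXAMPLE_SHSTK)] |= 1u << FEAT_BIT(FEAT_EXAMPLE_SHSTK);
	kvm_caps[FEAT_WORD(FEAT_EXAMPLE_SHSTK)]    |= 1u << FEAT_BIT(FEAT_EXAMPLE_SHSTK);

	vcpu_caps_init(&vc, guest_cpuid, kvm_caps);

	/* ...so the effective cache reports it as usable by the guest. */
	printf("guest_can_use(SHSTK) = %d\n", guest_can_use(&vc, FEAT_EXAMPLE_SHSTK));
	return 0;
}

The point of doing the masking once at initialization, as discussed above, is that a runtime
query never has to re-check KVM support: the AND has already folded it into the per-vCPU cache.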