On Thu, Jan 04, 2024, Sean Christopherson wrote:
> On Thu, Dec 21, 2023, Maxim Levitsky wrote:
> > On Thu, 2023-11-30 at 17:51 -0800, Sean Christopherson wrote:
> > > On Sun, Nov 19, 2023, Maxim Levitsky wrote:
> > > > On Fri, 2023-11-10 at 15:55 -0800, Sean Christopherson wrote:
> > > > Also why not to initialize guest_caps = host_caps & userspace_cpuid?
> > > >
> > > > If this was the default we won't need any guest_cpu_cap_restrict and such,
> > > > instead it will just work.
> > >
> > > Hrm, I definitely like the idea. Unfortunately, unless we do an audit of all
> > > ~120 uses of guest_cpuid_has(), restricting those based on kvm_cpu_caps might
> > > break userspace.
> >
> > 120 uses is not that bad, IMHO it is worth it - we won't need to deal with that
> > in the future.
> >
> > How about a compromise - you change the patches such as it will be possible
> > to remove these cases one by one, and also all new cases will be fully
> > automatic?
>
> Hrm, I'm not necessarily opposed to that, but I don't think we should go partway
> unless we are 100% confident that changing the default to "use guest CPUID ANDed
> with KVM capabilities" is the best end state, *and* that someone will actually
> have the bandwidth to do the work soon-ish so that KVM isn't in a half-baked
> state for months on end. Even then, my preference would definitely be to switch
> everything in one go.
>
> And automatically handling new features would only be feasible for entirely new
> leafs. E.g. X86_FEATURE_RDPID is buried in CPUID.0x7.0x0.ECX, so to automatically
> handle new features KVM would need to set the default guest_caps for all bits
> *except* RDPID, at which point we're again building a set of features that need
> to opt-out.
>
> > > To be fair, the manual lists predate the governed features.
> >
> > 100% agree, however the point of governed features was to simplify this list,
> > the point of this patch set is to simplify these lists and yet they still remain,
> > more or less untouched, and we will still need to maintain them.
> >
> > Again I do think that governed features and/or this patchset are better than
> > the mess that was there before, but a part of me wants to fully get rid of
> > this mess instead of just making it a bit more beautiful.
>
> Oh, I would love to get rid of the mess too, but _completely_ getting rid of the
> mess isn't realistic. There are guaranteed to be exceptions to the rule, whether
> the rule is "use guest CPUID by default" or "use guest CPUID constrained by KVM
> capabilities by default".
>
> I.e. there will always be some amount of manual messiness, the question is which
> default behavior would yield the smallest mess. My gut agrees with you, that
> defaulting to "guest & KVM" would yield the fewest exceptions. But as above,
> I think we're better off doing the switch as an all-or-nothing thing (where "all"
> means within a single series, not within a single patch).

Ok, the idea of having vcpu->arch.cpu_caps default to KVM & GUEST is growing on
me. There's a lurking bug in KVM that in some ways is due to the lack of a
per-vCPU, KVM-enforced set of features. The bug is relatively benign (VMX passes
through CR4.FSGSBASE when it's not supported in the host), and easy to fix
(incorporate KVM-reserved CR4 bits into vcpu->arch.cr4_guest_rsvd_bits), but it
really is something that just shouldn't happen. E.g. KVM's handling of EFER has
a similar lurking problem where __kvm_valid_efer() is unsafe to use without also
consulting efer_reserved_bits.

And after digging a bit more, I think I'm just being overly paranoid. I'm fairly
certain the only exceptions are literally the few that I've called out (RDPID,
MOVBE, and MWAIT (which is only a problem because of a stupid quirk)).
I don't yet have a firm plan on how to deal with the exceptions in a clean way,
e.g. I'd like to somehow have the "opt-out" code share the set of emulated
features with __do_cpuid_func_emulated(). One thought would be to add
kvm_emulated_cpu_caps, which would be *comically* wasteful, but might be worth
the 90 bytes.

For v2, what if I post this more or less as-is, with a "convert to KVM & GUEST"
patch thrown in at the end as an RFC? I want to do a lot more testing (and
staring) before committing to the conversion, and sadly I don't have anywhere
near enough cycles to do that right now.
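
To make the "KVM & GUEST" default and its opt-outs concrete, here is a rough
standalone sketch. It is not KVM code: the leaf enum, the F_* macros,
init_guest_caps() and guest_cap_has() are all hypothetical names, and the
per-leaf layout is simplified to one 32-bit word per CPUID output register.
Only the bit positions are taken from the architectural CPUID definitions.
MWAIT is left out because its exception stems from a quirk, not from emulation.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified layout: one 32-bit word per CPUID feature leaf. */
enum { LEAF_1_ECX, LEAF_7_0_ECX, NR_LEAFS };

#define F_MOVBE (1u << 22) /* CPUID.0x1.ECX[22] */
#define F_XSAVE (1u << 26) /* CPUID.0x1.ECX[26] */
#define F_LA57  (1u << 16) /* CPUID.0x7.0x0.ECX[16] */
#define F_RDPID (1u << 22) /* CPUID.0x7.0x0.ECX[22] */

/*
 * Default every bit to "guest CPUID & KVM caps".  Bits in emulated_caps
 * (assumed here to mirror the idea of a kvm_emulated_cpu_caps array) opt
 * out of the KVM restriction and follow guest CPUID alone.
 */
static void init_guest_caps(uint32_t guest_caps[NR_LEAFS],
                            const uint32_t kvm_caps[NR_LEAFS],
                            const uint32_t emulated_caps[NR_LEAFS],
                            const uint32_t guest_cpuid[NR_LEAFS])
{
    for (int i = 0; i < NR_LEAFS; i++)
        guest_caps[i] = guest_cpuid[i] & (kvm_caps[i] | emulated_caps[i]);
}

/* Query a single feature bit in the computed per-vCPU capabilities. */
static bool guest_cap_has(const uint32_t guest_caps[NR_LEAFS],
                          int leaf, uint32_t feature)
{
    return (guest_caps[leaf] & feature) != 0;
}
```

With this shape, a new feature bit in an existing leaf is restricted by KVM's
capabilities automatically, and only the emulated features need to be listed
explicitly. In the sketch, a guest that advertises LA57 on a host where KVM
doesn't support it loses the bit, while RDPID survives because it is in the
emulated set.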