On Tue, Oct 4, 2022 at 5:08 PM Dong, Eddie <eddie.dong@xxxxxxxxx> wrote:
>
> > Hardware reserved CPUID bits are always zero today, though that may not be
> > architecturally specified.
>
> entry->edx is initialized to native value in do_host_cpuid(), which executes physical CPUID.
> I guess I am disconnected here.

Hardware values should only be passed through for features that KVM
can support. Reserved bits should be set to 0, because KVM has no idea
whether or not it will be able to support them once they are defined.

Perhaps an example will help. At one time, leaf 7 was completely
reserved. Following the principle that KVM should not pass through
reserved CPUID bits, KVM zeroed out this leaf prior to commit
611c120f7486 ("KVM: Mask function7 ebx against host capability
word9").

Suppose that the legacy KVM had, as you suggest, passed through the
hardware values for leaf 7. As CPUs appeared with SMEP, SMAP, Intel
Processor Trace, SGX, and a whole slew of other features, that version
of KVM would claim that it supported those features. Not true. How
would userspace be able to tell a version of KVM that could really
support SMEP from one that just blindly passed the bit through without
knowing what it meant? The KVM_GET_SUPPORTED_CPUID results would be
identical.

In some cases, if KVM claims to support a feature that it doesn't
(like SMEP), a guest that tries to use the feature will fail to boot
(e.g. setting CR4.SMEP will raise an unexpected #GP).

However, as you alluded to earlier, zeroing out reserved bits does not
always work out. Again, looking at leaf 7, the old KVM that clears all
of leaf 7 claims legacy x87 behavior with respect to the FPU data
pointer, FPU CS and FPU DS values, even on newer chips where that is
not true. This is because of the two "reverse polarity" feature bits
in leaf 7, where '0' indicates the presence of the feature and '1'
indicates that the feature has been removed.
At least, in this case, userspace can tell if KVM is wrong, just by
querying CPUID leaf 7 itself.

Long after leaf 7 support was added to KVM, it continued to make the
mistake of clearing those two bits. That bug wasn't addressed until
commit e3bcfda012ed ("KVM: x86: Report deprecated x87 features in
supported CPUID"). Fortunately, no software actually looks at those
two bits.

The KVM_GET_SUPPORTED_CPUID API is abysmal, but it is what we have for
now. The best thing we can do is to zero out reserved bits. Passing
through the hardware values is likely to get us into trouble in the
future, when those bits are defined to mean something that we don't
support.
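
To make the distinction concrete, here is a minimal C sketch (not KVM
code) of the three reporting policies discussed above for
CPUID.(EAX=7,ECX=0):EBX. The bit positions are the ones documented in
the Intel SDM; the macro names, the function names, and the known_mask
parameter are hypothetical and exist only for illustration.

```c
#include <stdint.h>

/* CPUID.(EAX=7,ECX=0):EBX bit positions (Intel SDM). Names are illustrative. */
#define F7_EBX_SGX             (1u << 2)
#define F7_EBX_FDP_EXCPTN_ONLY (1u << 6)   /* reverse polarity: 1 = legacy FDP behavior removed */
#define F7_EBX_SMEP            (1u << 7)
#define F7_EBX_ZERO_FCS_FDS    (1u << 13)  /* reverse polarity: 1 = FPU CS/DS deprecated */
#define F7_EBX_SMAP            (1u << 20)
#define F7_EBX_INTEL_PT        (1u << 25)

#define F7_EBX_REVERSE_POLARITY \
	(F7_EBX_FDP_EXCPTN_ONLY | F7_EBX_ZERO_FCS_FDS)

/*
 * Policy 1 (hypothetical): blind pass-through. Every host bit is
 * reported, including bits the hypervisor knows nothing about.
 */
static uint32_t report_passthrough(uint32_t host_ebx)
{
	return host_ebx;
}

/*
 * Policy 2 (hypothetical): report only the bits the hypervisor knows
 * how to virtualize. Unknown and reserved bits read as zero.
 */
static uint32_t report_masked(uint32_t host_ebx, uint32_t known_mask)
{
	return host_ebx & known_mask;
}

/*
 * Policy 3 (hypothetical): like policy 2, but the reverse-polarity
 * bits are always taken from the host, because clearing them would
 * claim legacy x87 behavior that the hardware no longer provides.
 */
static uint32_t report_masked_keep_reverse(uint32_t host_ebx,
					   uint32_t known_mask)
{
	return (host_ebx & known_mask) |
	       (host_ebx & F7_EBX_REVERSE_POLARITY);
}
```

For a host that sets SMEP, SMAP and both reverse-polarity bits, and a
hypervisor whose known_mask contains only SMEP, policy 1 also
advertises SMAP, policy 2 reports only SMEP (and wrongly clears bits 6
and 13), and policy 3 reports SMEP plus the two reverse-polarity bits.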