On Wed, Nov 27, 2024 at 05:33:32PM -0800, Sean Christopherson wrote:
>Drop x86.c's local pre-computed cr4_reserved bits and instead fold KVM's
>reserved bits into the guest's reserved bits. This fixes a bug where VMX's
>set_cr4_guest_host_mask() fails to account for KVM-reserved bits when
>deciding which bits can be passed through to the guest. In most cases,
>letting the guest directly write reserved CR4 bits is ok, i.e. attempting
>to set the bit(s) will still #GP, but not if a feature is available in
>hardware but explicitly disabled by the host, e.g. if FSGSBASE support is
>disabled via "nofsgsbase".
>
>Note, the extra overhead of computing host reserved bits every time
>userspace sets guest CPUID is negligible. The feature bits that are
>queried are packed nicely into a handful of words, and so checking and
>setting each reserved bit costs in the neighborhood of ~5 cycles, i.e. the
>total cost will be in the noise even if the number of checked CR4 bits
>doubles over the next few years. In other words, x86 will run out of CR4
>bits long before the overhead becomes problematic.
>
>Note #2, __cr4_reserved_bits() starts from CR4_RESERVED_BITS, which is
>why the existing __kvm_cpu_cap_has() processing doesn't explicitly OR in
>CR4_RESERVED_BITS (and why the new code doesn't do so either).
>
>Fixes: 2ed41aa631fc ("KVM: VMX: Intercept guest reserved CR4 bits to inject #GP fault")
>Reviewed-by: Maxim Levitsky <mlevitsk@xxxxxxxxxx>
>Signed-off-by: Sean Christopherson <seanjc@xxxxxxxxxx>

Reviewed-by: Chao Gao <chao.gao@xxxxxxxxx>
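
For anyone following along, the effect of folding KVM's reserved bits into the guest's reserved bits can be sketched in standalone C. This is an illustrative model only, not the kernel code: the toy_* helpers, the FEAT_* flags and TOY_CR4_RESERVED_BITS are hypothetical names, and only three CR4 bits are modeled. It assumes the guest's reserved mask is the OR of the mask derived from guest CPUID and the mask derived from KVM's own capabilities, with both starting from the same baseline.

```c
#include <stdint.h>

/* CR4 bit positions as defined by the x86 architecture. */
#define CR4_FSGSBASE (1ULL << 16)
#define CR4_SMEP     (1ULL << 20)
#define CR4_PKE      (1ULL << 22)

/* Hypothetical feature flags; not KVM's real CPUID representation. */
enum {
	FEAT_FSGSBASE = 1u << 0,
	FEAT_SMEP     = 1u << 1,
	FEAT_PKE      = 1u << 2,
};

/* Toy baseline: every bit other than the three modeled ones is reserved. */
#define TOY_CR4_RESERVED_BITS (~(CR4_FSGSBASE | CR4_SMEP | CR4_PKE))

/*
 * Start from the baseline and reserve each CR4 bit whose feature is
 * missing from 'feats'.  Because the baseline is the starting value,
 * callers never need to OR it in themselves.
 */
uint64_t toy_cr4_reserved_bits(uint32_t feats)
{
	uint64_t reserved = TOY_CR4_RESERVED_BITS;

	if (!(feats & FEAT_FSGSBASE))
		reserved |= CR4_FSGSBASE;
	if (!(feats & FEAT_SMEP))
		reserved |= CR4_SMEP;
	if (!(feats & FEAT_PKE))
		reserved |= CR4_PKE;

	return reserved;
}

/*
 * Fold KVM's reserved bits into the guest's reserved bits: a bit is
 * reserved if either guest CPUID or KVM's capabilities lack the feature.
 */
uint64_t toy_guest_reserved_bits(uint32_t guest_cpuid, uint32_t kvm_caps)
{
	return toy_cr4_reserved_bits(guest_cpuid) |
	       toy_cr4_reserved_bits(kvm_caps);
}

/* Only non-reserved candidate bits may be passed through to the guest. */
uint64_t toy_guest_owned_bits(uint64_t candidates, uint64_t guest_reserved)
{
	return candidates & ~guest_reserved;
}
```

In this model, if guest CPUID advertises all three features but kvm_caps lacks FEAT_FSGSBASE (the "nofsgsbase" case), toy_guest_reserved_bits() includes CR4_FSGSBASE, so toy_guest_owned_bits() never hands that bit to the guest and a write to it can still be intercepted.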