On 7/20/2023 2:08 AM, Jim Mattson wrote:
Normally, we would restrict guest MSR writes based on guest CPU features. However, with IA32_SPEC_CTRL and IA32_PRED_CMD, this is not the case. For the first non-zero write to IA32_SPEC_CTRL, we check to see that the host supports the value written. We don't care whether or not the guest supports the value written (as long as it supports the MSR). After the first non-zero write, we stop intercepting writes to IA32_SPEC_CTRL, so the guest can write any value supported by the hardware.

This could be problematic in heterogeneous migration pools. For instance, a VM that starts on a Cascade Lake host may set IA32_SPEC_CTRL.PSFD[bit 7], even if the guest CPUID.(EAX=07H,ECX=02H):EDX.PSFD[bit 0] is clear. Then, if that VM is migrated to a Skylake host, KVM_SET_MSRS will refuse to set IA32_SPEC_CTRL to its current value, because Skylake doesn't support PSFD.

We disable write intercepts for IA32_PRED_CMD as long as the guest supports the MSR. That's fine for now, since only one bit of PRED_CMD has been defined. Hence, guest support and host support are equivalent...today. But, are we really comfortable with letting the guest set any IA32_PRED_CMD bit that may be defined in the future?
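For readers following along, the IA32_SPEC_CTRL sequence described above can be modeled roughly as below. This is only a sketch under stated assumptions, not KVM code: struct vcpu_model, guest_wrmsr_spec_ctrl() and host_set_spec_ctrl() are hypothetical names, and the pass-through path simply assumes the hardware rejects bits it does not implement. The thing to notice is that guest_mask is never consulted.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit positions in IA32_SPEC_CTRL. */
#define SPEC_CTRL_IBRS  (1ULL << 0)
#define SPEC_CTRL_STIBP (1ULL << 1)
#define SPEC_CTRL_SSBD  (1ULL << 2)
#define SPEC_CTRL_PSFD  (1ULL << 7)

/* Hypothetical model of one vCPU on one host; not KVM's data structures. */
struct vcpu_model {
	uint64_t host_mask;     /* bits the host hardware implements */
	uint64_t guest_mask;    /* bits advertised in guest CPUID (never checked) */
	uint64_t spec_ctrl;     /* current MSR value */
	bool intercept_writes;  /* true until the first non-zero guest write */
};

/* Guest WRMSR to IA32_SPEC_CTRL. Returns false to model a #GP. */
static bool guest_wrmsr_spec_ctrl(struct vcpu_model *v, uint64_t val)
{
	/*
	 * Intercepted or not, only host support is checked: by the
	 * hypervisor while intercepting, by the hardware (assumed here)
	 * once the write is passed through.
	 */
	if (val & ~v->host_mask)
		return false;

	v->spec_ctrl = val;

	/* The first non-zero write disables further interception. */
	if (v->intercept_writes && val != 0)
		v->intercept_writes = false;

	return true;
}

/* Host-initiated restore, modeling KVM_SET_MSRS on a migration target. */
static bool host_set_spec_ctrl(struct vcpu_model *v, uint64_t val)
{
	if (val & ~v->host_mask)
		return false;

	v->spec_ctrl = val;
	return true;
}
```

With a source model whose host_mask includes SPEC_CTRL_PSFD and a destination model whose host_mask does not, a guest write of SPEC_CTRL_PSFD succeeds on the source even though guest_mask lacks the bit, and host_set_spec_ctrl() then fails on the destination with the saved value.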
>
The same question applies to IA32_SPEC_CTRL. Are we comfortable with letting the guest write to any bit that may be defined in the future?
My view is that we need to fix it, though Chao takes a different view: that performance may sometimes matter more[*].
[*] https://lore.kernel.org/all/ZGdE3jNS11wV+V2w@chao-email/
At least the AMD approach with V_SPEC_CTRL prevents the guest from clearing any bits set by the host, but on Intel, it's a total free-for-all. What happens when a new bit is defined that absolutely must be set to 1 all of the time?
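The difference between the two behaviors can be put in a few lines of C. This is a sketch only: it assumes the V_SPEC_CTRL-style effective value is the bitwise OR of the host and guest values, which is one way to get the "guest cannot clear host-set bits" property described above, and both function names are made up for illustration.

```c
#include <stdint.h>

#define SPEC_CTRL_IBRS (1ULL << 0)
#define SPEC_CTRL_SSBD (1ULL << 2)

/* Assumed V_SPEC_CTRL-style behavior: host-set bits always stay set. */
static uint64_t effective_or(uint64_t host, uint64_t guest)
{
	return host | guest;
}

/* Pass-through behavior: the guest's value replaces the host's outright. */
static uint64_t effective_passthrough(uint64_t host, uint64_t guest)
{
	(void)host; /* the host's choice is ignored while the guest runs */
	return guest;
}
```

If the host sets SPEC_CTRL_SSBD and the guest writes 0, effective_or() still yields SPEC_CTRL_SSBD, while effective_passthrough() yields 0. A future must-be-one bit would survive only under the first model.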