On 13/10/21 12:14, Andy Lutomirski wrote:
I think it's simpler to always wait for #NM; it will only happen
once per vCPU. In other words, even if the guest clears XFD before
it generates #NM, the guest_fpu's XFD remains nonzero and an #NM
vmexit is possible. After #NM the guest_fpu's XFD is zero; then
passthrough can happen and the #NM vmexit trap can be disabled.
This will stop being at all optimal when Intel inevitably adds
another feature that uses XFD. In the potentially infinite window in
which the guest manages XFD and #NM on behalf of its userspace and
when the guest allocates the other hypothetical feature, all the #NMs
will have to be trapped by KVM.
The reason is that it's quite common to simply let the guest see all
CPUID bits that KVM knows about. But it's quite likely that most guests
will never use any XFD feature, and therefore will never see an
#NM. I wouldn't have any problem with allocating _all_ of the dynamic
state space on the first #NM.
Thinking more about it, #NM only has to be trapped if XCR0 enables a
dynamic feature. In other words, the guest value of XFD can be limited
to (host_XFD|guest_XFD) & guest_XCR0. This keeps KVM from
unnecessarily trapping #NM for old guests that use CR0.TS.
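
To make the masking concrete, here is a rough sketch of the idea in plain C.
It is not KVM code: the helper names effective_guest_xfd() and
need_nm_intercept() and the dynamic_mask parameter are made up for
illustration. The AMX tile data bit (XSAVE component 18) is only used as an
example of a dynamic feature.

```c
#include <stdbool.h>
#include <stdint.h>

/* AMX tile data is XSAVE state component 18; used here only as an example. */
#define XFEATURE_MASK_XTILE_DATA (UINT64_C(1) << 18)

/*
 * Hypothetical helper: the XFD value that would be in effect while the
 * guest runs. Bits for features the guest has not enabled in XCR0 are
 * dropped, so they can never raise #NM.
 */
static inline uint64_t effective_guest_xfd(uint64_t host_xfd,
                                           uint64_t guest_xfd,
                                           uint64_t guest_xcr0)
{
    return (host_xfd | guest_xfd) & guest_xcr0;
}

/*
 * Hypothetical helper: #NM only needs to be trapped if the guest's XCR0
 * enables at least one dynamic feature (dynamic_mask is an assumption:
 * the set of features that are subject to XFD).
 */
static inline bool need_nm_intercept(uint64_t guest_xcr0,
                                     uint64_t dynamic_mask)
{
    return (guest_xcr0 & dynamic_mask) != 0;
}
```

With these definitions, an old guest whose XCR0 only has x87/SSE/AVX set
(0x7) ends up with an effective XFD of zero and no #NM intercept, even if
the host XFD has the tile data bit set.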
Paolo
Is it really worthwhile for KVM to use XFD at all instead of
preallocating the state and being done with it? KVM would still have
to avoid data loss if the guest sets XFD with non-init state, but #NM
could always pass through.