On Wed, Oct 13, 2021, at 5:26 AM, Paolo Bonzini wrote:
> On 13/10/21 12:14, Andy Lutomirski wrote:
>>> I think it's simpler to always wait for #NM, it will only happen
>>> once per vCPU. In other words, even if the guest clears XFD before
>>> it generates #NM, the guest_fpu's XFD remains nonzero and an #NM
>>> vmexit is possible. After #NM the guest_fpu's XFD is zero; then
>>> passthrough can happen and the #NM vmexit trap can be disabled.
>>
>> This will stop being at all optimal when Intel inevitably adds
>> another feature that uses XFD. In the potentially infinite window in
>> which the guest manages XFD and #NM on behalf of its userspace and
>> when the guest allocates the other hypothetical feature, all the #NMs
>> will have to be trapped by KVM.
>
> The reason is that it's quite common to simply let the guest see all
> CPUID bits that KVM knows about. But it's not unlikely that most guests
> will not ever use any XFD feature, and therefore will not ever see an
> #NM. I wouldn't have any problem with allocating _all_ of the dynamic
> state space on the first #NM.
>
> Thinking more about it, #NM only has to be trapped if XCR0 enables a
> dynamic feature. In other words, the guest value of XFD can be limited
> to (host_XFD|guest_XFD) & guest_XCR0. This avoids that KVM
> unnecessarily traps for old guests that use CR0.TS.
>

You could simplify this by allocating the state the first time XCR0
enables the feature in question.

(This is how regular non-virt userspace *should* work too, but it looks
like I’ve probably been outvoted on that front…)

> Paolo
>
>> Is it really worthwhile for KVM to use XFD at all instead of
>> preallocating the state and being done with it? KVM would still have
>> to avoid data loss if the guest sets XFD with non-init state, but #NM
>> could always pass through.
>>
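
For readers following along, here is a minimal sketch of the masking rule quoted above, (host_XFD|guest_XFD) & guest_XCR0, together with the "trap #NM only if XCR0 enables a dynamic feature" condition. It is not KVM code: the helper names effective_guest_xfd() and nm_trap_needed() are hypothetical, and the only dynamic feature modeled is AMX tile data (state component 18), chosen as an example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative state-component bits; XCR0 and XFD share this layout. */
#define XFEATURE_MASK_FP         (1ULL << 0)   /* x87 state */
#define XFEATURE_MASK_SSE        (1ULL << 1)   /* SSE state */
#define XFEATURE_MASK_XTILE_DATA (1ULL << 18)  /* AMX tile data, XFD-capable */

/*
 * Hypothetical helper: XFD value in effect while the guest runs.
 * Bits for features the guest has not enabled in XCR0 are masked off,
 * so they can never cause an #NM.
 */
static uint64_t effective_guest_xfd(uint64_t host_xfd, uint64_t guest_xfd,
                                    uint64_t guest_xcr0)
{
        return (host_xfd | guest_xfd) & guest_xcr0;
}

/*
 * Hypothetical helper: #NM only has to be intercepted when the guest's
 * XCR0 enables at least one dynamic (XFD-capable) feature.
 */
static bool nm_trap_needed(uint64_t guest_xcr0, uint64_t dynamic_features)
{
        return (guest_xcr0 & dynamic_features) != 0;
}
```

With these definitions, an old guest whose XCR0 only has x87 and SSE set gets an effective XFD of zero and needs no #NM intercept, even when it uses CR0.TS for lazy FPU switching.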