On Fri, Mar 07, 2025 at 11:09:42AM -0800, Dave Hansen wrote:
>On 3/7/25 08:41, Chao Gao wrote:
>> case |IA32_XSS[12] | Space | RFBM[12] | Drop%
>> -----+-------------+-------+----------+------
>>    1 |           0 | None  |        0 | 0.0%
>>    2 |           1 | None  |        0 | 0.2%
>>    3 |           1 | 24B?  |        1 | 0.2%
>
>So, 0.2% is still, what, dozens of cycles? Are you sure that it really
>takes the CPU dozens of cycles to skip over the feature during XSAVE?
>
>If it really turns out to be this measurable, we should probably follow
>up with the folks that implement XSAVE and see what's going on under the
>covers.

I reran the performance tests and observed a run-to-run variation of 0.4% to
0.7%. So, I don't think there is any measurable performance difference. I will
update the performance statements in the cover letter.

>
>On a separate note, I was bugging Thomas a bit on IRC. His memory was
>that the AMX-era FPU rework only expected KVM to support user features.
>You might want to dig through the history a bit and see if _that_ was
>ever properly addressed because that would change the problem you're
>trying to solve.

I went through the email discussions and found only one relevant thread:

https://lore.kernel.org/kvm/87wnmf66m5.ffs@tglx/#t

where Thomas mentioned that guest_perm would be set as follows:

	guest_fpu::__state_perm = supported_xcr0 & xstate_get_group_perm();

If implemented this way, KVM would only support user features.

However, the committed change is:

  980fe2fddcff ("x86/fpu: Extend fpu_xstate_prctl() with guest permissions")

In this change, fpu->guest_perm is copied from fpu->perm:

+	/* Same defaults for guests */
+	fpu->guest_perm = fpu->perm;

There are indeed some issues with enabling supervisor features for guest FPUs,
but they have been addressed by recent changes in the tip-tree ([1], [2]) and
patch 1 of this series.

[1]: https://lore.kernel.org/all/20250218141045.85201-1-stanspas@xxxxxxxxx/
[2]: https://lore.kernel.org/all/20250317140613.1761633-1-chao.gao@xxxxxxxxx/
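
To make the difference between the two guest_perm initializations discussed
above concrete, here is a minimal standalone sketch. It is not kernel code:
the helper names (guest_perm_masked(), guest_perm_copied()) and the
SUPERVISOR_MASK macro are made up for illustration, and I am assuming that
supported_xcr0 contains only user features (XCR0 cannot enable supervisor
state components) and that CET user/kernel state sits at bits 11 and 12.

```c
#include <stdint.h>

/* Bit positions follow the XSAVE state-component numbering. */
#define XFEATURE_MASK_FP          (1ULL << 0)
#define XFEATURE_MASK_SSE         (1ULL << 1)
#define XFEATURE_MASK_YMM         (1ULL << 2)
#define XFEATURE_MASK_CET_USER    (1ULL << 11) /* supervisor state, via IA32_XSS */
#define XFEATURE_MASK_CET_KERNEL  (1ULL << 12) /* supervisor state, via IA32_XSS */

/* Hypothetical helper mask: the supervisor components used in this sketch. */
#define SUPERVISOR_MASK \
	(XFEATURE_MASK_CET_USER | XFEATURE_MASK_CET_KERNEL)

/*
 * Variant from the thread: mask the permissions with supported_xcr0.
 * Because supported_xcr0 is assumed to hold user features only, every
 * supervisor bit is dropped from the result.
 */
static uint64_t guest_perm_masked(uint64_t perm, uint64_t supported_xcr0)
{
	return supported_xcr0 & perm;
}

/*
 * Variant that was committed: plain copy of the permissions, so any
 * supervisor bit present in perm is also present in the guest permissions.
 */
static uint64_t guest_perm_copied(uint64_t perm)
{
	return perm;
}
```

With perm = FP | SSE | YMM | CET_USER | CET_KERNEL (0x1807) and
supported_xcr0 = FP | SSE | YMM (0x7), the masked variant yields 0x7, i.e. no
supervisor features, while the copied variant yields 0x1807 and keeps both
CET bits.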