On Tue, Dec 07 2021 at 19:03, Yang Zhong wrote: > diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c > index 5089f2e7dc22..9811dc98d550 100644 > --- a/arch/x86/kernel/fpu/core.c > +++ b/arch/x86/kernel/fpu/core.c > @@ -238,6 +238,7 @@ bool fpu_alloc_guest_fpstate(struct fpu_guest *gfpu) > fpstate->is_guest = true; > > gfpu->fpstate = fpstate; > + gfpu->xfd_err = XFD_ERR_GUEST_DISABLED; This wants to be part of the previous patch, which introduces the field. > gfpu->user_xfeatures = fpu_user_cfg.default_features; > gfpu->user_perm = fpu_user_cfg.default_features; > fpu_init_guest_permissions(gfpu); > @@ -297,6 +298,7 @@ int fpu_swap_kvm_fpstate(struct fpu_guest *guest_fpu, bool enter_guest) > fpu->fpstate = guest_fps; > guest_fps->in_use = true; > } else { > + fpu_save_guest_xfd_err(guest_fpu); Hmm. See below. > guest_fps->in_use = false; > fpu->fpstate = fpu->__task_fpstate; > fpu->__task_fpstate = NULL; > @@ -4550,6 +4550,9 @@ void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu) > kvm_steal_time_set_preempted(vcpu); > srcu_read_unlock(&vcpu->kvm->srcu, idx); > > + if (vcpu->preempted) > + fpu_save_guest_xfd_err(&vcpu->arch.guest_fpu); I'm not really exited about the thought of an exception cause register in guest clobbered state. Aside of that I really have to ask the question why all this is needed? #NM in the guest is slow path, right? So why are you trying to optimize for it? The straight forward solution to this is: 1) Trap #NM and MSR_XFD_ERR write 2) When the guest triggers #NM is takes an VMEXIT and the host does: rdmsrl(MSR_XFD_ERR, vcpu->arch.guest_fpu.xfd_err); injects the #NM and goes on. 3) When the guest writes to MSR_XFD_ERR it takes an VMEXIT and the host does: vcpu->arch.guest_fpu.xfd_err = msrval; wrmsrl(MSR_XFD_ERR, msrval); and goes back. 4) Before entering the preemption disabled section of the VCPU loop do: if (vcpu->arch.guest_fpu.xfd_err) wrmsrl(MSR_XFD_ERR, vcpu->arch.guest_fpu.xfd_err); 5) Before leaving the preemption disabled section of the VCPU loop do: if (vcpu->arch.guest_fpu.xfd_err) wrmsrl(MSR_XFD_ERR, 0); It's really that simple and pretty much 0 overhead for the regular case. If the guest triggers #NM with a high frequency then taking the VMEXITs is the least of the problems. That's not a realistic use case, really. Hmm? Thanks, tglx