Hi Thomas, On 10/15/2021 5:36 PM, Thomas Gleixner wrote: > Paolo, > > On Thu, Oct 14 2021 at 21:14, Thomas Gleixner wrote: > > On Thu, Oct 14 2021 at 17:01, Paolo Bonzini wrote: > >>> vcpu_create() > >>> > >>> fpu_init_fpstate_user(guest_fpu, supported_xcr0) > >>> > >>> That will (it does not today) do: > >>> > >>> guest_fpu::__state_perm = supported_xcr0 & > >>> xstate_get_group_perm(); > >>> > >>> The you have the information you need right in the guest FPU. > >> > >> Good, I wasn't aware of the APIs that will be there. > > > > Me neither, but that's a pretty obvious consequence of the work I'm > > doing for AMX. So I made it up for you. :) > > let me make some more up for you! > > If you carefully look at part 2 of the rework, then you might notice that there > is a fundamental change which allows to do a real simplification for KVM FPU > handling: > > current->thread.fpu.fpstate > > is now a pointer. So you can spare one FPU allocation because we can now > do: Trying to understand your point, seems struct fpu will add a guest_fpstate pointer. And this will be allocated when vcpu_create() by the following function. Swap between the two pointers in load/put. What I was thinking is that vcpu keeps having guest_fpu and delete user_fpu. > > fpu_attach_guest_fpu(supported_xcr0) > { > guest_fpstate = alloc_fpstate(supported_xcr0); I supposed this is called when vcpu_create(). Not sure the reason of supported_xcr0 input here. supported_xcr0[n]=1 and guest _state_perm[n]=1 which is requested before vcpu_create(), so this will allocate full buffer, at vcpu_create() stage? Or do you mean vcpu->arch.guest_supported_xcr0. Please correct me if I misunderstood. Thanks. > fpu_init_fpstate_user(guest_fpstate, supported_xcr0); > current->thread.fpu.guest_fpstate = guest_fpstate; } > > fpu_swap_kvm_fpu() becomes in the first step: > > fpu_swap_kvm_fpu(bool enter_guest) > { > safe_fpregs_to_fpstate(current->thread.fpu.fpstate); > > swap(current->thread.fpu.fpstate, current->thread.fpu.guest_fpstate); > > restore_fpregs_from_fpstate(current->thread.fpu.fpstate); > } > > @enter guest will allow to do some sanity checks > > In a second step: > > fpu_swap_kvm_fpu(bool enter_guest, u64 guest_needs_features) { > possibly_reallocate(enter_guest, guest_needs_features); When KVM traps guest wrmsr XFD in #NM, I think KVM need allocate fpstate buffer for full features. Because in next vmexit, guest might have dynamic state and KVM can be preempted before running fpu_swap_kvm_fpu(). Thus, here the current->thread.fpu.fpstate already has enough space for saving guest. Thanks, Jing > safe_fpregs_to_fpstate(current->thread.fpu.fpstate); > > swap(current->thread.fpu.fpstate, current->thread.fpu.guest_fpstate); > > restore_fpregs_from_fpstate(current->thread.fpu.fpstate); > possibly_reallocate(enter_guest, guest_needs_features); } > > @guest_needs_features is the information which you gather via guest XCR0 > and guest XFD. > > So fpu_swap_kvm_fpu() is going to be the place where reallocation happens > and that's good enough for both cases: > > vcpu_run() > > fpu_swap_kvm_fpu(); <- 1 > > while (...) > vmenter(); > > fpu_swap_kvm_fpu(); <- 2 > > #1 QEMU user space used feature and has already large fpstate > > #2 Guest requires feature but has not used it yet (XCR0/XFD trapping) > > See? > > It's not only correct, it's also simple and truly beautiful. > > Thanks, > > tglx