Re: [PATCH v10 10/28] x86/fpu/xstate: Update the XSTATE save function to support dynamic states

"Bae, Chang Seok" <chang.seok.bae@xxxxxxxxx> · Sat, 2 Oct 2021 22:54:25 +0000

On Oct 2, 2021, at 14:31, Thomas Gleixner <tglx@xxxxxxxxxxxxx> wrote:
> On Fri, Oct 01 2021 at 17:41, Thomas Gleixner wrote:
>> On Wed, Aug 25 2021 at 08:53, Chang S. Bae wrote:
>>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>>> index 74dde635df40..7c46747f6865 100644
>>> --- a/arch/x86/kvm/x86.c
>>> +++ b/arch/x86/kvm/x86.c
>>> @@ -9899,11 +9899,16 @@ static void kvm_save_current_fpu(struct fpu *fpu)
>>> 	 * KVM does not support dynamic user states yet. Assume the buffer
>>> 	 * always has the minimum size.
> 
> I have to come back to this because that assumption is just broken.
> 
> create_vcpu()
>   vcpu->user_fpu = alloc_default_fpu_size();
>   vcpu->guest_fpu = alloc_default_fpu_size();
> 
> vcpu_task()
>   get_amx_permission()
>   use_amx()
>     #NM
>     alloc_larger_state()
>   ...
>   kvm_arch_vcpu_ioctl_run()
>     kvm_arch_vcpu_ioctl_run()
>       kvm_load_guest_fpu()
>         kvm_save_current_fpu(vcpu->arch.user_fpu);
>           save_fpregs_to_fpstate(fpu);         <- Out of bounds write
> 
> Adding a comment that KVM does not yet support dynamic user states does
> not cut it, really.
> 
> Even if the above is unlikely, it is possible and has to be handled
> correctly at the point where AMX support is enabled in the kernel
> independent of guest support.

I see. 

> You have two options:
> 
>  1) Always allocate the large buffer size which is required to
>     accomodate all possible features.
> 
>     Trivial, but waste of memory.
> 
>  2) Make the allocation dynamic which seems to be trivial to do in
>     kvm_load_guest_fpu() at least for vcpu->user_fpu.
> 
>     The vcpu->guest_fpu handling can probably be postponed to the
>     point where AMX is actually exposed to guests, but it's probably
>     not the worst idea to think about the implications now.

FWIW, the proposed KVM patch for AMX looks to take (1) here [1] and
Paolo said [2]:

    Most guests will not need the whole xstate feature set.  So perhaps you 
    could set XFD to the host value | the guest value, trap #NM if the 
    host XFD is zero, and possibly reflect the exception to the guest's XFD 
    and XFD_ERR.

    In addition, loading the guest XFD MSRs should use the MSR autoload 
    feature (add_atomic_switch_msr).

And then I guess discussion goes on..

Thanks,
Chang

[1] https://lore.kernel.org/kvm/20210207154256.52850-4-jing2.liu@xxxxxxxxxxxxxxx/
[2] https://lore.kernel.org/kvm/ae5b0195-b04f-8eef-9e0d-2a46c761d2d5@xxxxxxxxxx/