On Wednesday, December 29, 2021 8:39 AM, Sean Christopherson wrote:
> To: Liu, Jing2 <jing2.liu@xxxxxxxxx>
> Cc: x86@xxxxxxxxxx; kvm@xxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx;
> linux-doc@xxxxxxxxxxxxxxx; linux-kselftest@xxxxxxxxxxxxxxx; tglx@xxxxxxxxxxxxx;
> mingo@xxxxxxxxxx; bp@xxxxxxxxx; dave.hansen@xxxxxxxxxxxxxxx;
> pbonzini@xxxxxxxxxx; corbet@xxxxxxx; shuah@xxxxxxxxxx; Nakajima, Jun
> <jun.nakajima@xxxxxxxxx>; Tian, Kevin <kevin.tian@xxxxxxxxx>;
> jing2.liu@xxxxxxxxxxxxxxx; Zeng, Guang <guang.zeng@xxxxxxxxx>; Wang, Wei
> W <wei.w.wang@xxxxxxxxx>; Zhong, Yang <yang.zhong@xxxxxxxxx>
> Subject: Re: [PATCH v3 19/22] kvm: x86: Get/set expanded xstate buffer
>
> Shortlog needs to have a verb somewhere.
>
> On Wed, Dec 22, 2021, Jing Liu wrote:
> > From: Guang Zeng <guang.zeng@xxxxxxxxx>
> >
> > When AMX is enabled it requires a larger xstate buffer than the legacy
> > hardcoded 4KB one. Exising kvm ioctls
>
> Existing
>
> > (KVM_[G|S]ET_XSAVE under KVM_CAP_XSAVE) are not suitable for this
> > purpose.
>
> ...
>
> > Reuse KVM_SET_XSAVE for both old/new formats by reimplementing it to
> > do properly-sized memdup_user() based on the guest fpu container.
>
> I'm confused, the first sentence says KVM_SET_XSAVE isn't suitable, the
> second says it can be reused with minimal effort.

Probably "doesn't support" sounds better than "isn't suitable" above.
But plan to reword a bit:

With KVM_CAP_XSAVE, userspace uses a hardcoded 4KB buffer to get/set
xstate data from/to KVM. This doesn't work when dynamic features (e.g.
AMX) are used by the guest, as KVM uses a full expanded xstate buffer for
the guest fpu emulation, which is larger than 4KB.

Add KVM_CAP_XSAVE2, and userspace gets the required xstate buffer size
from KVM via KVM_CHECK_EXTENSION(KVM_CAP_XSAVE2). KVM_SET_XSAVE is
extended with the support to work with larger xstate data size passed
from userspace.
KVM_GET_XSAVE2 is preferred to extending KVM_GET_XSAVE to work with a
large buffer size, for backward compatibility.
(Link: https://lkml.org/lkml/2021/12/15/510)

Also, update the api doc with the new KVM_GET_XSAVE2 ioctl.

>
> > Also, update the api doc with the new KVM_GET_XSAVE2 ioctl.
>
> ...
>
> > @@ -5367,7 +5382,9 @@ long kvm_arch_vcpu_ioctl(struct file *filp,
> >  		break;
> >  	}
> >  	case KVM_SET_XSAVE: {
> > -		u.xsave = memdup_user(argp, sizeof(*u.xsave));
> > +		int size = vcpu->arch.guest_fpu.uabi_size;
>
> IIUC, reusing KVM_SET_XSAVE works by requiring that userspace use
> KVM_GET_XSAVE2 if userspace has expanded the guest FPU size by exposing
> relevant features to the guest via guest CPUID. If so, then that needs to be
> enforced in KVM_GET_XSAVE, otherwise userspace will get subtle corruption
> by invoking the wrong ioctl, e.g.
>
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 2c9606380bca..5d2acbd52df5 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -5386,6 +5386,10 @@ long kvm_arch_vcpu_ioctl(struct file *filp,
>  		break;
>  	}
>  	case KVM_GET_XSAVE: {
> +		r = -EINVAL;
> +		if (vcpu->arch.guest_fpu.uabi_size > sizeof(struct kvm_xsave))
> +			break;
> +

Looks good to me.

Thanks,
Wei
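
For illustration only, here is a minimal standalone sketch of the size
check suggested above, written as a userspace model rather than kernel
code. The names vcpu_model, model_get_xsave and LEGACY_XSAVE_SIZE are
hypothetical stand-ins (not KVM identifiers); LEGACY_XSAVE_SIZE assumes
the legacy buffer is the hardcoded 4KB one mentioned in the changelog.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Assumption: the legacy uAPI buffer is the hardcoded 4KB one. */
#define LEGACY_XSAVE_SIZE 4096u

/* Hypothetical model of the per-vCPU state that matters for the check. */
struct vcpu_model {
	size_t uabi_size;           /* size of the guest's xstate in uABI format */
	const unsigned char *state; /* uabi_size bytes of xstate data */
};

/*
 * Models the legacy "get" path with the suggested check: if the guest's
 * xstate no longer fits in the 4KB buffer, fail with -EINVAL instead of
 * silently returning truncated state.
 */
static int model_get_xsave(const struct vcpu_model *vcpu, unsigned char *buf)
{
	assert(vcpu != NULL && buf != NULL);

	if (vcpu->uabi_size > LEGACY_XSAVE_SIZE)
		return -EINVAL;

	/* Zero the whole legacy buffer, then copy the valid part. */
	memset(buf, 0, LEGACY_XSAVE_SIZE);
	memcpy(buf, vcpu->state, vcpu->uabi_size);
	return 0;
}
```

In this model, a caller that gets -EINVAL back would be expected to
switch to a buffer of the size reported for KVM_CAP_XSAVE2 and use the
new ioctl instead.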