Hi, Paolo/Thomas, Any comment on following opens? In general doing static buffer expansion definitely simplifies the logic, but still need your help to finalize its impact on the overall design. 😊 Thanks Kevin > From: Tian, Kevin > Sent: Thursday, December 16, 2021 1:36 PM > > > From: Tian, Kevin > > Sent: Thursday, December 16, 2021 9:00 AM > > > > > > Off-list, Thomas mentioned doing it even at vCPU creation as long as the > > > prctl has been called. That is also okay and even simpler. > > > > Make sense. It also avoids the #GP thing in the emulation path if just due > > to reallocation error. > > > > We'll follow this direction to do a quick update/test. > > > > After some study there are three opens which we'd like to sync here. Once > they are closed we'll send out a new version very soon (hopefully tomorrow). > > 1) Have a full expanded buffer at vCPU creation > > There are two options. > > One is to directly allocate a big-enough buffer upon guest_fpu::perm in > fpu_alloc_guest_fpstate(). There is no reallocation per-se thus most changes > in this series are not required. > > The other is to keep the reallocation concept (thus all previous patches are > kept) and still call a wrapper around __xfd_enable_feature() even at vCPU > creation (e.g. right after fpu_init_guest_permissions()). This matches the > fpu core assumption that fpstate for xfd features are dynamically allocated, > though the actual calling point may not be dynamic. This also allows us > to exploit doing expansion in KVM_SET_CPUID2 (see next). > > 2) Do expansion at vCPU creation or KVM_ SET_CPUID2? > > If the reallocation concept is still kept, then we feel doing expansion in > KVM_SET_CPUID2 makes slightly more sense. There is no functional > difference between two options since the guest is not running at this > point. And in general Qemu should set prctl according to the cpuid bits. > But since anyway we still need to check guest cpuid against guest perm in > KVM_SET_CPUID2, it reads clearer to expand the buffer only after this > check is passed. > > If this approach is agreed, then we may replace the helper functions in > this patch with a new one: > > /* > * fpu_update_guest_perm_features - Enable xfeatures according to guest > perm > * @guest_fpu: Pointer to the guest FPU container > * > * Enable all dynamic xfeatures according to guest perm. Invoked if the > * caller wants to conservatively expand fpstate buffer instead of waiting > * until a given feature is accessed. > * > * Return: 0 on success, error code otherwise > */ > +int fpu_update_guest_perm_features(struct fpu_guest *guest_fpu) > +{ > + u64 expand; > + > + lockdep_assert_preemption_enabled(); > + > + if (!IS_ENABLED(CONFIG_X86_64)) > + return 0; > + > + expand = guest_fpu->perm & ~guest_fpu->xfeatures; > + if (!expand) > + return 0; > + > + return __xfd_enable_feature(expand, guest_fpu); > +} > +EXPORT_SYMBOL_GPL(fpu_update_guest_features); > > 3) Always disable interception of disable after 1st interception? > > Once we choose to have a full expanded buffer before guest runs, the > point of intercepting WRMSR(IA32_XFD) becomes less obvious since > no dynamic reallocation is required. > > One option is to always disable WRMSR interception once > KVM_SET_CPUID2 succeeds, with the cost of one RDMSR per vm-exit. > But doing so affects legacy OS which even has no XFD logic at all. > > The other option is to continue the current policy i.e. disable write > emulation only after the 1st interception of setting XFD to a non-zero > value. Then the RDMSR cost is added only for guest which supports XFD. > > In either case we need another helper to update guest_fpu->fpstate->xfd > as required in the restore path. For the 2nd option we further want > to update MSR if guest_fpu is currently in use: > > +void xfd_update_guest_xfd(struct fpu_guest *guest_fpu, u64 xfd) > +{ > + fpregs_lock(); > + guest_fpu->fpstae->xfd = xfd; > + if (guest_fpu->fpstate->in_use) > + xfd_update_state(guest_fpu->fpstate); > + fpregs_unlock(); > +} > > Thoughts? > -- > p.s. currently we're working on a quick prototype based on: > - Expand buffer in KVM_SET_CPUID2 > - Disable write emulation after first interception > to check any oversight. > > Thanks > Kevin