On 26/05/2017 06:13, Nick Desaulniers wrote: > On Thu, May 25, 2017 at 04:07:08PM +0200, Paolo Bonzini wrote: >> I think we should do the fixup backwards. >> >> That is: >> >> - first do get_fpu >> >> - if the fixup is necessary, i.e. ctxt->mode < X86EMUL_MODE_PROT64, do >> fxsave into &fxstate. >> >> - then do segmented_read_std with the correct size, which is >> - offsetof(struct fxregs_state, xmm_space[16]), i.e. 416 >> if ctxt->mode == X86EMUL_MODE_PROT64 >> - offsetof(struct fxregs_state, xmm_space[8]), i.e. 288 >> if ctxt->mode < X86EMUL_MODE_PROT64 and CR4.OSFXSR=1 >> - offsetof(struct fxregs_state, xmm_space[0]), i.e. 160 >> if ctxt->mode < X86EMUL_MODE_PROT64 and CR4.OSFXSR=0 > > but we still want to do a segmented_read_std with size 512 otherwise, > correct? No, 512 is never necessary. ctxt->mode is never > X86EMUL_MODE_PROT64 (see the definition of enum x86emul_mode in arch/x86/include/asm/kvm_emulate.h. >> - then check fx_state.mxcsr >> >> - then do fxrstor > > This sounds like we conditionally do the fxsave, but then always do the > fxrstor. Is that ok? I guess the original code kind of does that as > well. Correct. They idea is that fxrstor on Linux always accesses 416 bytes, but we may have to restore only the first part for accurate emulation. The fxsave retrieves the current state so that we can leave it unmodified when we do fxrstor. >> - finally do put_fpu > > Sounds straight forward. I can see how fxsave and CR4.OSFXSR are > accessed in fxstor_fixup. Is it ok to skip those memcpy's that would > otherwise occur when calling fxrstor_fixup() (which after these changes, > we would not be)? Yes, the memcpys are replaced with a shorter segmented_read_std. Thanks, Paolo