On Wed, Mar 04, 2020 at 07:45:38AM -0800, Sean Christopherson wrote:
> On Wed, Mar 04, 2020 at 11:18:15PM +0800, Yang Weijiang wrote:
> > On Tue, Mar 03, 2020 at 02:28:27PM -0800, Sean Christopherson wrote:
> > > > @@ -1886,6 +1976,26 @@ static int vmx_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
> > > >  		else
> > > >  			msr_info->data = vmx->pt_desc.guest.addr_a[index / 2];
> > > >  		break;
> > > > +	case MSR_IA32_S_CET:
> > > > +		if (!cet_ctl_access_allowed(vcpu, msr_info))
> > > > +			return 1;
> > > > +		msr_info->data = vmcs_readl(GUEST_S_CET);
> > > > +		break;
> > > > +	case MSR_IA32_INT_SSP_TAB:
> > > > +		if (!cet_ssp_access_allowed(vcpu, msr_info))
> > > > +			return 1;
> > > > +		msr_info->data = vmcs_readl(GUEST_INTR_SSP_TABLE);
> > > > +		break;
> > > > +	case MSR_IA32_U_CET:
> > > > +		if (!cet_ctl_access_allowed(vcpu, msr_info))
> > > > +			return 1;
> > > > +		rdmsrl(MSR_IA32_U_CET, msr_info->data);
> > > > +		break;
> > > > +	case MSR_IA32_PL0_SSP ... MSR_IA32_PL3_SSP:
> > > > +		if (!cet_ssp_access_allowed(vcpu, msr_info))
> > > > +			return 1;
> > > > +		rdmsrl(msr_info->index, msr_info->data);
> > >
> > > Ugh, thought of another problem.  If a SoftIRQ runs after an IRQ it can
> > > load the kernel FPU state.  So for all the XSAVES MSRs we'll need a helper
> > > similar to vmx_write_guest_kernel_gs_base(), except XSAVES has to be even
> > > more restrictive and disable IRQs entirely.  E.g.
> > >
> > > static void vmx_get_xsave_msr(struct msr_data *msr_info)
> > > {
> > > 	local_irq_disable();
> > > 	if (test_thread_flag(TIF_NEED_FPU_LOAD))
> > > 		switch_fpu_return();
> > > 	rdmsrl(msr_info->index, msr_info->data);
> > > 	local_irq_enable();
> > > }
> >
> > In this case, would a SoftIRQ destroy the vcpu->arch.guest.fpu state which
> > had been restored to the XSAVES MSRs we were accessing?
>
> Doing kernel_fpu_begin() from a softirq would swap guest.fpu out of the
> CPU's registers.  It sets TIF_NEED_FPU_LOAD to mark the task as needing to
> reload its FPU state prior to returning to userspace.  So it doesn't
> destroy it per se.  The result is that KVM would read/write the CET MSRs
> after they're loaded from the kernel's FPU state instead of reading the
> MSRs loaded from the guest's FPU state.
>

OK, will wrap the access code with a helper, thank you!

> > So should we restore guest.fpu again?  In a previous patch, we restored
> > guest.fpu before accessing the XSAVES MSRs.
>
> There are three different FPU states:
>
>  - kernel
>  - userspace
>  - guest
>
> RDMSR/WRMSR for the CET MSRs need to run while the guest.fpu state is loaded
> into the CPU registers[1].  At the beginning of the syscall from userspace,
> i.e. the vCPU ioctl(), the task's FPU state[2] holds userspace FPU state.
> Patch 6/7 swaps out the userspace state and loads the guest state.
>
> But, if a softirq runs between kvm_load_guest_fpu() and now, and executes
> kernel_fpu_begin(), it will swap the guest state (out of the CPU registers)
> and load the kernel state (into the CPU registers).  The actual RDMSR/WRMSR
> needs to ensure the guest state is still loaded by checking and handling
> TIF_NEED_FPU_LOAD.
>
> [1] An alternative to doing switch_fpu_return() on TIF_NEED_FPU_LOAD would
>     be to calculate the offset into the xsave area and read/write directly
>     to/from memory.  But IMO that's unnecessary complexity, as the guest's
>     fpu state still needs to be reloaded before re-entering the guest, e.g.
>     if vmx_{g,s}et_msr() is invoked on a {RD,WR}MSR intercept, while loading
>     or saving MSR state from userspace isn't a hot path.
>
> [2] I worded this to say "task's FPU state" because it's also possible the
>     CPU registers hold kernel state at the beginning of the vCPU ioctl(),
>     e.g. because of softirq.

It's clear to me, thanks for the explanation.
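For the record, here is a rough, untested sketch of the wrappers I have in
mind, following your suggestion above.  vmx_get_xsave_msr() is taken from
your example; vmx_set_xsave_msr() is just the symmetric name I'm assuming
for the write side, not anything final:

/*
 * Make sure the guest's FPU state is resident in the CPU registers before
 * touching the XSAVES-managed CET MSRs.  A softirq may have executed
 * kernel_fpu_begin() after kvm_load_guest_fpu() and set TIF_NEED_FPU_LOAD,
 * so reload the task's FPU state and keep IRQs off across the MSR access.
 */
static void vmx_get_xsave_msr(struct msr_data *msr_info)
{
	local_irq_disable();
	if (test_thread_flag(TIF_NEED_FPU_LOAD))
		switch_fpu_return();
	rdmsrl(msr_info->index, msr_info->data);
	local_irq_enable();
}

/* Write-side counterpart, same TIF_NEED_FPU_LOAD handling. */
static void vmx_set_xsave_msr(struct msr_data *msr_info)
{
	local_irq_disable();
	if (test_thread_flag(TIF_NEED_FPU_LOAD))
		switch_fpu_return();
	wrmsrl(msr_info->index, msr_info->data);
	local_irq_enable();
}

Then the XSAVES-managed cases in vmx_get_msr() would call the helper
instead of raw rdmsrl(), e.g.:

	case MSR_IA32_PL0_SSP ... MSR_IA32_PL3_SSP:
		if (!cet_ssp_access_allowed(vcpu, msr_info))
			return 1;
		vmx_get_xsave_msr(msr_info);
		break;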