On Thu, Feb 06, 2025 at 02:11:02PM +0000, Mark Rutland wrote:
> In non-protected KVM modes, while the guest FPSIMD/SVE/SME state is live on the
> CPU, the host's active SVE VL may differ from the guest's maximum SVE VL:
>
> * For VHE hosts, when a VM uses NV, ZCR_EL2 contains a value constrained
>   by the guest hypervisor, which may be less than or equal to that
>   guest's maximum VL.
>
>   Note: in this case the value of ZCR_EL1 is immaterial due to E2H.
>
> * For nVHE/hVHE hosts, ZCR_EL1 contains a value written by the guest,
>   which may be less than or greater than the guest's maximum VL.
>
>   Note: in this case hyp code traps host SVE usage and lazily restores
>   ZCR_EL2 to the host's maximum VL, which may be greater than the
>   guest's maximum VL.
>
> This can be the case between exiting a guest and kvm_arch_vcpu_put_fp().
> If a softirq is taken during this period and the softirq handler tries
> to use kernel-mode NEON, then the kernel will fail to save the guest's
> FPSIMD/SVE state, and will pend a SIGKILL for the current thread.
>
> This happens because kvm_arch_vcpu_ctxsync_fp() binds the guest's live
> FPSIMD/SVE state with the guest's maximum SVE VL, and
> fpsimd_save_user_state() verifies that the live SVE VL is as expected
> before attempting to save the register state:
>
> | if (WARN_ON(sve_get_vl() != vl)) {
> |         force_signal_inject(SIGKILL, SI_KERNEL, 0, 0);
> |         return;
> | }
>
> Fix this and make this a bit easier to reason about by always eagerly
> switching ZCR_EL{1,2} at hyp during guest<->host transitions. With this
> happening, there's no need to trap host SVE usage, and the nVHE/nVHVE

nit: nVHVE?

(also, note to Fuad: I think we're trapping FPSIMD/SVE from the host with
pKVM in Android, so we'll want to fix that when we take this patch via
-stable)

> __deactivate_cptr_traps() logic can be simplified enable host access to

nit: to enable

> all present FPSIMD/SVE/SME features.
>
> In protected nVHE/hVHVE modes, the host's state is always saved/restored

nit: hVHVE (something tells me these acronyms aren't particularly friendly!)

> by hyp, and the guest's state is saved prior to exit to the host, so
> from the host's PoV the guest never has live FPSIMD/SVE/SME state, and
> the host's ZCR_EL1 is never clobbered by hyp.
>
> Fixes: 8c8010d69c132273 ("KVM: arm64: Save/restore SVE state for nVHE")
> Fixes: 2e3cf82063a00ea0 ("KVM: arm64: nv: Ensure correct VL is loaded before saving SVE state")
> Signed-off-by: Mark Rutland <mark.rutland@xxxxxxx>
> Cc: stable@xxxxxxxxxxxxxxx
> Cc: Catalin Marinas <catalin.marinas@xxxxxxx>
> Cc: Fuad Tabba <tabba@xxxxxxxxxx>
> Cc: Marc Zyngier <maz@xxxxxxxxxx>
> Cc: Mark Brown <broonie@xxxxxxxxxx>
> Cc: Oliver Upton <oliver.upton@xxxxxxxxx>
> Cc: Will Deacon <will@xxxxxxxxxx>
> ---
>  arch/arm64/kvm/fpsimd.c                 | 30 ---------------
>  arch/arm64/kvm/hyp/include/hyp/switch.h | 51 +++++++++++++++++++++++++
>  arch/arm64/kvm/hyp/nvhe/hyp-main.c      | 13 +++----
>  arch/arm64/kvm/hyp/nvhe/switch.c        |  6 +--
>  arch/arm64/kvm/hyp/vhe/switch.c         |  4 ++
>  5 files changed, 63 insertions(+), 41 deletions(-)

[...]

> diff --git a/arch/arm64/kvm/hyp/include/hyp/switch.h b/arch/arm64/kvm/hyp/include/hyp/switch.h
> index 163867f7f7c52..bbec7cd38da33 100644
> --- a/arch/arm64/kvm/hyp/include/hyp/switch.h
> +++ b/arch/arm64/kvm/hyp/include/hyp/switch.h
> @@ -375,6 +375,57 @@ static inline void __hyp_sve_save_host(void)
>  			 true);
>  }
>  
> +static inline void fpsimd_lazy_switch_to_guest(struct kvm_vcpu *vcpu)
> +{
> +	u64 zcr_el1, zcr_el2;
> +
> +	if (!guest_owns_fp_regs())
> +		return;
> +
> +	if (vcpu_has_sve(vcpu)) {
> +		/* A guest hypervisor may restrict the effective max VL. */
> +		if (vcpu_has_nv(vcpu) && !is_hyp_ctxt(vcpu))
> +			zcr_el2 = __vcpu_sys_reg(vcpu, ZCR_EL2);
> +		else
> +			zcr_el2 = vcpu_sve_max_vq(vcpu) - 1;
> +
> +		write_sysreg_el2(zcr_el2, SYS_ZCR);
> +
> +		zcr_el1 = __vcpu_sys_reg(vcpu, vcpu_sve_zcr_elx(vcpu));
> +		write_sysreg_el1(zcr_el1, SYS_ZCR);
> +	}
> +}
> +
> +static inline void fpsimd_lazy_switch_to_host(struct kvm_vcpu *vcpu)
> +{
> +	u64 zcr_el1, zcr_el2;
> +
> +	if (!guest_owns_fp_regs())
> +		return;
> +
> +	if (vcpu_has_sve(vcpu)) {
> +		zcr_el1 = read_sysreg_el1(SYS_ZCR);
> +		__vcpu_sys_reg(vcpu, vcpu_sve_zcr_elx(vcpu)) = zcr_el1;
> +
> +		/*
> +		 * The guest's state is always saved using the guest's max VL.
> +		 * Ensure that the host has the guest's max VL active such that
> +		 * the host can save the guest's state lazily, but don't
> +		 * artificially restrict the host to the guest's max VL.
> +		 */
> +		if (has_vhe()) {
> +			zcr_el2 = vcpu_sve_max_vq(vcpu) - 1;
> +			write_sysreg_el2(zcr_el2, SYS_ZCR);
> +		} else {
> +			zcr_el2 = sve_vq_from_vl(kvm_host_sve_max_vl) - 1;
> +			write_sysreg_el2(zcr_el2, SYS_ZCR);
> +
> +			zcr_el1 = vcpu_sve_max_vq(vcpu) - 1;
> +			write_sysreg_el1(zcr_el1, SYS_ZCR);

Do we need an ISB before this to make sure that the CPTR traps have been
deactivated properly?

Will