Re: [PATCH 8/8] KVM: arm64: Eagerly switch ZCR_EL{1,2}

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, 04 Feb 2025 15:21:00 +0000,
Mark Rutland <mark.rutland@xxxxxxx> wrote:
> 
> In non-protected KVM modes, while the guest FPSIMD/SVE/SME state is live on the
> CPU, the host's active SVE VL may differ from the guest's maximum SVE VL:
> 
> * For VHE hosts, when a VM uses NV, ZCR_EL2 contains a value constrained
>   by the guest hypervisor, which may be less than or equal to that
>   guest's maximum VL.
> 
>   Note: in this case the value of ZCR_EL1 is immaterial due to E2H.
> 
> * For nVHE/hVHE hosts, ZCR_EL1 contains a value written by the guest,
>   which may be less than or greater than the guest's maximum VL.
> 
>   Note: in this case hyp code traps host SVE usage and lazily restores
>   ZCR_EL2 to the host's maximum VL, which may be greater than the
>   guest's maximum VL.
> 
> This can be the case between exiting a guest and kvm_arch_vcpu_put_fp().
> If a softirq is taken during this period and the softirq handler tries
> to use kernel-mode NEON, then the kernel will fail to save the guest's
> FPSIMD/SVE state, and will pend a SIGKILL for the current thread.
> 
> This happens because kvm_arch_vcpu_ctxsync_fp() binds the guest's live
> FPSIMD/SVE state with the guest's maximum SVE VL, and
> fpsimd_save_user_state() verifies that the live SVE VL is as expected
> before attempting to save the register state:
> 
> | if (WARN_ON(sve_get_vl() != vl)) {
> |         force_signal_inject(SIGKILL, SI_KERNEL, 0, 0);
> |         return;
> | }
> 
> Fix this and make this a bit easier to reason about by always eagerly
> switching ZCR_EL{1,2} at hyp during guest<->host transitions. With this
> happening, there's no need to trap host SVE usage, and the nVHE/nVHVE
> __deactivate_cptr_traps() logic can be simplified enable host access to
> all present FPSIMD/SVE/SME features.
> 
> In protected nVHE/hVHVE modes, the host's state is always saved/restored
> by hyp, and the guest's state is saved prior to exit to the host, so
> from the host's PoV the guest never has live FPSIMD/SVE/SME state, and
> the host's ZCR_EL1 is never clobbered by hyp.
> 
> Fixes: 8c8010d69c132273 ("KVM: arm64: Save/restore SVE state for nVHE")
> Fixes: 2e3cf82063a00ea0 ("KVM: arm64: nv: Ensure correct VL is loaded before saving SVE state")
> Signed-off-by: Mark Rutland <mark.rutland@xxxxxxx>
> Cc: stable@xxxxxxxxxxxxxxx
> Cc: Catalin Marinas <catalin.marinas@xxxxxxx>
> Cc: Fuad Tabba <tabba@xxxxxxxxxx>
> Cc: Marc Zyngier <maz@xxxxxxxxxx>
> Cc: Mark Brown <broonie@xxxxxxxxxx>
> Cc: Oliver Upton <oliver.upton@xxxxxxxxx>
> Cc: Will Deacon <will@xxxxxxxxxx>
> ---
>  arch/arm64/kvm/fpsimd.c                 | 30 ---------------
>  arch/arm64/kvm/hyp/include/hyp/switch.h | 51 +++++++++++++++++++++++++
>  arch/arm64/kvm/hyp/nvhe/hyp-main.c      | 13 +++----
>  arch/arm64/kvm/hyp/nvhe/switch.c        |  6 +--
>  arch/arm64/kvm/hyp/vhe/switch.c         |  4 ++
>  5 files changed, 63 insertions(+), 41 deletions(-)
> 
> diff --git a/arch/arm64/kvm/fpsimd.c b/arch/arm64/kvm/fpsimd.c
> index f64724197958e..3cbb999419af7 100644
> --- a/arch/arm64/kvm/fpsimd.c
> +++ b/arch/arm64/kvm/fpsimd.c
> @@ -136,36 +136,6 @@ void kvm_arch_vcpu_put_fp(struct kvm_vcpu *vcpu)
>  	local_irq_save(flags);
>  
>  	if (guest_owns_fp_regs()) {
> -		if (vcpu_has_sve(vcpu)) {
> -			u64 zcr = read_sysreg_el1(SYS_ZCR);
> -
> -			/*
> -			 * If the vCPU is in the hyp context then ZCR_EL1 is
> -			 * loaded with its vEL2 counterpart.
> -			 */
> -			__vcpu_sys_reg(vcpu, vcpu_sve_zcr_elx(vcpu)) = zcr;
> -
> -			/*
> -			 * Restore the VL that was saved when bound to the CPU,
> -			 * which is the maximum VL for the guest. Because the
> -			 * layout of the data when saving the sve state depends
> -			 * on the VL, we need to use a consistent (i.e., the
> -			 * maximum) VL.
> -			 * Note that this means that at guest exit ZCR_EL1 is
> -			 * not necessarily the same as on guest entry.
> -			 *
> -			 * ZCR_EL2 holds the guest hypervisor's VL when running
> -			 * a nested guest, which could be smaller than the
> -			 * max for the vCPU. Similar to above, we first need to
> -			 * switch to a VL consistent with the layout of the
> -			 * vCPU's SVE state. KVM support for NV implies VHE, so
> -			 * using the ZCR_EL1 alias is safe.
> -			 */
> -			if (!has_vhe() || (vcpu_has_nv(vcpu) && !is_hyp_ctxt(vcpu)))
> -				sve_cond_update_zcr_vq(vcpu_sve_max_vq(vcpu) - 1,
> -						       SYS_ZCR_EL1);
> -		}
> -
>  		/*
>  		 * Flush (save and invalidate) the fpsimd/sve state so that if
>  		 * the host tries to use fpsimd/sve, it's not using stale data
> diff --git a/arch/arm64/kvm/hyp/include/hyp/switch.h b/arch/arm64/kvm/hyp/include/hyp/switch.h
> index 3e34ade7e675a..0149a1f6fe0a0 100644
> --- a/arch/arm64/kvm/hyp/include/hyp/switch.h
> +++ b/arch/arm64/kvm/hyp/include/hyp/switch.h
> @@ -375,6 +375,57 @@ static inline void __hyp_sve_save_host(void)
>  			 true);
>  }
>  
> +static inline void fpsimd_lazy_switch_to_guest(struct kvm_vcpu *vcpu)
> +{
> +	u64 zcr_el1, zcr_el2;
> +
> +	if (!guest_owns_fp_regs())
> +		return;
> +
> +	if (vcpu_has_sve(vcpu)) {
> +		/* A guest hypervisor may restrict the effective max VL. */
> +		if (vcpu_has_nv(vcpu) && !is_hyp_ctxt(vcpu))
> +			zcr_el2 = __vcpu_sys_reg(vcpu, ZCR_EL2);
> +		else
> +			zcr_el2 = vcpu_sve_max_vq(vcpu) - 1;
> +
> +		sve_cond_update_zcr_vq(zcr_el2, SYS_ZCR_EL2);

Not a big deal, but I though I'd mention it here: Using ZCR_EL2 (or
any other register using the _EL2 suffix) is a source of expensive
traps with NV. We're much better off using the _EL1 accessor if we are
running VHE, as this will involve no trap at all.

nVHE will of course trap, but using nVHE with SVE under NV is not
something I'm prepared to give a damn about.

> +
> +		zcr_el1 = __vcpu_sys_reg(vcpu, vcpu_sve_zcr_elx(vcpu));
> +		write_sysreg_el1(zcr_el1, SYS_ZCR);
> +	}
> +}
> +
> +static inline void fpsimd_lazy_switch_to_host(struct kvm_vcpu *vcpu)
> +{
> +	u64 zcr_el1, zcr_el2;
> +
> +	if (!guest_owns_fp_regs())
> +		return;
> +
> +	if (vcpu_has_sve(vcpu)) {
> +		zcr_el1 = read_sysreg_el1(SYS_ZCR);
> +		__vcpu_sys_reg(vcpu, vcpu_sve_zcr_elx(vcpu)) = zcr_el1;
> +
> +		/*
> +		 * The guest's state is always saved using the guest's max VL.
> +		 * Ensure that the host has the guest's max VL active such that
> +		 * the host can save the guest's state lazily, but don't
> +		 * artificially restrict the host to the guest's max VL.
> +		 */
> +		if (has_vhe()) {
> +			zcr_el2 = vcpu_sve_max_vq(vcpu) - 1;
> +			sve_cond_update_zcr_vq(zcr_el2, SYS_ZCR_EL2);

Same thing here.

No need to respin on this account. We can always add another patch on
top if there is nothing else to amend (still browsing...).

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.




[Index of Archives]     [Linux Kernel]     [Kernel Development Newbies]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite Hiking]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux