This patch series attempts to clarify the tracking of which set of
floating point registers we save on systems supporting SVE, particularly
with reference to KVM, and then uses the results of this clarification
to improve the performance of simple syscalls where we return directly
to userspace in cases where userspace is using SVE.

At present we track which register state is active by using the TIF_SVE
flag for the current task, which also controls if userspace is able to
use SVE. This is reasonably straightforward if limiting, but for KVM it
gets a bit hairy since we may have guest state loaded in registers. This
results in KVM modifying TIF_SVE for the VMM task while the guest is
running, which doesn't entirely help make things easy to follow.

To help make things clearer the series changes things so that in
addition to TIF_SVE we explicitly track both the type of registers that
are currently saved in the task struct and the type of registers that we
should save when we do so. TIF_SVE then solely controls if userspace can
use SVE without trapping; it has no function for KVM guests and we can
remove the code for managing it from KVM.

The refactoring to add the separate tracking is initially done by adding
the new state together with checks that the state corresponds to
expectations when we look at it, before subsequent patches make use of
the separated state. The goal is both to split out the more repetitive
bits of the change and to make it easier to debug any problems that
might arise.

With the state tracked separately we then start to optimise the
performance of syscalls when the process is using SVE. Currently every
syscall disables SVE for userspace, which means that we need to trap to
EL1 again on the next SVE instruction, flush the SVE registers, and
reenable SVE for EL0, creating overhead for tasks that mix SVE and
syscalls.
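
As a rough illustration of the separate tracking described above, here
is a minimal user-space sketch in C. It is not code from the series:
struct task_model, struct last_state_model, the FP_MODEL_* values and
the model_*() helpers are all made-up names, and only the split between
TIF_SVE and the recorded register type is taken from the description.

```c
#include <assert.h>
#include <stdbool.h>

/* User-space model only; every name here is hypothetical, not the kernel's. */
enum fp_type_model { FP_MODEL_FPSIMD, FP_MODEL_SVE };

struct task_model {
	bool tif_sve;                  /* may userspace use SVE without trapping? */
	enum fp_type_model saved_type; /* format of the copy held in the task struct */
};

struct last_state_model {
	enum fp_type_model to_save;    /* format to use the next time we save */
};

/* Saving records the format that was written, independent of tif_sve. */
static inline void model_save(struct task_model *t,
			      const struct last_state_model *last)
{
	/* Sanity check that the requested format is one the model knows. */
	assert(last->to_save == FP_MODEL_FPSIMD ||
	       last->to_save == FP_MODEL_SVE);
	t->saved_type = last->to_save;
}

/* Loading follows the recorded format rather than inspecting tif_sve. */
static inline enum fp_type_model model_load(const struct task_model *t)
{
	return t->saved_type;
}

/* The flag decides only whether the next SVE use from userspace traps. */
static inline bool model_sve_traps(const struct task_model *t)
{
	return !t->tif_sve;
}
```

In this model clearing tif_sve after a save leaves model_load()
returning the recorded format, which is the property that lets the flag
be reserved for controlling userspace traps.
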
We build on the above refactoring to eliminate this overhead for simple
syscalls which return directly to userspace by keeping SVE enabled
unless we need to reload the state from memory, meaning that if syscalls
do not block we avoid the overhead of trapping to EL1 again on next use
of SVE.

v3:
 - Rebase onto my series "arm64/sme: SME related fixes" since there is a
   direct dependency on the signal fix and testing is much easier with
   the bug fixes rolled in.
 - s/type/fp_type/ in struct fpsimd_last_state_struct.
 - Add comment about the V register storage being ignored when data is
   stored in SVE format.
 - Move dropping of special casing for FPSIMD register state in SME into
   a separate patch later in the series.
 - Simplify logic in task_fpsimd_load().
 - Remove support for leaving the SVE state not shared with FPSIMD
   untouched, keep the unconditional flush.
v2:
 - Rebase onto v5.19-rc3.
 - Don't warn when restoring streaming mode SVE without TIF_SVE.

Mark Brown (7):
  KVM: arm64: Discard any SVE state when entering KVM guests
  arm64/fpsimd: Track the saved FPSIMD state type separately to TIF_SVE
  arm64/fpsimd: Have KVM explicitly say which FP registers to save
  arm64/fpsimd: Stop using TIF_SVE to manage register saving in KVM
  arm64/fpsimd: Load FP state based on recorded data type
  arm64/fpsimd: SME no longer requires SVE register state
  arm64/sve: Leave SVE enabled on syscall if we don't context switch

 arch/arm64/include/asm/fpsimd.h    |   4 +-
 arch/arm64/include/asm/kvm_host.h  |   1 +
 arch/arm64/include/asm/processor.h |   7 ++
 arch/arm64/kernel/fpsimd.c         | 137 +++++++++++++++++++++++------
 arch/arm64/kernel/process.c        |   2 +
 arch/arm64/kernel/ptrace.c         |   5 +-
 arch/arm64/kernel/signal.c         |   7 +-
 arch/arm64/kernel/syscall.c        |  19 ++--
 arch/arm64/kvm/fpsimd.c            |  16 ++--
 9 files changed, 148 insertions(+), 50 deletions(-)

base-commit: bb357a5e4232401e587da41329d8de5b42acd10e
-- 
2.30.2

_______________________________________________
kvmarm mailing list
kvmarm@xxxxxxxxxxxxxxxxxxxxx
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm