On Wed, Oct 30, 2024 at 07:58:36PM +0000, Mark Brown wrote: > We intend that signal handlers are entered with PSTATE.{SM,ZA}={0,0}. > The logic for this in setup_return() manipulates the saved state and > live CPU state in an unsafe manner, and consequently, when a task enters > a signal handler: Looking at this, I think there's a bigger question as to what we actually intend here; explanation below, with two possible answers at the end. [...] > +/* > + * Called by the signal handling code when preparing current to enter > + * a signal handler. Currently this only needs to take care of exiting > + * streaming mode and clearing ZA on SME systems. > + */ > +void fpsimd_enter_sighandler(void) > +{ > + if (!system_supports_sme()) > + return; > + > + get_cpu_fpsimd_context(); > + > + if (test_thread_flag(TIF_FOREIGN_FPSTATE)) { > + /* Exiting streaming mode zeros the FPSIMD state */ > + if (current->thread.svcr & SVCR_SM_MASK) { > + memset(¤t->thread.uw.fpsimd_state, 0, > + sizeof(current->thread.uw.fpsimd_state)); > + current->thread.fp_type = FP_STATE_FPSIMD; > + } > + > + current->thread.svcr &= ~(SVCR_ZA_MASK | > + SVCR_SM_MASK); > + > + /* Ensure any copies on other CPUs aren't reused */ > + fpsimd_flush_task_state(current); > + } else { > + /* The register state is current, just update it. */ > + sme_smstop(); > + } I don't think that the foreign / non-foreign cases are equivalent. In the foreign case we clear the entire fpsimd_state structure, i.e. all of: struct user_fpsimd_state { __uint128_t vregs[32]; __u32 fpsr; __u32 fpcr; __u32 __reserved[2]; }; Looking at the latest ARM ARM (ARM DDI 0487K.a): https://developer.arm.com/documentation/ddi0487/ka/ ... the descriptions for FPSR and FPCR say nothing about exiting streaming mode, and rule RKFRQZ says: | When the Effective value of PSTATE.SM is changed by any method from 1 | to 0, an exit from Streaming SVE mode is performed, and in the | newly-entered mode, all implemented bits of the SVE scalable vector | registers, SVE predicate registers, and FFR, are set to zero. ... which doesn't say anything about FPSR or FPCR, so from the ARM ARM it doesn't look like SMSTOP will clobber those. Looking at the latest "Arm A-profile Architecture Registers" document (ARM DDI 061 2024-09): https://developer.arm.com/documentation/ddi0601/2024-09/ ... the description of FPCR says nothing about exiting streaming mode, so it appears to be preserved. ... the description of FPMR (which is not in the latest ARM ARM) says: | On entry to or exit from Streaming SVE mode, FPMR is set to 0. ... so we'd need code to clobber that. ... and the description of FPSR says: | On entry to or exit from Streaming SVE mode, FPSR.{IOC, DZC, OFC, UFC, | IXC, IDC, QC} are set to 1 and the remaining bits are set to 0. ... so we'd need something more elaborate. AFAICT either: (a) Our intended ABI is that signal handlers are entered as-if an SMSTOP is executed to exit streaming mode and disable ZA storage. In this case we'll need a more elaborate sequence here to simulate that effect. (b) Our intended ABI is that signal handlers are entered with PSTATE.{SM,ZA} cleared, FPSR cleared, FPCR cleared, and FPMR preserved. In this case we cannot use SMSTOP in the non-foreign case, and it would be simplest to always save the value back to memory and manipulate it there. Our documentation in Documentation/arch/arm64/sme.rst says: | Signal handlers are invoked with streaming mode and ZA disabled. ... and doesn't mention FPCR/FPMR/FPSR, so we could go either way, though I suspect we intended case (a) ? Mark.