On Wed, Sep 13, 2017 at 07:33:25AM -0700, Catalin Marinas wrote: > On Thu, Aug 31, 2017 at 06:00:43PM +0100, Dave P Martin wrote: > > +/* > > + * Handle SVE state across fork(): > > + * > > + * dst and src must not end up with aliases of the same sve_state. > > + * Because a task cannot fork except in a syscall, we can discard SVE > > + * state for dst here, so long as we take care to retain the FPSIMD > > + * subset of the state if SVE is in use. Reallocation of the SVE state > > + * will be deferred until dst tries to use SVE. > > + */ > > +void fpsimd_dup_sve(struct task_struct *dst, struct task_struct const *src) > > +{ > > + if (test_and_clear_tsk_thread_flag(dst, TIF_SVE)) { > > + WARN_ON(dst->mm && !in_syscall(task_pt_regs(dst))); > > + sve_to_fpsimd(dst); > > + } > > + > > + dst->thread.sve_state = NULL; > > +} > > I first thought the thread flags are not visible in dst yet since > dup_task_struct() calls arch_dup_task_struct() before > setup_thread_stack(). However, at the end of the last year we enabled > CONFIG_THREAD_INFO_IN_TASK_STRUCT. But I don't particularly like relying > on this. Hmmm, I see your point, but there are some sequencing issues here. > Anyway, IIUC we don't need sve_to_fpsimd() here. The > arch_dup_task_struct() already called fpsimd_preserve_current_state() I consider SVE discard as an optional side effect of task_fpsimd_save(), not something that is guaranteed to happen -- the decision about whether to do so may become more intelligent later on. So, for src, we may discard SVE (because syscall), but for dst we must NULL .sve_state (and therefore clear TIF_SVE) simply to avoid aliasing of src->sve_state and dst->sve_state. The latter requires operating on the thread_flags of dst. I'll need to check whether there's another suitable hook for updating the thread flags of dst, if we aren't confident that they will always have been initialised by the time arch_dup_task_struct() is called. Either way, there would be an intra-thread ordering requirement between the task_struct and thread_info updates here, _if_ dst were schedulable: dst:TIF_SVE must be cleared before dst->sve_state is NULLed. dst is not schedulable until fork is done though, so maybe this doesn't really matter... > for src, so the FPSIMD state (which we care about) is transferred during > the *dst = *src assignment. So you'd only need the last statement, > possibly with a different function name like fpsimd_erase_sve (and maybe > make the function static inline in the header). Not quite: TIF_SVE must be cleared so that a context switch or kernel_neon_begin() after dst is scheduled doesn't try to save state in the (NULL) dst->sve_state. > > [...] > > int arch_dup_task_struct(struct task_struct *dst, struct task_struct *src) > > @@ -246,6 +247,9 @@ int arch_dup_task_struct(struct task_struct *dst, struct task_struct *src) > > if (current->mm) > > fpsimd_preserve_current_state(); > > *dst = *src; > > + > > + fpsimd_dup_sve(dst, src); > > + > > return 0; > > } Cheers ---Dave