On Wed, May 23, 2018 at 01:48:12PM +0200, Christoffer Dall wrote: > On Tue, May 22, 2018 at 05:05:08PM +0100, Dave Martin wrote: > > Currently the FPSIMD handling code uses the condition task->mm == > > NULL as a hint that task has no FPSIMD register context. > > > > The ->mm check is only there to filter out tasks that cannot > > possibly have FPSIMD context loaded, for optimisation purposes. > > Also, TIF_FOREIGN_FPSTATE must always be checked anyway before > > saving FPSIMD context back to memory. For these reasons, the ->mm > > checks are not useful, providing that that TIF_FOREIGN_FPSTATE is Hmmm, "that that". I'll fix that. > > maintained in a consistent way for kernel threads. > > > > This is true by construction however: TIF_FOREIGN_FPSTATE is never > > cleared except when returning to userspace or returning from a > > signal: thus, for a true kernel thread no FPSIMD context is ever > > loaded, TIF_FOREIGN_FPSTATE will remain set and no context will > > ever be saved. > > I don't understand this construction proof; from looking at the patch > below it is not obvious to me why fpsimd_thread_switch() can never have > !wrong_task && !wrong_cpu and therefore clear TIF_FOREIGN_FPSTATE for a > kernel thread? Looking at this again, I think it is poorly worded. This patch aims to make it true by construction, but it isn't prior to the patch. I'm tempted to delete the paragraph: the assertion of both untrue and not the best way to justify that this patch works. How about: -8<- The context switch logic already isolates user threads from each other. This, it is sufficient for isolating user threads from the kernel, since the goal either way is to ensure that code executing in userspace cannot see any FPSIMD state except its own. Thus, there is no special property of kernel threads that we care about except that it is pointless to save or load FPSIMD register state for them. At worst, the removal of all the kernel thread special cases by this patch would thus spuriously load and save state for kernel threads when unnecessary. But the context switch logic is already deliberately optimised to defer reloads of the regs until ret_to_user (or sigreturn as a special case), which kernel threads by definition never reach. ->8- As an aside, I notice that we allow thread.fpsimd_cpu to be initialised to 0 for the init task. This means that the wrong_cpu check may yield false for the init task when it shouldn't, because the init task's FPSIMD state (which doesn't logically exist) is never loaded anywhere. But the wrong_task check will always yield true for the init task for the same reason, so this is just an inconsistency in the code rather than a bug AFAICT. copy_thread() calls fpsimd_flush_task_state() to make sure that wrong_cpu is initially true for new tasks. We should do the same for the init task by adding .fpsimd_cpu = NR_CPUS, to INIT_THREAD. Cheers ---Dave > > > Thanks, > -Christoffer > > > > > This patch removes the redundant checks and special-case code. > > > > Signed-off-by: Dave Martin <Dave.Martin@xxxxxxx> > > Cc: Catalin Marinas <catalin.marinas@xxxxxxx> > > Cc: Will Deacon <will.deacon@xxxxxxx> > > Cc: Ard Biesheuvel <ard.biesheuvel@xxxxxxxxxx> > > > > --- > > > > Changes since v9: > > > > * New patch. Introduced during debugging, since the ->mm checks > > appear bogus and/or redundant, so are likely to be hiding or > > causing bugs. > > --- > > arch/arm64/include/asm/thread_info.h | 1 + > > arch/arm64/kernel/fpsimd.c | 38 ++++++++++++------------------------ > > 2 files changed, 14 insertions(+), 25 deletions(-) > > > > diff --git a/arch/arm64/include/asm/thread_info.h b/arch/arm64/include/asm/thread_info.h > > index 740aa03c..a2ac914 100644 > > --- a/arch/arm64/include/asm/thread_info.h > > +++ b/arch/arm64/include/asm/thread_info.h > > @@ -47,6 +47,7 @@ struct thread_info { > > > > #define INIT_THREAD_INFO(tsk) \ > > { \ > > + .flags = _TIF_FOREIGN_FPSTATE, \ > > .preempt_count = INIT_PREEMPT_COUNT, \ > > .addr_limit = KERNEL_DS, \ > > } > > diff --git a/arch/arm64/kernel/fpsimd.c b/arch/arm64/kernel/fpsimd.c > > index 3aa100a..1222491 100644 > > --- a/arch/arm64/kernel/fpsimd.c > > +++ b/arch/arm64/kernel/fpsimd.c > > @@ -891,31 +891,21 @@ asmlinkage void do_fpsimd_exc(unsigned int esr, struct pt_regs *regs) > > > > void fpsimd_thread_switch(struct task_struct *next) > > { > > + bool wrong_task, wrong_cpu; > > + > > if (!system_supports_fpsimd()) > > return; > > - /* > > - * Save the current FPSIMD state to memory, but only if whatever is in > > - * the registers is in fact the most recent userland FPSIMD state of > > - * 'current'. > > - */ > > - if (current->mm) > > - fpsimd_save(); > > > > - if (next->mm) { > > - /* > > - * If we are switching to a task whose most recent userland > > - * FPSIMD state is already in the registers of *this* cpu, > > - * we can skip loading the state from memory. Otherwise, set > > - * the TIF_FOREIGN_FPSTATE flag so the state will be loaded > > - * upon the next return to userland. > > - */ > > - bool wrong_task = __this_cpu_read(fpsimd_last_state.st) != > > + /* Save unsaved fpsimd state, if any: */ > > + fpsimd_save(); > > + > > + /* Fix up TIF_FOREIGN_FPSTATE to correctly describe next's state: */ > > + wrong_task = __this_cpu_read(fpsimd_last_state.st) != > > &next->thread.uw.fpsimd_state; > > - bool wrong_cpu = next->thread.fpsimd_cpu != smp_processor_id(); > > + wrong_cpu = next->thread.fpsimd_cpu != smp_processor_id(); > > > > - update_tsk_thread_flag(next, TIF_FOREIGN_FPSTATE, > > - wrong_task || wrong_cpu); > > - } > > + update_tsk_thread_flag(next, TIF_FOREIGN_FPSTATE, > > + wrong_task || wrong_cpu); > > } > > > > void fpsimd_flush_thread(void) > > @@ -1120,9 +1110,8 @@ void kernel_neon_begin(void) > > > > __this_cpu_write(kernel_neon_busy, true); > > > > - /* Save unsaved task fpsimd state, if any: */ > > - if (current->mm) > > - fpsimd_save(); > > + /* Save unsaved fpsimd state, if any: */ > > + fpsimd_save(); > > > > /* Invalidate any task state remaining in the fpsimd regs: */ > > fpsimd_flush_cpu_state(); > > @@ -1244,8 +1233,7 @@ static int fpsimd_cpu_pm_notifier(struct notifier_block *self, > > { > > switch (cmd) { > > case CPU_PM_ENTER: > > - if (current->mm) > > - fpsimd_save(); > > + fpsimd_save(); > > fpsimd_flush_cpu_state(); > > break; > > case CPU_PM_EXIT: > > -- > > 2.1.4 [...] > _______________________________________________ > kvmarm mailing list > kvmarm@xxxxxxxxxxxxxxxxxxxxx > https://lists.cs.columbia.edu/mailman/listinfo/kvmarm _______________________________________________ kvmarm mailing list kvmarm@xxxxxxxxxxxxxxxxxxxxx https://lists.cs.columbia.edu/mailman/listinfo/kvmarm