Re: [PATCH v10 07/18] arm64: fpsimd: Eliminate task->mm checks

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, May 24, 2018 at 10:50:56AM +0100, Dave Martin wrote:
> On Thu, May 24, 2018 at 10:33:50AM +0200, Christoffer Dall wrote:
> > On Wed, May 23, 2018 at 04:03:37PM +0100, Dave Martin wrote:
> > > On Wed, May 23, 2018 at 03:56:57PM +0100, Catalin Marinas wrote:
> > > > On Wed, May 23, 2018 at 02:31:59PM +0100, Dave P Martin wrote:
> > > > > On Wed, May 23, 2018 at 01:48:12PM +0200, Christoffer Dall wrote:
> > > > > > On Tue, May 22, 2018 at 05:05:08PM +0100, Dave Martin wrote:
> > > > > > > This is true by construction however: TIF_FOREIGN_FPSTATE is never
> > > > > > > cleared except when returning to userspace or returning from a
> > > > > > > signal: thus, for a true kernel thread no FPSIMD context is ever
> > > > > > > loaded, TIF_FOREIGN_FPSTATE will remain set and no context will
> > > > > > > ever be saved.
> > > > > > 
> > > > > > I don't understand this construction proof; from looking at the patch
> > > > > > below it is not obvious to me why fpsimd_thread_switch() can never have
> > > > > > !wrong_task && !wrong_cpu and therefore clear TIF_FOREIGN_FPSTATE for a
> > > > > > kernel thread?
> > > > > 
> > > > > Looking at this again, I think it is poorly worded.  This patch aims to
> > > > > make it true by construction, but it isn't prior to the patch.
> > > > > 
> > > > > I'm tempted to delete the paragraph: the assertion of both untrue and
> > > > > not the best way to justify that this patch works.
> > > > > 
> > > > > 
> > > > > How about:
> > > > > 
> > > > > -8<-
> > > > > 
> > > > > The context switch logic already isolates user threads from each other.
> > > > > This, it is sufficient for isolating user threads from the kernel,
> > 
> > s/This/Thus/ ?
> > 
> > I don't understand what 'it' refers to here?
> > 
> > > > > since the goal either way is to ensure that code executing in userspace
> > > > > cannot see any FPSIMD state except its own.  Thus, there is no special
> > > > > property of kernel threads that we care about except that it is
> > > > > pointless to save or load FPSIMD register state for them.
> > 
> > Actually, I'm not really sure what this paragraph is getting at.
> 
> Reading this again, I don't think the paragraph adds much useful.
> 
> So I propose deleting that too.
> 
> > > > > 
> > > > > At worst, the removal of all the kernel thread special cases by this
> > > > > patch would thus spuriously load and save state for kernel threads when
> > > > > unnecessary.
> > > > > 
> > > > > But the context switch logic is already deliberately optimised to defer
> > > > > reloads of the regs until ret_to_user (or sigreturn as a special case),
> > > > > which kernel threads by definition never reach.
> > > > > 
> > > > > ->8-
> > > > 
> > > > The "at worst" paragraph makes it look like it could happen (at least
> > > > until you reach the last paragraph). Maybe you can just say that
> > > > wrong_task and wrong_cpu (with the fpsimd_cpu = NR_CPUS addition) are
> > > > always true for kernel threads. You should probably mention this in a
> > > > comment in the code as well.
> > > 
> > > What if I just delete the second paragraph, and remove the "But" from
> > > the start of the third, and append:
> > > 
> > > "As a result, the wrong_task and wrong_cpu tests in
> > > fpsimd_thread_switch() will always yield false for kernel threads."
> > > 
> > > ...with a similar comment in the code?
> > 
> > ...with a risk of being a bit over-pedantic and annoying, may I suggest
> > the following complete commit text:
> > 
> > ------8<------
> > Currently the FPSIMD handling code uses the condition task->mm ==
> > NULL as a hint that task has no FPSIMD register context.
> > 
> > The ->mm check is only there to filter out tasks that cannot
> > possibly have FPSIMD context loaded, for optimisation purposes.
> > However, TIF_FOREIGN_FPSTATE must always be checked anyway before
> > saving FPSIMD context back to memory.  For this reason, the ->mm
> > checks are not useful, providing that that TIF_FOREIGN_FPSTATE is
> > maintained properly for kernel threads.
> > 
> > FPSIMD context is never preserved for kernel threads across a context
> > switch and therefore TIF_FOREIGN_FPSTATE should always be true for
> 
> (This refactoring opens up the interesting possibility of making
> kernel-mode NEON in task context preemptible for kernel threads so
> that we actually do preserve state... but that's a discussion for
> another day.  There may be code around that relies on
> kernel_neon_begin() disabling preemption for real.)
> 
> > kernel threads.  This is indeed the case, as the wrong_task and
> 
> This suggests that TIF_FOREIGN_FPSTATE is always true for kernel
> threads today.  This is not quite because use_mm() can make mm non-
> NULL.
> 

I was suggesting that it's always true after this patch.

> > wrong_cpu tests in fpsimd_thread_switch() will always yield false for
> > kernel threads.
> 
> ("false" -> "true".  My bad.)
> 
> > Further, the context switch logic is already deliberately optimised to
> > defer reloads of the FPSIMD context until ret_to_user (or sigreturn as a
> > special case), which kernel threads by definition never reach, and
> > therefore this change introduces no additional work in the critical
> > path.
> > 
> > This patch removes the redundant checks and special-case code.
> > ------8<------
> 
> Looking at my existing text, I rather reworded it like this.
> Does this work any better for you?
> 
> --8<--
> 
> Currently the FPSIMD handling code uses the condition task->mm ==
> NULL as a hint that task has no FPSIMD register context.
> 
> The ->mm check is only there to filter out tasks that cannot
> possibly have FPSIMD context loaded, for optimisation purposes.
> Also, TIF_FOREIGN_FPSTATE must always be checked anyway before
> saving FPSIMD context back to memory.  For these reasons, the ->mm
> checks are not useful, providing that TIF_FOREIGN_FPSTATE is
> maintained in a consistent way for kernel threads.

Consistent with what?  Without more context or explanation,
I'm not sure what the reader is to make of that.  Do you not mean the
TIF_FOREIGN_FPSTATE is always true for kernel threads?

> 
> The context switch logic is already deliberately optimised to defer
> reloads of the regs until ret_to_user (or sigreturn as a special
> case), and save them only if they have been previously loaded.
> Kernel threads by definition never reach these paths.  As a result,

I'm struggling with the "As a result," here.  Is this because reloads of
regs in ret_to_user (or sigreturn) are the only places that can make
wrong_cpu or wrong_task be false?

(I'm actually wanting to understand this, not just bikeshedding the
commit message, as new corner cases keep coming up on this logic.)

> the wrong_task and wrong_cpu tests in fpsimd_thread_switch() will
> always yield true for kernel threads.
> 
> This patch removes the redundant checks and special-case code,                  ensuring that TIF_FOREIGN_FPSTATE is set whenever a kernel thread               is scheduled in, and ensures that this flag is set for the init
> task.  The fpsimd_flush_task_state() call already present in                    copy_thread() ensures the same for any new task.

nit: funny formatting

nit: ensuring that TIF_FOREIGN_FPSTATE *remains* set whenever a kernel
thread is scheduled in?

> 
> With TIF_FOREIGN_FPSTATE always set for kernel threads, this patch
> ensures that no extra context save work is added for kernel
> threads, and eliminates the redundant context saving that may
> currently occur for kernel threads that have acquired an mm via
> use_mm().
> 
> -->8--

If you can slightly connect the dots with the "As a result" above, I'm
fine with your version of the text.

Thanks,
-Christoffer
_______________________________________________
kvmarm mailing list
kvmarm@xxxxxxxxxxxxxxxxxxxxx
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm



[Index of Archives]     [Linux KVM]     [Spice Development]     [Libvirt]     [Libvirt Users]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux