On Thu, May 24, 2018 at 10:50:56AM +0100, Dave Martin wrote:
> On Thu, May 24, 2018 at 10:33:50AM +0200, Christoffer Dall wrote:
> > On Wed, May 23, 2018 at 04:03:37PM +0100, Dave Martin wrote:
> > > On Wed, May 23, 2018 at 03:56:57PM +0100, Catalin Marinas wrote:
> > > > On Wed, May 23, 2018 at 02:31:59PM +0100, Dave P Martin wrote:
> > > > > On Wed, May 23, 2018 at 01:48:12PM +0200, Christoffer Dall wrote:
> > > > > > On Tue, May 22, 2018 at 05:05:08PM +0100, Dave Martin wrote:
> > > > > > > This is true by construction however: TIF_FOREIGN_FPSTATE is never
> > > > > > > cleared except when returning to userspace or returning from a
> > > > > > > signal: thus, for a true kernel thread no FPSIMD context is ever
> > > > > > > loaded, TIF_FOREIGN_FPSTATE will remain set and no context will
> > > > > > > ever be saved.
> > > > > >
> > > > > > I don't understand this construction proof; from looking at the patch
> > > > > > below it is not obvious to me why fpsimd_thread_switch() can never have
> > > > > > !wrong_task && !wrong_cpu and therefore clear TIF_FOREIGN_FPSTATE for a
> > > > > > kernel thread?
> > > > >
> > > > > Looking at this again, I think it is poorly worded. This patch aims to
> > > > > make it true by construction, but it isn't prior to the patch.
> > > > >
> > > > > I'm tempted to delete the paragraph: the assertion of both untrue and
> > > > > not the best way to justify that this patch works.
> > > > >
> > > > >
> > > > > How about:
> > > > >
> > > > > -8<-
> > > > >
> > > > > The context switch logic already isolates user threads from each other.
> > > > > This, it is sufficient for isolating user threads from the kernel,
> >
> > s/This/Thus/ ?
> >
> > I don't understand what 'it' refers to here?
> >
> > > > > since the goal either way is to ensure that code executing in userspace
> > > > > cannot see any FPSIMD state except its own.
> > > > > Thus, there is no special
> > > > > property of kernel threads that we care about except that it is
> > > > > pointless to save or load FPSIMD register state for them.
> >
> > Actually, I'm not really sure what this paragraph is getting at.
>
> Reading this again, I don't think the paragraph adds much useful.
>
> So I propose deleting that too.
>
> > > > >
> > > > > At worst, the removal of all the kernel thread special cases by this
> > > > > patch would thus spuriously load and save state for kernel threads when
> > > > > unnecessary.
> > > > >
> > > > > But the context switch logic is already deliberately optimised to defer
> > > > > reloads of the regs until ret_to_user (or sigreturn as a special case),
> > > > > which kernel threads by definition never reach.
> > > > >
> > > > > ->8-
> > > >
> > > > The "at worst" paragraph makes it look like it could happen (at least
> > > > until you reach the last paragraph). Maybe you can just say that
> > > > wrong_task and wrong_cpu (with the fpsimd_cpu = NR_CPUS addition) are
> > > > always true for kernel threads. You should probably mention this in a
> > > > comment in the code as well.
> > >
> > > What if I just delete the second paragraph, and remove the "But" from
> > > the start of the third, and append:
> > >
> > > "As a result, the wrong_task and wrong_cpu tests in
> > > fpsimd_thread_switch() will always yield false for kernel threads."
> > >
> > > ...with a similar comment in the code?
> >
> > ...with a risk of being a bit over-pedantic and annoying, may I suggest
> > the following complete commit text:
> >
> > ------8<------
> > Currently the FPSIMD handling code uses the condition task->mm ==
> > NULL as a hint that task has no FPSIMD register context.
> >
> > The ->mm check is only there to filter out tasks that cannot
> > possibly have FPSIMD context loaded, for optimisation purposes.
> > However, TIF_FOREIGN_FPSTATE must always be checked anyway before
> > saving FPSIMD context back to memory.
> > For this reason, the ->mm
> > checks are not useful, providing that that TIF_FOREIGN_FPSTATE is
> > maintained properly for kernel threads.
> >
> > FPSIMD context is never preserved for kernel threads across a context
> > switch and therefore TIF_FOREIGN_FPSTATE should always be true for
>
> (This refactoring opens up the interesting possibility of making
> kernel-mode NEON in task context preemptible for kernel threads so
> that we actually do preserve state... but that's a discussion for
> another day. There may be code around that relies on
> kernel_neon_begin() disabling preemption for real.)
>
> > kernel threads. This is indeed the case, as the wrong_task and
>
> This suggests that TIF_FOREIGN_FPSTATE is always true for kernel
> threads today. This is not quite because use_mm() can make mm non-
> NULL.
>

I was suggesting that it's always true after this patch.

> > wrong_cpu tests in fpsimd_thread_switch() will always yield false for
> > kernel threads.
>
> ("false" -> "true". My bad.)
>
> > Further, the context switch logic is already deliberately optimised to
> > defer reloads of the FPSIMD context until ret_to_user (or sigreturn as a
> > special case), which kernel threads by definition never reach, and
> > therefore this change introduces no additional work in the critical
> > path.
> >
> > This patch removes the redundant checks and special-case code.
> > ------8<------
>
> Looking at my existing text, I rather reworded it like this.
> Does this work any better for you?
>
> --8<--
>
> Currently the FPSIMD handling code uses the condition task->mm ==
> NULL as a hint that task has no FPSIMD register context.
>
> The ->mm check is only there to filter out tasks that cannot
> possibly have FPSIMD context loaded, for optimisation purposes.
> Also, TIF_FOREIGN_FPSTATE must always be checked anyway before
> saving FPSIMD context back to memory.
> For these reasons, the ->mm
> checks are not useful, providing that TIF_FOREIGN_FPSTATE is
> maintained in a consistent way for kernel threads.

Consistent with what? Without more context or explanation, I'm not sure
what the reader is to make of that. Do you not mean the
TIF_FOREIGN_FPSTATE is always true for kernel threads?

>
> The context switch logic is already deliberately optimised to defer
> reloads of the regs until ret_to_user (or sigreturn as a special
> case), and save them only if they have been previously loaded.
> Kernel threads by definition never reach these paths. As a result,

I'm struggling with the "As a result," here. Is this because reloads of
regs in ret_to_user (or sigreturn) are the only places that can make
wrong_cpu or wrong_task be false?

(I'm actually wanting to understand this, not just bikeshedding the
commit message, as new corner cases keep coming up on this logic.)

> the wrong_task and wrong_cpu tests in fpsimd_thread_switch() will
> always yield true for kernel threads.
>
> This patch removes the redundant checks and special-case code, ensuring that TIF_FOREIGN_FPSTATE is set whenever a kernel thread is scheduled in, and ensures that this flag is set for the init
> task. The fpsimd_flush_task_state() call already present in copy_thread() ensures the same for any new task.

nit: funny formatting

nit: ensuring that TIF_FOREIGN_FPSTATE *remains* set whenever a kernel
thread is scheduled in?

>
> With TIF_FOREIGN_FPSTATE always set for kernel threads, this patch
> ensures that no extra context save work is added for kernel
> threads, and eliminates the redundant context saving that may
> currently occur for kernel threads that have acquired an mm via
> use_mm().
>
> -->8--

If you can slightly connect the dots with the "As a result" above, I'm
fine with your version of the text.
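
One way to connect those dots is a small user-space toy model of the
flag logic, written in C. It is only a sketch of the argument, not the
kernel code: struct toy_task, the toy_* functions and the last_state[]
array are hypothetical stand-ins, and NR_CPUS is set to 2 purely for
illustration. The model assumes that the return-to-userspace path is
the only place that binds a task's state to a CPU, which is exactly the
premise being asked about.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_CPUS 2	/* toy value; also the "loaded on no CPU" marker */

/* Hypothetical stand-in for the few task fields the argument needs. */
struct toy_task {
	bool is_kthread;
	bool foreign_fpstate;		/* models TIF_FOREIGN_FPSTATE */
	unsigned int fpsimd_cpu;	/* CPU holding our state, or NR_CPUS */
};

/* Models the per-CPU record of whose state is in the registers. */
static struct toy_task *last_state[NR_CPUS];

/* New task (or init task): state is loaded nowhere, flag starts set. */
static void toy_task_init(struct toy_task *t, bool is_kthread)
{
	t->is_kthread = is_kthread;
	t->foreign_fpstate = true;
	t->fpsimd_cpu = NR_CPUS;
}

/* Models the flag fix-up done when 'next' is scheduled in on 'cpu'. */
static void toy_thread_switch(unsigned int cpu, struct toy_task *next)
{
	bool wrong_task = last_state[cpu] != next;
	bool wrong_cpu = next->fpsimd_cpu != cpu;

	next->foreign_fpstate = wrong_task || wrong_cpu;
}

/* Models the deferred reload on return to userspace (assumed to be the
 * only place that binds a task to a CPU in this toy model). */
static void toy_ret_to_user(unsigned int cpu, struct toy_task *t)
{
	assert(!t->is_kthread);	/* kernel threads never get here */
	if (t->foreign_fpstate) {
		last_state[cpu] = t;
		t->fpsimd_cpu = cpu;
		t->foreign_fpstate = false;
	}
}

/* Returns how many times the kernel thread was seen with the flag clear. */
int toy_run_scenario(void)
{
	struct toy_task user, kthread;
	int violations = 0;

	last_state[0] = last_state[1] = NULL;
	toy_task_init(&user, false);
	toy_task_init(&kthread, true);

	/* Alternate a user task and a kernel thread across both CPUs. */
	for (int round = 0; round < 4; round++) {
		unsigned int cpu = round % NR_CPUS;

		toy_thread_switch(cpu, &user);
		toy_ret_to_user(cpu, &user);

		toy_thread_switch(cpu, &kthread);
		if (!kthread.foreign_fpstate)
			violations++;
	}
	return violations;
}
```

Under that assumption toy_run_scenario() returns 0: the kernel thread
is never bound to a CPU, so wrong_task and wrong_cpu both stay true for
it on every switch. If some other path could bind a kernel thread's
state, the model (and the "As a result") would no longer hold.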

Thanks,
-Christoffer