On Wed, Dec 06, 2017 at 07:43:37PM +0000, Ard Biesheuvel wrote:
> Add support macros to conditionally yield the NEON (and thus the CPU)
> that may be called from the assembler code.
>
> In some cases, yielding the NEON involves saving and restoring a non
> trivial amount of context (especially in the CRC folding algorithms),
> and so the macro is split into three, and the code in between is only
> executed when the yield path is taken, allowing the context to be
> preserved. The third macro takes an optional label argument that marks
> the resume path after a yield has been performed.
>
> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@xxxxxxxxxx>
> ---
>  arch/arm64/include/asm/assembler.h | 51 ++++++++++++++++++++
>  1 file changed, 51 insertions(+)
>
> diff --git a/arch/arm64/include/asm/assembler.h b/arch/arm64/include/asm/assembler.h
> index 5f61487e9f93..c54e408fd5a7 100644
> --- a/arch/arm64/include/asm/assembler.h
> +++ b/arch/arm64/include/asm/assembler.h
> @@ -572,4 +572,55 @@ alternative_else_nop_endif
>  #endif
>  	.endm
>
> +/*
> + * Check whether to yield to another runnable task from kernel mode NEON code
> + * (which runs with preemption disabled).
> + *
> + * if_will_cond_yield_neon
> + *	// pre-yield patchup code
> + * do_cond_yield_neon
> + *	// post-yield patchup code
> + * endif_yield_neon

      ^ Mention the lbl argument?

> + *
> + * - Check whether the preempt count is exactly 1, in which case disabling

                                                            enabling ^

> + *   preemption once will make the task preemptible. If this is not the case,
> + *   yielding is pointless.
> + * - Check whether TIF_NEED_RESCHED is set, and if so, disable and re-enable
> + *   kernel mode NEON (which will trigger a reschedule), and branch to the
> + *   yield fixup code.

Mention that neither patchup sequence is allowed to use section-changing
directives?  For example:

	if_will_cond_yield_neon
	// some code
	.pushsection .rodata, "a"
foo:	.quad	// some literal data for some reason
	.popsection
	// some code
	do_cond_yield_neon

is not safe, because .previous is now .rodata.

(You could protect against this with

	.pushsection .text; .previous; .subsection 1;
	// ...
	.popsection

but it may be overkill.)

> + *
> + * This macro sequence clobbers x0, x1 and the flags register unconditionally,
> + * and may clobber x2 .. x18 if the yield path is taken.
> + */
> +
> +	.macro		cond_yield_neon, lbl
> +	if_will_cond_yield_neon
> +	do_cond_yield_neon
> +	endif_yield_neon	\lbl
> +	.endm
> +
> +	.macro		if_will_cond_yield_neon
> +#ifdef CONFIG_PREEMPT
> +	get_thread_info	x0
> +	ldr		w1, [x0, #TSK_TI_PREEMPT]
> +	ldr		x0, [x0, #TSK_TI_FLAGS]
> +	cmp		w1, #1			// == PREEMPT_OFFSET

Can we at least drop a BUILD_BUG_ON() somewhere to check this?  Maybe
in kernel_neon_begin() since this is intimately kernel-mode NEON
related.

> +	csel		x0, x0, xzr, eq
> +	tbnz		x0, #TIF_NEED_RESCHED, 5555f	// needs rescheduling?
> +#endif

A comment that we will fall through to 6666f here may be helpful.

> +	.subsection	1
> +5555:
> +	.endm
> +
> +	.macro		do_cond_yield_neon
> +	bl		kernel_neon_end
> +	bl		kernel_neon_begin
> +	.endm
> +
> +	.macro		endif_yield_neon, lbl=6666f
> +	b		\lbl
> +	.previous
> +6666:

Could have slightly more random "random" labels here, but otherwise it
looks ok to me.

I might go through and replace all the random labels with something
more robust sometime, but I've never been sure it was worth the effort...

Cheers
---Dave
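
P.S. For the BUILD_BUG_ON(), something minimal along the following lines
is what I have in mind (untested sketch; PREEMPT_OFFSET comes from
<linux/preempt.h>, and the existing body of kernel_neon_begin() in
arch/arm64/kernel/fpsimd.c would stay as it is):

	#include <linux/build_bug.h>
	#include <linux/preempt.h>

	void kernel_neon_begin(void)
	{
		/*
		 * if_will_cond_yield_neon hard-codes "cmp w1, #1" when
		 * testing the preempt count, so catch any future divergence
		 * from PREEMPT_OFFSET at build time.
		 */
		BUILD_BUG_ON(PREEMPT_OFFSET != 1);

		/* ... existing body unchanged ... */
	}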
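
P.P.S. For anyone following along, here is roughly the usage pattern the
three-macro form is aimed at (hypothetical sketch only: the v8-v11
accumulators, the x20 state buffer and the w19 block counter are made up
for illustration, and the prologue/epilogue saving x19/x20 is omitted;
callee-saved registers are used here because, per the comment above, the
yield path may clobber x2 .. x18):

	0:	// ... process one block, accumulating into v8-v11 ...
		subs	w19, w19, #1		// blocks left
		b.ne	1f
		ret
	1:	if_will_cond_yield_neon
		st1	{v8.16b-v11.16b}, [x20]	// pre-yield: spill live state
		do_cond_yield_neon		// kernel_neon_end()/_begin()
		ld1	{v8.16b-v11.16b}, [x20]	// post-yield: reload it
		endif_yield_neon	0b	// resume the loop after a yield
		// no yield needed: execution falls through here (6666)
		b	0b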