On Tue, Dec 05, 2017 at 06:04:34PM +0000, Ard Biesheuvel wrote:
> On 5 December 2017 at 12:45, Ard Biesheuvel <ard.biesheuvel@xxxxxxxxxx> wrote:
> >
> >
> >> On 5 Dec 2017, at 12:28, Dave Martin <Dave.Martin@xxxxxxx> wrote:
> >>
> >>> On Mon, Dec 04, 2017 at 12:26:37PM +0000, Ard Biesheuvel wrote:
> >>> Add a support macro to conditionally yield the NEON (and thus the CPU)
> >>> that may be called from the assembler code. Given that especially the
> >>> instruction based accelerated crypto code may use very tight loops, add
> >>> some parametrization so that the TIF_NEED_RESCHED flag test is only
> >>> executed every so many loop iterations.
> >>>
> >>> In some cases, yielding the NEON involves saving and restoring a non
> >>> trivial amount of context (especially in the CRC folding algorithms),
> >>> and so the macro is split into two, and the code in between is only
> >>> executed when the yield path is taken, allowing the context to be preserved.
> >>> The second macro takes a label argument that marks the resume-from-yield
> >>> path, which should restore the preserved context again.
> >>>
> >>> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@xxxxxxxxxx>
> >>> ---
> >>>  arch/arm64/include/asm/assembler.h | 50 ++++++++++++++++++++
> >>>  1 file changed, 50 insertions(+)
> >>>
> >>> diff --git a/arch/arm64/include/asm/assembler.h b/arch/arm64/include/asm/assembler.h
> >>> index aef72d886677..917b026d3e00 100644
> >>> --- a/arch/arm64/include/asm/assembler.h
> >>> +++ b/arch/arm64/include/asm/assembler.h
> >>> @@ -512,4 +512,54 @@ alternative_else_nop_endif
> >>>  #endif
> >>>  .endm
> >>>
> >>> +/*
> >>> + * yield_neon - check whether to yield to another runnable task from
> >>> + *   kernel mode NEON code (running with preemption disabled)
> >>> + *
> >>> + * - Check whether the preempt count is exactly 1, in which case disabling
> >>> + *   preemption once will make the task preemptible. If this is not the case,
> >>> + *   yielding is pointless.
> >>> + * - Check whether TIF_NEED_RESCHED is set, and if so, disable and re-enable
> >>> + *   kernel mode NEON (which will trigger a reschedule), and branch to the
> >>> + *   yield fixup code at @lbl.
> >>> + */
> >>> + .macro yield_neon, lbl:req, ctr, order, stride, loop
> >>> + yield_neon_pre \ctr, \order, \stride, \loop
> >>> + yield_neon_post \lbl
> >>> + .endm
> >>> +
> >>> + .macro yield_neon_pre, ctr, order=0, stride, loop=4444f
> >>> +#ifdef CONFIG_PREEMPT
> >>> + /*
> >>> +  * With some algorithms, it makes little sense to poll the
> >>> +  * TIF_NEED_RESCHED flag after every iteration, so only perform
> >>> +  * the check every 2^order strides.
> >>> +  */
> >>> + .if \order > 1
> >>> + .if (\stride & (\stride - 1)) != 0
> >>> + .error "stride should be a power of 2"
> >>> + .endif
> >>> + tst \ctr, #((1 << \order) * \stride - 1) & ~(\stride - 1)
> >>> + b.ne \loop
> >>> + .endif
> >>
> >> I'm not sure what baking in this check gives us, and this seems to
> >> conflate two rather different things: yielding and defining a
> >> "heartbeat" frequency for the calling code.
> >>
> >> Can we separate out the crypto-loop-helper semantics from the yield-
> >> neon stuff?
> >>
> >
> > Fair enough. I incorporated the check here so it disappears from the
> > code entirely when !CONFIG_PREEMPT, because otherwise, you end up with
> > a sequence that is mispredicted every # iterations without any benefit.
> I guess I could macroise that separately though.
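Something along these lines is the kind of separation I have in mind
(untested; crypto_loop_heartbeat is just a name made up for
illustration, and the body is simply the check from the patch above,
lifted out as-is):

	/*
	 * Illustrative helper only (untested): the "heartbeat" test from
	 * the patch above, pulled out of the yield macros.  Branch back
	 * to \loop unless \ctr (ignoring its low \stride bits) has
	 * reached a multiple of (1 << \order) * \stride.  Keeping the
	 * CONFIG_PREEMPT guard means it still compiles away entirely
	 * when yielding can never happen.
	 */
	.macro	crypto_loop_heartbeat, ctr, order=0, stride, loop
#ifdef CONFIG_PREEMPT
	.if	\order > 1
	.if	(\stride & (\stride - 1)) != 0
	.error	"stride should be a power of 2"
	.endif
	tst	\ctr, #((1 << \order) * \stride - 1) & ~(\stride - 1)
	b.ne	\loop
	.endif
#endif
	.endm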
> >
> >> If we had
> >>
> >>     if_cond_yield_neon
> >>     // patchup code
> >>     endif_yield_neon
> >>
>
> I like this, but ...
>
> >> then the caller is free to conditionally branch over that as appropriate
> >> like
> >>
> >> loop:
> >>     // crypto stuff
> >>     tst x0, #0xf
> >>     b.ne loop
> >>
> >>     if_cond_yield_neon
> >>     // patchup code
> >>     endif_cond_yield_neon
> >>
>
> I need to put the post patchup code somewhere too. Please refer to the
> CRC patches for the best examples of this.

I'm not sure I follow.  If you need to do something different after
patchup, can't you either pull that code into the if...endif too, or
otherwise put a branch before the endif?

[...]

Cheers
---Dave
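P.S. Purely to illustrate the above (untested; if_cond_yield_neon and
endif_cond_yield_neon are the names sketched earlier in this thread,
and the register, label and comment placement are invented for the
example), pulling the post-yield restore inside the block would give
the caller something like:

loop:
	// crypto stuff
	tst	x0, #0xf
	b.ne	loop

	if_cond_yield_neon
	// patchup code: save whatever partial state the loop keeps
	// in NEON registers
	// <the yield itself would happen somewhere in here>
	// post-patchup code: reload that state before resuming
	endif_cond_yield_neon

	b	loop	// back to the loop; termination test omitted in this sketch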