Re: [PATCH] crypto: x86/twofish-3way - Fix %rbp usage

Ingo Molnar <mingo@xxxxxxxxxx> · Tue, 19 Dec 2017 08:54:43 +0100

* Eric Biggers <ebiggers3@xxxxxxxxx> wrote:

> There may be a small overhead caused by replacing 'xchg REG, REG' with
> the needed sequence 'mov MEM, REG; mov REG, MEM; mov REG, REG' once per
> round.  But, counterintuitively, when I tested "ctr-twofish-3way" on a
> Haswell processor, the new version was actually about 2% faster.
> (Perhaps 'xchg' is not as well optimized as plain moves.)

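XCHG has implicit LOCK semantics on all x86 CPUs whenever one of its operands
is in memory, and even the register-only form tends to cost more than a plain
MOV, so that's not a surprising result I think.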
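
For reference, here is a rough C model of the exchange described in the quoted
text. It is only a sketch: the function names are made up for illustration and
are not taken from the patch, and the compiler is free to pick different
instructions for the plain-C version. The inline-asm variant is an assumption
of mine, shown only because XCHG with a memory operand is the form that is
implicitly locked; it is built only on x86-64 with a GNU-compatible compiler.

```c
#include <stdint.h>

/*
 * Hypothetical helper: exchange a value held in a "register" with a value
 * parked in a stack slot, using a temporary. This mirrors the quoted
 * 'mov MEM, REG; mov REG, MEM; mov REG, REG' sequence (AT&T operand order).
 */
static void swap_reg_with_slot(uint64_t *reg, uint64_t *slot)
{
	uint64_t tmp = *slot;	/* mov MEM, REG: load the parked half */
	*slot = *reg;		/* mov REG, MEM: park the current half */
	*reg = tmp;		/* mov REG, REG: move the loaded half into place */
}

#if defined(__x86_64__) && defined(__GNUC__)
/*
 * Hypothetical helper: the same exchange done with a single XCHG against
 * memory. With a memory operand the CPU treats XCHG as locked, whether or
 * not a LOCK prefix is present.
 */
static void swap_reg_with_slot_xchg(uint64_t *reg, uint64_t *slot)
{
	uint64_t r = *reg;

	__asm__ volatile("xchgq %0, %1" : "+r"(r), "+m"(*slot));
	*reg = r;
}
#endif
```

Both helpers leave the two values exchanged; the difference is only in which
instructions the CPU executes, which is what would need to be benchmarked.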

Thanks,

	Ingo