On Tue, Aug 28, 2012 at 02:24:43PM +0300, Jussi Kivilinna wrote:
> Patch replaces 'movb' instructions with 'movzbl' to break false register
> dependencies and interleaves instructions better for out-of-order scheduling.
>
> Tested on Intel Core i5-2450M and AMD FX-8100.
>
> tcrypt ECB results:
>
> Intel Core i5-2450M:
>
> size    old-vs-new      new-vs-3way     old-vs-3way
>         enc     dec     enc     dec     enc     dec
> 256     1.12x   1.13x   1.36x   1.37x   1.21x   1.22x
> 1k      1.14x   1.14x   1.48x   1.49x   1.29x   1.31x
> 8k      1.14x   1.14x   1.50x   1.52x   1.32x   1.33x
>
> AMD FX-8100:
>
> size    old-vs-new      new-vs-3way     old-vs-3way
>         enc     dec     enc     dec     enc     dec
> 256     1.10x   1.11x   1.01x   1.01x   0.92x   0.91x
> 1k      1.11x   1.12x   1.08x   1.07x   0.97x   0.96x
> 8k      1.11x   1.13x   1.10x   1.08x   0.99x   0.97x
>
> [v2]
>  - Do instruction interleaving another way to avoid adding new FPU<=>CPU
>    register moves as these cause performance drop on Bulldozer.
>  - Further interleaving improvements for better out-of-order scheduling.

All three patches applied. Thanks!
-- 
Email: Herbert Xu <herbert@xxxxxxxxxxxxxxxxxxx>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt