Hi, Sebastian,

On Wed, 2008-04-30 at 00:12 +0200, Sebastian Siewior wrote:
> * Huang, Ying | 2008-04-25 11:11:17 [+0800]:
>
> >Hi, Sebastian,
> Hi Huang,
>
> sorry for the delay.
>
> >I changed the patches to group the read or write together instead of
> >interleaving. Can you help me to test these new patches? The new patches
> >is attached with the mail.
> The new results are attached.

It seems that the performance degradation from step4 to step5 has
decreased, but the overall degradation from step0 to step7 is still
about 5%.

I also tested the patches on Pentium 4 CPUs, and the performance
decreased there too. So I think this optimization is CPU
micro-architecture dependent.

While the dependencies between instructions are reduced, more registers
(at most 3) have to be saved/restored before/after
encryption/decryption. If the CPU has no spare execution units for the
newly independent instructions, the extra register saves/restores are
pure overhead and the performance will decrease.

Maybe we should select different implementations based on the
micro-architecture.

Best Regards,
Huang Ying

--
To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html