These architectures select HAVE_EFFICIENT_UNALIGNED_ACCESS:

    s390 arm arm64 powerpc x86 x86_64

So these will use the original code. The architectures that will therefore use the new code are:

    alpha arc avr32 blackfin c6x cris frv h8300 hexagon ia64 m32r m68k metag microblaze mips mn10300 nios2 openrisc parisc score sh sparc tile um unicore32 xtensa

Unfortunately, of these, the only machines I have access to are MIPS. My SPARC access went cold a few years ago.

If you insist on a data-motivated approach, then I fear my test of 1 out of 26 different architectures is woefully insufficient. Does anybody else on the list have access to more hardware and is interested in benchmarking?

If not, is there a reasonable way to decide on this by considering the added complexity of the code? Are we able to reason about the best and worst cases of instruction latency vs. unaligned-access stalls for most CPU designs?
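
To make the trade-off concrete, here is a minimal sketch of the two load strategies such a choice usually comes down to. It is not the code under discussion in this thread. The names load_le64_direct, load_le64_bytewise and the EFFICIENT_UNALIGNED macro are hypothetical stand-ins, and the direct variant assumes a little-endian host.

```c
#include <stdint.h>
#include <string.h>

/* Direct variant: one 64-bit load from a possibly unaligned pointer.
 * memcpy keeps this well-defined C; on targets with cheap unaligned
 * access, compilers typically lower it to a single load instruction.
 * Assumption: little-endian host (no byte swap is done here). */
static inline uint64_t load_le64_direct(const uint8_t *p)
{
    uint64_t v;
    memcpy(&v, p, sizeof(v));
    return v;
}

/* Byte-wise variant: eight single-byte loads plus shifts and ORs.
 * Never performs an unaligned access and is endian-independent,
 * at the cost of more instructions per word. */
static inline uint64_t load_le64_bytewise(const uint8_t *p)
{
    return (uint64_t)p[0]       | (uint64_t)p[1] << 8  |
           (uint64_t)p[2] << 16 | (uint64_t)p[3] << 24 |
           (uint64_t)p[4] << 32 | (uint64_t)p[5] << 40 |
           (uint64_t)p[6] << 48 | (uint64_t)p[7] << 56;
}

/* EFFICIENT_UNALIGNED is a hypothetical build-time switch standing in
 * for the kernel's HAVE_EFFICIENT_UNALIGNED_ACCESS selection. */
#ifdef EFFICIENT_UNALIGNED
#define load_le64(p) load_le64_direct(p)
#else
#define load_le64(p) load_le64_bytewise(p)
#endif
```

The question above is then whether the extra shifts and ORs of the byte-wise path cost more or less than a trapped or stalled unaligned load on a given CPU. Only benchmarking on the architectures listed can answer that.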