Hi! The x86 AES crypto (gcm(aes)) requires 16-byte alignment, which is hard to achieve in networking. Is there any reason for this? On any moderately recent Intel platform, aligned and unaligned vector moves (vmovdqa vs. vmovdqu) should reportedly have the same performance. I'll hack it up and do some testing, but I thought it was worth asking first.
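
To make the constraint concrete, here is a minimal user-space sketch of what "16-byte aligned" means for a buffer pointer. The helper names (is_aligned16, misalign16) are made up for illustration and are not kernel APIs; the kernel expresses this differently.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns 1 if p sits on a 16-byte boundary, else 0. */
static int is_aligned16(const void *p)
{
    return ((uintptr_t)p & 15u) == 0;
}

/* Hypothetical helper: number of bytes p lies past the previous
 * 16-byte boundary (0 means p is already aligned). */
static size_t misalign16(const void *p)
{
    return (size_t)((uintptr_t)p & 15u);
}
```

For example, if a packet buffer starts on a 16-byte boundary and the data to encrypt begins 14 bytes in (an assumed header length, purely for illustration), misalign16() on that pointer gives 14, so the aligned-load requirement is not met without a copy or an unaligned code path.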