> So why is the performance hit acceptable for ESP but not for WPA? We
> could easily implement the same thing, i.e.,
> kmalloc(GFP_ATOMIC)/kfree the aead_req struct rather than allocate it
> on the stack

Yeah, maybe we should. It's likely a much bigger allocation, but I don't
actually know if that affects speed.

In most cases where you want high performance we never hit this anyway
since we'll have hardware crypto. I know for our (Intel's) devices we
normally never hit these code paths.

But on the other hand, you also did your changes for a reason, and the
only reason I can see for that is performance. So you'd be the one with
most "skin in the game", I guess?

johannes