On 14 October 2016 at 11:00, Johannes Berg <johannes@xxxxxxxxxxxxxxxx> wrote:
>
>> So why is the performance hit acceptable for ESP but not for WPA? We
>> could easily implement the same thing, i.e.,
>> kmalloc(GFP_ATOMIC)/kfree the aead_req struct rather than allocate it
>> on the stack
>
> Yeah, maybe we should. It's likely a much bigger allocation, but I
> don't actually know if that affects speed.
>
> In most cases where you want high performance we never hit this anyway
> since we'll have hardware crypto. I know for our (Intel's) devices we
> normally never hit these code paths.
>
> But on the other hand, you also did your changes for a reason, and the
> only reason I can see of that is performance. So you'd be the one with
> most "skin in the game", I guess?
>

Well, what sucks here is that the accelerated driver I implemented for
arm64 does not actually need this, as long as we take aad[] off the
stack. And note that the API was changed since my patch, to add aad[]
to the scatterlist: prior to this change, it used
aead_request_set_assoc() to set the associated data separately.