On Wed, 19 Aug 2020 at 00:39, Ben Greear <greearb@xxxxxxxxxxxxxxx> wrote:
>
> On 8/18/20 3:33 PM, Herbert Xu wrote:
> > On Tue, Aug 18, 2020 at 03:31:10PM -0700, Ben Greear wrote:
> >>
> >> I don't think it has been discussed recently, but mac80211 is already
> >> a complicated beast, so if this added any significant complexity
> >> it might not be worth it.
> >
> > Any bulk data path should be using the async interface, otherwise
> > performance will seriously suffer should SIMD be unavailable. I
> > think someone should look at converting wireless to async like IPsec.
>
> Most users in most cases are using hw crypt, so that is likely why
> it hasn't gotten a huge amount of effort to optimize the software
> crypt path.
>

As I understand it, it is highly unusual for these code paths to be
exercised in the first place. All mac80211 hardware anyone still cares
about supports CCMP offload, and so only in special cases like Ben's
do we need the software implementation. Also, in Ben's case, it is on
a hot path whereas obsolete hardware that does not implement CCMP
offload does not support anything over 11g speeds to begin with.

Then, there is the additional issue where the FPU preserve/restore
appears to be disproportionately expensive on the actual SoC in
question.

My preferred approach for cbcmac(aes-ni) would be to mirror the arm64
implementation exactly, which means going through the data only a
single time, and interleaving the CTR and CBCMAC operations at the AES
round level. My cbcmac ahash approach (v2) is plan B as far as I am
concerned, but it turns out to be flawed and I haven't had time to
look into this.

But if we look at the actual issue at hand, we might also look into
amortizing the FPU preserve/restore over multiple invocations of a
cipher. I proposed a patch a while ago that makes cipher an internal
crypto API abstraction, and we could easily add pre/post hooks that
preserve/restore the FPU in this case, in which case we would not need
any changes at higher levels.

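To illustrate the amortization idea above, here is a minimal user-space
sketch in C. It is not kernel code and not the patch referred to in this
thread. Every name in it (struct model_cipher, model_fpu_begin(),
model_fpu_end(), toy_encrypt_block()) is hypothetical: the two model_fpu_*
functions only count how often an FPU preserve/restore would happen,
standing in for what kernel_fpu_begin()/kernel_fpu_end() do on x86, and
the "cipher" is a toy XOR, not AES.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_BLOCK_SIZE 16

/* Hypothetical counters: how often FPU state would be preserved/restored. */
static unsigned int fpu_saves, fpu_restores;

static void model_fpu_begin(void) { fpu_saves++; }
static void model_fpu_end(void)   { fpu_restores++; }

/* Hypothetical cipher abstraction with optional pre/post hooks. */
struct model_cipher {
	void (*pre)(void);   /* e.g. preserve FPU state */
	void (*post)(void);  /* e.g. restore FPU state  */
	void (*encrypt_block)(uint8_t *dst, const uint8_t *src);
};

/* Toy single-block transform (XOR with a constant). NOT real AES. */
static void toy_encrypt_block(uint8_t *dst, const uint8_t *src)
{
	for (size_t i = 0; i < MODEL_BLOCK_SIZE; i++)
		dst[i] = src[i] ^ 0x5a;
}

/* Per-invocation cost: one preserve/restore pair around every block. */
static void encrypt_per_block(const struct model_cipher *c,
			      uint8_t *buf, size_t nblocks)
{
	assert(c && c->encrypt_block);
	for (size_t i = 0; i < nblocks; i++) {
		uint8_t *p = buf + i * MODEL_BLOCK_SIZE;

		if (c->pre)
			c->pre();
		c->encrypt_block(p, p);
		if (c->post)
			c->post();
	}
}

/* Amortized cost: one preserve/restore pair around the whole run. */
static void encrypt_amortized(const struct model_cipher *c,
			      uint8_t *buf, size_t nblocks)
{
	assert(c && c->encrypt_block);
	if (c->pre)
		c->pre();
	for (size_t i = 0; i < nblocks; i++) {
		uint8_t *p = buf + i * MODEL_BLOCK_SIZE;

		c->encrypt_block(p, p);
	}
	if (c->post)
		c->post();
}
```

With a struct model_cipher whose hooks are model_fpu_begin() and
model_fpu_end(), processing N blocks through encrypt_per_block() bumps
each counter N times, while encrypt_amortized() bumps each only once.
The caller passes the same arguments in both cases, which is the sense
in which higher levels would not need to change.
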
> If someone wants to give this async api a try for mac80211, I can
> test, and I can sponsor the work, but I don't have time to try
> to implement it myself.
>
> Thanks,
> Ben
>
> >
> >> Truth is though, I know very little about what changes would be
> >> needed to make it do async decrypt, so maybe it would be a simple
> >> matter?
> >
> > IPsec was actually quite straightforward.
> >
> > Cheers,
> >
>
>
> --
> Ben Greear <greearb@xxxxxxxxxxxxxxx>
> Candela Technologies Inc  http://www.candelatech.com