> > +	while (srclen >= GHASH_BLOCK_SIZE) {
> > +		unsigned int fpulen = min(srclen, FPU_BYTES);
> > +
> > +		kernel_fpu_begin();
> > +		while (fpulen >= GHASH_BLOCK_SIZE) {
> > +			int n = min_t(unsigned int, fpulen, GHASH_BLOCK_SIZE);
> > +
> > +			clmul_ghash_update(dst, src, n, &ctx->shash);
> > +
> > +			srclen -= n;
> > +			fpulen -= n;
> > +			src += n;
> > +		}
> > +		kernel_fpu_end();
> > +	}
>
> Another loop that doesn't make sense.  Why is this only passing 16 bytes at a
> time into the assembly code?  There shouldn't be an inner loop here at all.

Thanks, copied the pattern from another function whose assembly function
had a size limit. clmul_ghash_update looks ready for all sizes, so I'll
simplify that.
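
For illustration, the simplified loop might look roughly like the userspace
sketch below. This is not the revised patch: the value of FPU_BYTES, the
counters, the stubbed kernel_fpu_begin()/kernel_fpu_end(), the stubbed
clmul_ghash_update() (key argument dropped) and the wrapper name
ghash_update_chunked() are all assumptions made so the example compiles
outside the kernel. The point is only that each FPU section makes one call
covering every whole block in the chunk.

```c
#include <assert.h>
#include <stddef.h>

#define GHASH_BLOCK_SIZE 16U
#define FPU_BYTES 4096U	/* assumed value, for illustration only */

/* Userspace stand-ins so the sketch builds without kernel headers. */
static unsigned int fpu_sections;	/* kernel_fpu_begin() calls */
static unsigned int asm_calls;		/* calls into the "assembly" */
static size_t bytes_hashed;		/* total bytes handed over */

static void kernel_fpu_begin(void) { fpu_sections++; }
static void kernel_fpu_end(void) { }

/* Hypothetical stub: only records how much data it was given. */
static void clmul_ghash_update(unsigned char *dst, const unsigned char *src,
			       unsigned int len)
{
	(void)dst;
	(void)src;
	assert(len % GHASH_BLOCK_SIZE == 0);	/* whole blocks only */
	asm_calls++;
	bytes_hashed += len;
}

/*
 * Hash all whole blocks in src, at most FPU_BYTES per FPU section.
 * Returns the number of trailing bytes (< GHASH_BLOCK_SIZE) left over
 * for the caller to buffer.
 */
static unsigned int ghash_update_chunked(unsigned char *dst,
					 const unsigned char *src,
					 unsigned int srclen)
{
	while (srclen >= GHASH_BLOCK_SIZE) {
		unsigned int fpulen = srclen < FPU_BYTES ? srclen : FPU_BYTES;

		/* Round down so a partial block is never passed on. */
		fpulen -= fpulen % GHASH_BLOCK_SIZE;

		kernel_fpu_begin();
		clmul_ghash_update(dst, src, fpulen);	/* one call per chunk */
		kernel_fpu_end();

		srclen -= fpulen;
		src += fpulen;
	}
	return srclen;
}
```

With these assumed values, a 10007-byte input would be handled in three FPU
sections (4096, 4096 and 1808 bytes) with 7 bytes left over, instead of one
assembly call per 16-byte block.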