On 15 May 2014, at 22:35, Ard Biesheuvel <ard.biesheuvel@xxxxxxxxxx> wrote: > On 15 May 2014 10:24, Catalin Marinas <catalin.marinas@xxxxxxx> wrote: >> On Wed, May 14, 2014 at 07:17:29PM +0100, Ard Biesheuvel wrote: >>> +static u8 const *sha1_do_update(struct shash_desc *desc, const u8 *data, >>> + int blocks, u8 *head, unsigned int len) >>> +{ >>> + struct sha1_state *sctx = shash_desc_ctx(desc); >>> + struct thread_info *ti = NULL; >>> + >>> + /* >>> + * Pass current's thread info pointer to sha1_ce_transform() >>> + * below if we want it to play nice under preemption. >>> + */ >>> + if ((IS_ENABLED(CONFIG_PREEMPT_VOLUNTARY) || IS_ENABLED(CONFIG_PREEMPT)) >>> + && (desc->flags & CRYPTO_TFM_REQ_MAY_SLEEP)) >>> + ti = current_thread_info(); >>> + >>> + do { >>> + int rem; >>> + >>> + kernel_neon_begin_partial(16); >>> + rem = sha1_ce_transform(blocks, data, sctx->state, head, 0, ti); >>> + kernel_neon_end(); >>> + >>> + data += (blocks - rem) * SHA1_BLOCK_SIZE; >>> + blocks = rem; >>> + head = NULL; >>> + } while (unlikely(ti && blocks > 0)); >>> + return data; >>> +} >> >> What latencies are we talking about? Would it make sense to always >> call cond_resched() even if preemption is disabled? > > Calculating the SHA256 (which is the most striking example) hash of a > 4k buffer takes ~10 microseconds, so perhaps this is all a bit moot as > I don't think there will generally be many in-kernel users hashing > substantially larger quantities of data at a time. And what’s the maximum size? Could we do it in blocks of 4K? >> But I think we should have this loop always rescheduling if >> CRYPTO_TFM_REQ_MAY_SLEEP. I can see there is a crypto_yield() function >> that conditionally reschedules. What's the overhead of calling >> sha1_ce_transform() in a loop vs a single call for the entire data? > > For SHA256 we lose at least 10% if we call into the core function for > each 64 byte block rather than letting it chew away at all the data > itself. I was thinking more like calling it for bigger buffers (e.g. page) at once and a crypto_yield() in between. > (This is due to the fact that it needs to load 64 32-bit round > keys) But perhaps (considering the above) it is better to just shelve > the last 4 patches in this series until it becomes a problem for > anyone. For the time being I would drop the last 4 patches. We can revisit them for the next kernel version. Catalin-- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html