On 15 May 2014 14:47, Catalin Marinas <catalin.marinas@xxxxxxx> wrote:
> On 15 May 2014, at 22:35, Ard Biesheuvel <ard.biesheuvel@xxxxxxxxxx> wrote:
>> On 15 May 2014 10:24, Catalin Marinas <catalin.marinas@xxxxxxx> wrote:
>>> On Wed, May 14, 2014 at 07:17:29PM +0100, Ard Biesheuvel wrote:
>>>> +static u8 const *sha1_do_update(struct shash_desc *desc, const u8 *data,
>>>> +                                int blocks, u8 *head, unsigned int len)
>>>> +{
>>>> +        struct sha1_state *sctx = shash_desc_ctx(desc);
>>>> +        struct thread_info *ti = NULL;
>>>> +
>>>> +        /*
>>>> +         * Pass current's thread info pointer to sha1_ce_transform()
>>>> +         * below if we want it to play nice under preemption.
>>>> +         */
>>>> +        if ((IS_ENABLED(CONFIG_PREEMPT_VOLUNTARY) || IS_ENABLED(CONFIG_PREEMPT))
>>>> +            && (desc->flags & CRYPTO_TFM_REQ_MAY_SLEEP))
>>>> +                ti = current_thread_info();
>>>> +
>>>> +        do {
>>>> +                int rem;
>>>> +
>>>> +                kernel_neon_begin_partial(16);
>>>> +                rem = sha1_ce_transform(blocks, data, sctx->state, head, 0, ti);
>>>> +                kernel_neon_end();
>>>> +
>>>> +                data += (blocks - rem) * SHA1_BLOCK_SIZE;
>>>> +                blocks = rem;
>>>> +                head = NULL;
>>>> +        } while (unlikely(ti && blocks > 0));
>>>> +        return data;
>>>> +}
>>>
>>> What latencies are we talking about? Would it make sense to always
>>> call cond_resched() even if preemption is disabled?
>>
>> Calculating the SHA256 (which is the most striking example) hash of a
>> 4k buffer takes ~10 microseconds, so perhaps this is all a bit moot, as
>> I don't think there will generally be many in-kernel users hashing
>> substantially larger quantities of data at a time.
>
> And what's the maximum size? Could we do it in blocks of 4K?
>

Yes, that should be doable as well. Once we feel we actually need this
feature, that is probably a better approach.

>>> But I think we should have this loop always rescheduling if
>>> CRYPTO_TFM_REQ_MAY_SLEEP. I can see there is a crypto_yield() function
>>> that conditionally reschedules. What's the overhead of calling
>>> sha1_ce_transform() in a loop vs a single call for the entire data?
>>
>> For SHA256 we lose at least 10% if we call into the core function for
>> each 64-byte block rather than letting it chew away at all the data
>> itself.
>
> I was thinking more like calling it for bigger buffers (e.g. a page) at
> once and a crypto_yield() in between.
>
>> (This is due to the fact that it needs to load 64 32-bit round
>> keys.) But perhaps (considering the above) it is better to just shelve
>> the last 4 patches in this series until it becomes a problem for
>> anyone.
>
> For the time being I would drop the last 4 patches. We can revisit them
> for the next kernel version.
>

Agreed. Should I send an updated pull request?

-- 
Ard.
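
(For illustration only, not part of the series under review: a minimal sketch
of the page-at-a-time approach Catalin suggests above, assuming the
sha1_ce_transform() prototype from the quoted patch. The function name
sha1_do_update_chunked and the SHA1_BLOCKS_PER_PAGE macro are hypothetical.)

#include <linux/kernel.h>
#include <linux/linkage.h>
#include <crypto/algapi.h>
#include <crypto/internal/hash.h>
#include <crypto/sha.h>
#include <asm/neon.h>

/* core assembly transform, prototype as in the quoted patch */
asmlinkage int sha1_ce_transform(int blocks, u8 const *src, u32 *state,
				 u8 *head, long bytes, struct thread_info *ti);

/* hypothetical: number of SHA-1 blocks that fit in one page */
#define SHA1_BLOCKS_PER_PAGE	(PAGE_SIZE / SHA1_BLOCK_SIZE)

static u8 const *sha1_do_update_chunked(struct shash_desc *desc,
					const u8 *data, int blocks, u8 *head)
{
	struct sha1_state *sctx = shash_desc_ctx(desc);

	while (blocks > 0) {
		int chunk = min_t(int, blocks, SHA1_BLOCKS_PER_PAGE);

		/*
		 * A NULL thread_info pointer tells the transform to
		 * process the whole chunk; the chunk size itself bounds
		 * how long NEON is held with preemption disabled.
		 */
		kernel_neon_begin_partial(16);
		sha1_ce_transform(chunk, data, sctx->state, head, 0, NULL);
		kernel_neon_end();

		data += chunk * SHA1_BLOCK_SIZE;
		blocks -= chunk;
		head = NULL;

		/* calls cond_resched() iff CRYPTO_TFM_REQ_MAY_SLEEP is set */
		crypto_yield(desc->flags);
	}
	return data;
}

This keeps the per-call overhead (reloading the round keys) amortised over a
whole page rather than paid per 64-byte block, while each non-preemptible
section stays in the ~10 microsecond range measured above.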