To revive this... On Fri, Aug 10, 2018 at 08:27:58AM +0200, Stephan Mueller wrote: > Am Donnerstag, 9. August 2018, 21:40:12 CEST schrieb Eric Biggers: > > Hi Eric, > > > while (bytes >= CHACHA20_BLOCK_SIZE) { > > chacha20_block(state, stream); > > - crypto_xor(dst, (const u8 *)stream, CHACHA20_BLOCK_SIZE); > > + crypto_xor(dst, stream, CHACHA20_BLOCK_SIZE); > > If we are at it, I am wondering whether we should use crypto_xor. At this > point we exactly know that the data is CHACHA20_BLOCK_SIZE bytes in length > which is divisible by u32. Hence, shouldn't we disregard crypto_xor in favor > of a loop iterating in 32 bits words? crypto_xor contains some checks for > trailing bytes which we could spare. crypto_xor() here is fine. It already meets the conditions for the inlined version that XOR's a long at a time: if (IS_ENABLED(CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS) && __builtin_constant_p(size) && (size % sizeof(unsigned long)) == 0) { unsigned long *d = (unsigned long *)dst; unsigned long *s = (unsigned long *)src; while (size > 0) { *d++ ^= *s++; size -= sizeof(unsigned long); } } And regardless, it's better to optimize crypto_xor() once, than all the callers. > > > bytes -= CHACHA20_BLOCK_SIZE; > > dst += CHACHA20_BLOCK_SIZE; > > } > > if (bytes) { > > chacha20_block(state, stream); > > - crypto_xor(dst, (const u8 *)stream, bytes); > > + crypto_xor(dst, stream, bytes); > > Same here. 'bytes' need not be a multiple of sizeof(u32) or sizeof(long), and 'dst' can have any alignment... So we should just call crypto_xor() which does the right thing, and is intended to do so as efficiently as possible. > > > @@ -1006,14 +1006,14 @@ static void _crng_backtrack_protect(struct > > crng_state *crng, used = 0; > > } > > spin_lock_irqsave(&crng->lock, flags); > > - s = &tmp[used / sizeof(__u32)]; > > + s = (__u32 *) &tmp[used]; > > As Yann said, wouldn't you have the alignment problem here again? > > Somehow, somebody must check the provided input buffer at one time. > I guess we should just explicitly align the 'tmp' buffers in _get_random_bytes() and extract_crng_user(). - Eric