On Thursday, 9 August 2018, 21:40:12 CEST, Eric Biggers wrote:

Hi Eric,

>  	while (bytes >= CHACHA20_BLOCK_SIZE) {
>  		chacha20_block(state, stream);
> -		crypto_xor(dst, (const u8 *)stream, CHACHA20_BLOCK_SIZE);
> +		crypto_xor(dst, stream, CHACHA20_BLOCK_SIZE);

While we are at it, I am wondering whether we should use crypto_xor here at
all. At this point we know that the data is exactly CHACHA20_BLOCK_SIZE bytes
long, which is a multiple of sizeof(u32). Hence, shouldn't we drop crypto_xor
in favor of a loop iterating over 32-bit words? crypto_xor contains some
checks for trailing bytes which we could spare.

>  		bytes -= CHACHA20_BLOCK_SIZE;
>  		dst += CHACHA20_BLOCK_SIZE;
>  	}
>  	if (bytes) {
>  		chacha20_block(state, stream);
> -		crypto_xor(dst, (const u8 *)stream, bytes);
> +		crypto_xor(dst, stream, bytes);

Same here.

> @@ -1006,14 +1006,14 @@ static void _crng_backtrack_protect(struct crng_state *crng,
>  		used = 0;
>  	}
>  	spin_lock_irqsave(&crng->lock, flags);
> -	s = &tmp[used / sizeof(__u32)];
> +	s = (__u32 *) &tmp[used];

As Yann said, wouldn't you have the alignment problem here again? At some
point, somebody has to check the provided input buffer.

Ciao
Stephan
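
For illustration, the word-wise XOR suggested above for the full-block case
could look roughly like the following userspace sketch. The helper name
xor_full_block and the CHACHA20_BLOCK_WORDS macro are hypothetical and not part
of the patch. Since dst may be unaligned, the sketch moves each word with
memcpy instead of casting dst to a u32 pointer.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define CHACHA20_BLOCK_SIZE 64
/* Hypothetical convenience macro: number of 32-bit words per block. */
#define CHACHA20_BLOCK_WORDS (CHACHA20_BLOCK_SIZE / sizeof(uint32_t))

/*
 * Hypothetical helper: XOR one full keystream block into dst, one 32-bit
 * word at a time. There is no trailing-byte handling because the length
 * is fixed at CHACHA20_BLOCK_SIZE. dst is only assumed to be byte-aligned,
 * so each word is loaded and stored through memcpy.
 */
static void xor_full_block(uint8_t *dst, const uint32_t *stream)
{
	size_t i;

	for (i = 0; i < CHACHA20_BLOCK_WORDS; i++) {
		uint32_t w;

		memcpy(&w, dst + i * sizeof(w), sizeof(w));
		w ^= stream[i];
		memcpy(dst + i * sizeof(w), &w, sizeof(w));
	}
}
```

In kernel code the memcpy pair would presumably be replaced by the
get_unaligned()/put_unaligned() helpers. Whether this actually beats
crypto_xor would have to be measured.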