On Thursday, 9 August 2018 at 21:21:32 CEST, Theodore Y. Ts'o wrote:

Hi Theodore,

> I'm wondering whether we have kernel code that actually tries to
> extract more than 64 bytes, so I'm not sure how often we enter the
> while loop at all. Out of curiosity, did you find this from code
> inspection or via a kernel fault on some architecture that doesn't
> allow unaligned u32 memory accesses?

I found it through code inspection during performance assessments, and
because we have such alignment checks in the kernel crypto API. My LRNG
implementation using ChaCha20 is between 20% and 120% faster, and I
would like random.c to benefit from such a speed improvement as well.

> That should be
>
> 	if (likely(((unsigned long)buf & (sizeof(tmp[0]) - 1)) == 0) {
>
> shouldn't it?

As I said to Eric, my patch is of course reversed.

> If we *did* have callers who were calling get_random_bytes with nbytes
> significantly larger than 64 bytes (CHACHA20_BLOCK_SIZE), I wonder if
> we should be doing something like this:
>
> 	while (nbytes >= CHACHA20_BLOCK_SIZE) {
> 		int adjust = (unsigned long)buf & (sizeof(tmp[0]) - 1);
>
> 		extract_crng(buf);

Why this line?

> 		buf += CHACHA20_BLOCK_SIZE;
> 		if (likely(adjust == 0)) {
> 			extract_crng(buf);
> 			buf += CHACHA20_BLOCK_SIZE;
> 			nbytes -= CHACHA20_BLOCK_SIZE;
> 		} else {
> 			extract_crng(tmp);
> 			memcpy(buf, tmp, CHACHA20_BLOCK_SIZE - adjust);
> 			buf += CHACHA20_BLOCK_SIZE - adjust;
> 			nbytes -= CHACHA20_BLOCK_SIZE - adjust;

Sure, why not.

> 		}
>
> 	}
>
> This may be a hyper optimization, though --- it's not clear how often,
> say the kernel would be calling get_random_bytes with size >> 64 at
> all, never mind with an unaligned buffer.

I agree it is unlikely that we have unaligned buffers. But in case we
do, we have the potential to overwrite memory that does not belong to
us, with unknown consequences. A sketch of the loop without the extra
extract_crng() call is appended below.

Ciao
Stephan
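
P.S.: To make this concrete, here is an untested sketch of the loop
without the extract_crng(buf) call I questioned above. It takes the
memcpy() detour through tmp only for the first, unaligned chunk, which
realigns buf so that all remaining full blocks are written in place.
It assumes the extract_crng(), tmp and CHACHA20_BLOCK_SIZE definitions
from the quoted code; it is not the actual random.c code.

	/*
	 * Sketch: fill buf with full ChaCha20 blocks, using the aligned
	 * stack buffer tmp only when buf is not u32-aligned.  The first
	 * unaligned iteration copies CHACHA20_BLOCK_SIZE - adjust bytes,
	 * which leaves buf u32-aligned for all following iterations,
	 * since CHACHA20_BLOCK_SIZE is a multiple of sizeof(tmp[0]).
	 */
	while (nbytes >= CHACHA20_BLOCK_SIZE) {
		unsigned long adjust =
			(unsigned long)buf & (sizeof(tmp[0]) - 1);

		if (likely(adjust == 0)) {
			/* Aligned: let extract_crng() write in place. */
			extract_crng(buf);
			buf += CHACHA20_BLOCK_SIZE;
			nbytes -= CHACHA20_BLOCK_SIZE;
		} else {
			/*
			 * Unaligned: generate into tmp and copy only
			 * enough bytes to realign buf for the next
			 * iteration.
			 */
			extract_crng(tmp);
			memcpy(buf, tmp, CHACHA20_BLOCK_SIZE - adjust);
			buf += CHACHA20_BLOCK_SIZE - adjust;
			nbytes -= CHACHA20_BLOCK_SIZE - adjust;
		}
	}
	/*
	 * A final partial block (nbytes < CHACHA20_BLOCK_SIZE) would
	 * still go through tmp plus memcpy, as in the existing code.
	 */

Without the unconditional extract_crng(buf), each iteration writes
exactly as many bytes as it subtracts from nbytes, so nothing can spill
past the caller's buffer.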