On 22 November 2017 at 21:29, Eric Biggers <ebiggers3@xxxxxxxxx> wrote:
> On Wed, Nov 22, 2017 at 08:51:57PM +0000, Ard Biesheuvel wrote:
>> On 22 November 2017 at 19:51, Eric Biggers <ebiggers3@xxxxxxxxx> wrote:
>> > From: Eric Biggers <ebiggers@xxxxxxxxxx>
>> >
>> > When chacha20_block() outputs the keystream block, it uses 'u32' stores
>> > directly. However, the callers (crypto/chacha20_generic.c and
>> > drivers/char/random.c) declare the keystream buffer as a 'u8' array,
>> > which is not guaranteed to have the needed alignment.
>> >
>> > Fix it by having both callers declare the keystream as a 'u32' array.
>> > For now this is preferable to switching over to the unaligned access
>> > macros because chacha20_block() is only being used in cases where we can
>> > easily control the alignment (stack buffers).
>> >
>>
>> Given this paragraph, I think we agree the correct way to fix this
>> would be to make chacha20_block() adhere to its prototype, so if we
>> deviate from that, there should be a good reason. On which
>> architecture that cares about alignment is this expected to result in
>> a measurable performance benefit?
>>
>
> Well, variables on the stack tend to be 4 or even 8-byte aligned anyway, so this
> change probably doesn't make a difference in practice currently. But it still
> should be fixed, in case it does become a problem.
>

Agreed.

> We could certainly leave the type as u8 array and use put_unaligned_le32()
> instead; that would be a simpler change. But that would be slower on
> architectures where a potentially-unaligned access requires multiple
> instructions.
>

The access itself would be slower, yes. But given the amount of work
performed in chacha20_block(), I seriously doubt that would actually
matter in practice.
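
For illustration, the two options discussed above can be sketched in
user-space C roughly as follows. This is not the kernel code: store_le32(),
emit_bytes() and emit_words() are hypothetical names, store_le32() only
approximates what put_unaligned_le32() does, and the word variant stores in
native byte order for brevity.

```c
#include <assert.h>
#include <stdint.h>

#define BLOCK_WORDS 16	/* a ChaCha20 block is 16 32-bit words (64 bytes) */

/*
 * Hypothetical stand-in for put_unaligned_le32(): writes a 32-bit value
 * in little-endian order one byte at a time, so the destination may have
 * any alignment.
 */
static void store_le32(uint8_t *dst, uint32_t v)
{
	dst[0] = (uint8_t)v;
	dst[1] = (uint8_t)(v >> 8);
	dst[2] = (uint8_t)(v >> 16);
	dst[3] = (uint8_t)(v >> 24);
}

/* Option 1: keep a byte-array output; works for any alignment of 'out'. */
static void emit_bytes(const uint32_t x[BLOCK_WORDS], uint8_t *out)
{
	for (int i = 0; i < BLOCK_WORDS; i++)
		store_le32(out + 4 * i, x[i]);
}

/*
 * Option 2: take a word-array output, so the type itself carries the
 * alignment requirement and plain 32-bit stores are safe. The assert
 * only documents that precondition. Stores are native-endian here
 * (a simplification for this sketch).
 */
static void emit_words(const uint32_t x[BLOCK_WORDS], uint32_t *out)
{
	assert((uintptr_t)out % _Alignof(uint32_t) == 0);
	for (int i = 0; i < BLOCK_WORDS; i++)
		out[i] = x[i];
}
```

With option 1 the caller may pass any byte pointer, including a
deliberately misaligned one; with option 2 the caller has to declare the
buffer as a 32-bit word array, which is what the patch does for the two
existing callers.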