On Wed, Mar 18, 2020 at 08:27:32PM -0600, Jason A. Donenfeld wrote:
> Prior, passing in chunks of 2, 3, or 4, followed by any additional
> chunks would result in the chacha state counter getting out of sync,
> resulting in incorrect encryption/decryption, which is a pretty nasty
> crypto vuln: "why do images look weird on webpages?" WireGuard users
> never experienced this prior, because we have always, out of tree, used
> a different crypto library, until the recent Frankenzinc addition. This
> commit fixes the issue by advancing the pointers and state counter by
> the actual size processed. It also fixes up a bug in the (optional,
> costly) stride test that prevented it from running on arm64.
>
> Fixes: b3aad5bad26a ("crypto: arm64/chacha - expose arm64 ChaCha routine as library function")
> Reported-and-tested-by: Emil Renner Berthing <kernel@xxxxxxxx>
> Cc: Ard Biesheuvel <ardb@xxxxxxxxxx>
> Cc: stable@xxxxxxxxxxxxxxx # v5.5+
> Signed-off-by: Jason A. Donenfeld <Jason@xxxxxxxxx>
> ---
>  arch/arm64/crypto/chacha-neon-glue.c   |  8 ++++----
>  lib/crypto/chacha20poly1305-selftest.c | 11 ++++++++---
>  2 files changed, 12 insertions(+), 7 deletions(-)
>
> diff --git a/arch/arm64/crypto/chacha-neon-glue.c b/arch/arm64/crypto/chacha-neon-glue.c
> index c1f9660d104c..37ca3e889848 100644
> --- a/arch/arm64/crypto/chacha-neon-glue.c
> +++ b/arch/arm64/crypto/chacha-neon-glue.c
> @@ -55,10 +55,10 @@ static void chacha_doneon(u32 *state, u8 *dst, const u8 *src,
>  			break;
>  		}
>  		chacha_4block_xor_neon(state, dst, src, nrounds, l);
> -		bytes -= CHACHA_BLOCK_SIZE * 5;
> -		src += CHACHA_BLOCK_SIZE * 5;
> -		dst += CHACHA_BLOCK_SIZE * 5;
> -		state[12] += 5;
> +		bytes -= l;
> +		src += l;
> +		dst += l;
> +		state[12] += DIV_ROUND_UP(l, CHACHA_BLOCK_SIZE);
>  	}
>  }
>
> diff --git a/lib/crypto/chacha20poly1305-selftest.c b/lib/crypto/chacha20poly1305-selftest.c
> index c391a91364e9..fa43deda2660 100644
> --- a/lib/crypto/chacha20poly1305-selftest.c
> +++ b/lib/crypto/chacha20poly1305-selftest.c
> @@ -9028,10 +9028,15 @@ bool __init chacha20poly1305_selftest(void)
>  	     && total_len <= 1 << 10; ++total_len) {
>  		for (i = 0; i <= total_len; ++i) {
>  			for (j = i; j <= total_len; ++j) {
> +				k = 0;
>  				sg_init_table(sg_src, 3);
> -				sg_set_buf(&sg_src[0], input, i);
> -				sg_set_buf(&sg_src[1], input + i, j - i);
> -				sg_set_buf(&sg_src[2], input + j, total_len - j);
> +				if (i)
> +					sg_set_buf(&sg_src[k++], input, i);
> +				if (j - i)
> +					sg_set_buf(&sg_src[k++], input + i, j - i);
> +				if (total_len - j)
> +					sg_set_buf(&sg_src[k++], input + j, total_len - j);
> +				sg_init_marker(sg_src, k);
>  				memset(computed_output, 0, total_len);
>  				memset(input, 0, total_len);
>

Reviewed-by: Eric Biggers <ebiggers@xxxxxxxxxx>

Herbert, can you send this to Linus for 5.6?

- Eric
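
For anyone following along, the counter drift described in the commit message can be illustrated with a small userspace sketch. This models only the bookkeeping in the quoted hunk, not the NEON code itself. The function names counter_advance_old() and counter_advance_new() are hypothetical, and the loop shape around the hunk (a per-iteration length l capped at five blocks, plus a single-block path that bumps the counter by one and breaks) is an assumption inferred from the "* 5" in the removed lines and the "break;" in the diff context.

```c
#include <stddef.h>
#include <stdint.h>

#define CHACHA_BLOCK_SIZE 64
#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/* Assumed cap on bytes handled per loop iteration (five blocks). */
#define MAX_STRIDE (CHACHA_BLOCK_SIZE * 5)

/*
 * Hypothetical model of the pre-patch accounting: every multi-block
 * iteration advances the counter by 5, regardless of how many bytes
 * were actually processed. Returns the total counter advance for one
 * call covering `bytes` bytes.
 */
uint32_t counter_advance_old(size_t bytes)
{
	long remaining = (long)bytes;
	uint32_t counter = 0;

	while (remaining > 0) {
		long l = remaining < MAX_STRIDE ? remaining : MAX_STRIDE;

		if (l <= CHACHA_BLOCK_SIZE) {
			counter += 1;	/* assumed single-block path */
			break;
		}
		remaining -= MAX_STRIDE;
		counter += 5;
	}
	return counter;
}

/*
 * Hypothetical model of the post-patch accounting: advance by the
 * number of blocks actually consumed, rounding a partial block up.
 */
uint32_t counter_advance_new(size_t bytes)
{
	long remaining = (long)bytes;
	uint32_t counter = 0;

	while (remaining > 0) {
		long l = remaining < MAX_STRIDE ? remaining : MAX_STRIDE;

		if (l <= CHACHA_BLOCK_SIZE) {
			counter += 1;	/* assumed single-block path */
			break;
		}
		remaining -= l;
		counter += DIV_ROUND_UP(l, CHACHA_BLOCK_SIZE);
	}
	return counter;
}
```

Under these assumptions, a 128-byte chunk (2 blocks) moves the counter by 5 in the old model and by 2 in the new one, so any chunk that follows would be processed with a keystream offset that is 3 blocks too far ahead. For a full 320-byte chunk both models agree, which would be consistent with the bug only showing up for chunks of 2, 3, or 4 blocks.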