On Thu, Jan 24, 2019 at 07:27:12PM +0100, Ard Biesheuvel wrote: > The SIMD routine ported from x86 used to have a special code path > for inputs < 16 bytes, which got lost somewhere along the way. > Instead, the current glue code aligns the input pointer to permit > the NEON routine to use special versions of the vld1 instructions > that assume 16 byte alignment, but this could result in inputs of > less than 16 bytes to be passed in. This description doesn't quite match the patch since the arm64 version of the assembly doesn't use any alignment specifiers. I take it that actually means the alignment in the glue code wasn't necessary in the first place? > This not only fails the new > extended tests that Eric has implemented, it also results in the > code reading before the input pointer, which could potentially > result in crashes when dealing with less than 16 bytes of input > at the start of a page which is preceded by an unmapped page. > > So update the glue code to only invoke the NEON routine if the > input is more than 16 bytes. > > Signed-off-by: Ard Biesheuvel <ard.biesheuvel@xxxxxxxxxx> Can you add: Fixes: 6ef5737f3931 ("crypto: arm64/crct10dif - port x86 SSE implementation to arm64") Cc: stable@xxxxxxxxxxxxxxx > --- > arch/arm64/crypto/crct10dif-ce-glue.c | 25 +++++--------------- > 1 file changed, 6 insertions(+), 19 deletions(-) > > diff --git a/arch/arm64/crypto/crct10dif-ce-glue.c b/arch/arm64/crypto/crct10dif-ce-glue.c > index b461d62023f2..567c24f3d224 100644 > --- a/arch/arm64/crypto/crct10dif-ce-glue.c > +++ b/arch/arm64/crypto/crct10dif-ce-glue.c > @@ -39,26 +39,13 @@ static int crct10dif_update(struct shash_desc *desc, const u8 *data, > unsigned int length) > { > u16 *crc = shash_desc_ctx(desc); > - unsigned int l; > > - if (unlikely((u64)data % CRC_T10DIF_PMULL_CHUNK_SIZE)) { > - l = min_t(u32, length, CRC_T10DIF_PMULL_CHUNK_SIZE - > - ((u64)data % CRC_T10DIF_PMULL_CHUNK_SIZE)); > - > - *crc = crc_t10dif_generic(*crc, data, l); > - > - length -= l; > - data += l; > - } > - > - if (length > 0) { > - if (may_use_simd()) { > - kernel_neon_begin(); > - *crc = crc_t10dif_pmull(*crc, data, length); > - kernel_neon_end(); > - } else { > - *crc = crc_t10dif_generic(*crc, data, length); > - } > + if (length >= CRC_T10DIF_PMULL_CHUNK_SIZE && may_use_simd()) { > + kernel_neon_begin(); > + *crc = crc_t10dif_pmull(*crc, data, length); > + kernel_neon_end(); > + } else { > + *crc = crc_t10dif_generic(*crc, data, length); > } > > return 0; > -- > 2.17.1 >