On Sat, Jan 29, 2022 at 2:45 PM Ard Biesheuvel <ardb@xxxxxxxxxx> wrote:
>
> Modern compilers are perfectly capable of extracting parallelism from
> the XOR routines, provided that the prototypes reflect the nature of the
> input accurately, in particular, the fact that the input vectors are
> expected not to overlap. This is not documented explicitly, but is
> implied by the interchangeability of the various C routines, some of
> which use temporary variables while others don't: this means that these
> routines only behave identically for non-overlapping inputs.
>
> So let's decorate these input vectors with the __restrict modifier,
> which informs the compiler that there is no overlap. While at it, make
> the input-only vectors pointer-to-const as well.
>
> Tested-by: Nathan Chancellor <nathan@xxxxxxxxxx>
> Signed-off-by: Ard Biesheuvel <ardb@xxxxxxxxxx>

Thanks for the patch!
Reviewed-by: Nick Desaulniers <ndesaulniers@xxxxxxxxxx>

I like how you renamed the parameters in
arch/powerpc/include/asm/xor_altivec.h,
arch/powerpc/lib/xor_vmx.h, and
arch/powerpc/lib/xor_vmx_glue.c. It's not befitting for the suffix _in
to be used when the first param is technically more of an "inout"
param. Though, you might also want to update the parameter names in
arch/powerpc/lib/xor_vmx.c.

--
Thanks,
~Nick Desaulniers
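
For readers skimming the thread, the decoration described in the quoted
commit message can be sketched roughly as below. This is a minimal
userspace illustration, not the kernel code: the function name
xor_blocks_2 and its exact signature are hypothetical, chosen only to
show an "inout" first vector marked __restrict and an input-only second
vector marked both const and __restrict.

```c
#include <stddef.h>

/*
 * Hypothetical two-vector XOR helper (not the kernel's actual routine).
 *
 * p1 is read and written ("inout"); p2 is only read, so it is
 * pointer-to-const. __restrict promises the compiler that the two
 * buffers do not overlap, so it is free to load several words of p2
 * before storing to p1, and to vectorize the loop.
 */
static void xor_blocks_2(size_t bytes,
                         unsigned long * __restrict p1,
                         const unsigned long * __restrict p2)
{
    size_t words = bytes / sizeof(unsigned long);

    for (size_t i = 0; i < words; i++)
        p1[i] ^= p2[i];
}
```

Callers must honor the promise: passing overlapping buffers to a
function declared this way is undefined behavior, which is why the
commit message stresses that the routines were already only expected to
work on non-overlapping inputs.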