On Mon, Jan 31, 2022 at 10:13 AM Nick Desaulniers <ndesaulniers@xxxxxxxxxx> wrote:
>
> On Sat, Jan 29, 2022 at 2:45 PM Ard Biesheuvel <ardb@xxxxxxxxxx> wrote:
> >
> > Modern compilers are perfectly capable of extracting parallelism from
> > the XOR routines, provided that the prototypes reflect the nature of the
> > input accurately, in particular, the fact that the input vectors are
> > expected not to overlap. This is not documented explicitly, but is
> > implied by the interchangeability of the various C routines, some of
> > which use temporary variables while others don't: this means that these
> > routines only behave identically for non-overlapping inputs.
> >
> > So let's decorate these input vectors with the __restrict modifier,
> > which informs the compiler that there is no overlap. While at it, make
> > the input-only vectors pointer-to-const as well.
> >
> > Tested-by: Nathan Chancellor <nathan@xxxxxxxxxx>
> > Signed-off-by: Ard Biesheuvel <ardb@xxxxxxxxxx>
>
> Thanks for the patch!
> Reviewed-by: Nick Desaulniers <ndesaulniers@xxxxxxxxxx>
>
> I like how you renamed the parameters in
> arch/powerpc/include/asm/xor_altivec.h, arch/powerpc/lib/xor_vmx.h,
> and arch/powerpc/lib/xor_vmx_glue.c. It's not befitting for the
> suffix _in to be used when the first param is technically more of an
> "inout" param. Though, you might also want to update the parameter
> names in arch/powerpc/lib/xor_vmx.c.

Also, this patch fixes an instance of -Wframe-larger-than we observed
with ppc:
Link: https://github.com/ClangBuiltLinux/linux/issues/563

-- 
Thanks,
~Nick Desaulniers
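
For readers unfamiliar with the qualifier discussed in the quoted commit message, here is a minimal sketch of what a restrict-qualified two-source XOR routine could look like. It is an illustration only, not the code from the patch: the names xor_words_2 and xor_words_2_selfcheck are hypothetical, and it uses the C99 spelling `restrict` where the patch uses `__restrict`.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper, not the kernel's actual XOR routine: XOR `words`
 * unsigned longs from p2 into p1. The restrict qualifiers promise the
 * compiler that p1 and p2 do not overlap, so it may reorder loads from
 * p2 relative to stores to p1 and vectorize the loop more freely.
 * p2 is input-only, hence pointer-to-const; p1 is read and written.
 */
void xor_words_2(size_t words, unsigned long *restrict p1,
                 const unsigned long *restrict p2)
{
    for (size_t i = 0; i < words; i++)
        p1[i] ^= p2[i];
}

/* Checks the helper against hand-computed values on disjoint buffers. */
void xor_words_2_selfcheck(void)
{
    unsigned long dst[4] = { 0x0f, 0xf0, 0xff, 0x00 };
    const unsigned long src[4] = { 0xff, 0xff, 0xff, 0xff };
    const unsigned long want[4] = { 0xf0, 0x0f, 0x00, 0xff };

    xor_words_2(4, dst, src);
    for (size_t i = 0; i < 4; i++)
        assert(dst[i] == want[i]);
}
```

Calling such a routine with overlapping buffers would be undefined behavior, which is the non-overlap contract the commit message says was previously only implied.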