On Mon, 14 Oct 2019 at 16:14, Jason A. Donenfeld <Jason@xxxxxxxxx> wrote:
>
> Hi Ard,
>
> On Mon, Oct 7, 2019 at 6:46 PM Ard Biesheuvel <ard.biesheuvel@xxxxxxxxxx> wrote:
> > Arnd reports that the 32-bit generic library code for Curve25519 ends
> > up using an excessive amount of stack space when built with Clang:
> >
> >   lib/crypto/curve25519-fiat32.c:756:6: error: stack frame size
> >   of 1384 bytes in function 'curve25519_generic'
> >   [-Werror,-Wframe-larger-than=]
> >
> > Let's give some hints to the compiler regarding which routines should
> > not be inlined, to prevent it from running out of registers and spilling
> > to the stack. The resulting code performs identically under both GCC
> > and Clang, and makes the warning go away.
>
> Are you *sure* about that? Couldn't we fix clang instead? I'd rather
> fixes go there instead of gimping this. The reason is that I noticed
> before that this code, performance-wise, was very inlining sensitive.
> Can you benchmark this on ARM32-noneon and on MIPS32? If there's a
> performance difference there, then maybe you can defer this part of
> the series until after the rest lands, and then we'll discuss at
> length various strategies? Alternatively, if you benchmark those and
> it also makes no difference, then it indeed makes no difference.
>

I tested this using a 32-bit ARM VM running under a 64-bit KVM hypervisor,
doing 100 iterations of the selftest.
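
For anyone following along, the kind of hint the quoted patch description
refers to can be sketched roughly as below. This is only an illustration, not
the code from the patch: the names demo_fe, demo_fe_add, demo_fe_sum3 and
DEMO_NOINLINE are made up, and the macro assumes the GCC/Clang attribute
syntax (in the kernel the same attribute is spelled `noinline`). Marking the
small helper noinline keeps its temporaries out of the caller's stack frame,
at the cost of a real function call.

```c
#include <stdint.h>

/* Assumption: GCC/Clang attribute syntax; the kernel spells this `noinline`. */
#define DEMO_NOINLINE __attribute__((noinline))

/* Hypothetical 10-limb field element, for illustration only. */
typedef struct {
	uint32_t v[10];
} demo_fe;

/* Helper kept out of line so its locals do not inflate the caller's frame. */
static DEMO_NOINLINE void demo_fe_add(demo_fe *out, const demo_fe *a,
				      const demo_fe *b)
{
	for (int i = 0; i < 10; i++)
		out->v[i] = a->v[i] + b->v[i];
}

/* Caller that would otherwise absorb every inlined copy of the helper. */
static uint32_t demo_fe_sum3(const demo_fe *a, const demo_fe *b,
			     const demo_fe *c)
{
	demo_fe t, u;
	uint32_t acc = 0;

	demo_fe_add(&t, a, b);	/* t = a + b, limb by limb */
	demo_fe_add(&u, &t, c);	/* u = t + c */
	for (int i = 0; i < 10; i++)
		acc += u.v[i];	/* fold the limbs into one value */
	return acc;
}
```

Whether such a hint costs anything measurable depends on the compiler and the
target, which is exactly what the benchmark request above is about.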