On Tue, May 05, 2020 at 03:53:45PM +0200, Arnd Bergmann wrote:
> When building for ARMv7-M, clang-9 or higher tries to unroll some loops,
> which ends up confusing the register allocator to the point of generating
> rather bad code and using more than the warning limit for stack frames:
>
> warning: stack frame size of 1200 bytes in function 'blake2b_compress' [-Wframe-larger-than=]
>
> Forcing it to not unroll the final loop avoids this problem.
>
> Fixes: 91d689337fe8 ("crypto: blake2b - add blake2b generic implementation")
> Signed-off-by: Arnd Bergmann <arnd@xxxxxxxx>
> ---
>  crypto/blake2b_generic.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/crypto/blake2b_generic.c b/crypto/blake2b_generic.c
> index 1d262374fa4e..0ffd8d92e308 100644
> --- a/crypto/blake2b_generic.c
> +++ b/crypto/blake2b_generic.c
> @@ -129,7 +129,9 @@ static void blake2b_compress(struct blake2b_state *S,
>  	ROUND(9);
>  	ROUND(10);
>  	ROUND(11);
> -
> +#ifdef CONFIG_CC_IS_CLANG

Given your comment in the bug:

"The code is written to assume no loops are unrolled"

Does it make sense to make this unconditional and take compiler
heuristics out of it?

> +#pragma nounroll /* https://bugs.llvm.org/show_bug.cgi?id=45803 */
> +#endif
>  	for (i = 0; i < 8; ++i)
>  		S->h[i] = S->h[i] ^ v[i] ^ v[i + 8];
>  }
> --
> 2.26.0
>
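
To make the question above concrete, here is a rough sketch of what an
unconditional variant might look like. It is only an illustration, not a
proposed replacement for the patch. NO_UNROLL and fold_state() are
hypothetical names made up for this example, and the sketch assumes that
a bare "#pragma nounroll" would trip -Wunknown-pragmas on GCC, so the
GCC branch uses "#pragma GCC unroll 1" (GCC 8 and later) instead:

```c
#include <stdint.h>

/*
 * Hypothetical helper, not part of the patch: expand to whichever
 * "do not unroll the next loop" pragma the compiler understands,
 * or to nothing if none is known.
 */
#if defined(__clang__)
#define NO_UNROLL _Pragma("nounroll")
#elif defined(__GNUC__) && __GNUC__ >= 8
#define NO_UNROLL _Pragma("GCC unroll 1")
#else
#define NO_UNROLL
#endif

/*
 * Stand-in for the final loop of blake2b_compress(): fold the 16-word
 * working vector v back into the 8-word chaining value h.
 */
static void fold_state(uint64_t h[8], const uint64_t v[16])
{
	int i;

	/* The pragma has to sit directly in front of the loop it affects. */
	NO_UNROLL
	for (i = 0; i < 8; ++i)
		h[i] = h[i] ^ v[i] ^ v[i + 8];
}
```

With something along these lines the loop keeps the same shape on every
compiler that has a suitable pragma, and the CONFIG_CC_IS_CLANG check is
no longer needed at the call site. Whether that is worth a new macro for
a single loop is a separate question.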