On Sat, May 12, 2018 at 11:09 AM, Ard Biesheuvel <ard.biesheuvel@xxxxxxxxxx> wrote: > (+ Arnd) > > On 12 May 2018 at 10:43, Dmitry Vyukov <dvyukov@xxxxxxxxxx> wrote: >> On Fri, Feb 2, 2018 at 11:18 PM, Eric Biggers <ebiggers3@xxxxxxxxx> wrote: >>> On Fri, Feb 02, 2018 at 02:57:32PM +0100, Dmitry Vyukov wrote: >>>> On Fri, Feb 2, 2018 at 2:48 PM, syzbot >>>> <syzbot+ffa3a158337bbc01ff09@xxxxxxxxxxxxxxxxxxxxxxxxx> wrote: >>>> > Hello, >>>> > >>>> > syzbot hit the following crash on upstream commit >>>> > 7109a04eae81c41ed529da9f3c48c3655ccea741 (Thu Feb 1 17:37:30 2018 +0000) >>>> > Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/ide >>>> > >>>> > So far this crash happened 4 times on net-next, upstream. >>>> > C reproducer is attached. >>>> > syzkaller reproducer is attached. >>>> > Raw console output is attached. >>>> > compiler: gcc (GCC) 7.1.1 20170620 >>>> > .config is attached. >>>> >>>> >>>> From suspicious frames I see salsa20_asm_crypt there, so +crypto maintainers. >>>> >>> >>> Looks like the x86 implementations of Salsa20 (both i586 and x86_64) need to be >>> updated to not use %ebp/%rbp. >> >> Ard, >> >> This was bisected as introduced by: >> >> commit 83dee2ce1ae791c3dc0c9d4d3a8d42cb109613f6 >> Author: Ard Biesheuvel <ard.biesheuvel@xxxxxxxxxx> >> Date: Fri Jan 19 12:04:34 2018 +0000 >> >> crypto: sha3-generic - rewrite KECCAK transform to help the >> compiler optimize >> >> https://gist.githubusercontent.com/dvyukov/47f93f5a0679170dddf93bc019b42f6d/raw/65beac8ddd30003bbd4e9729236dc8572094abf7/gistfile1.txt > > Ouch. > > I'm not an expert in x86 assembly. Could someone please check the > generated code to see what's going on? The C code changes are not that > intricate, they basically unroll a loop, replacing accesses to > 'array[indirect_index[i]]' with 'array[constant]'. > > As mentioned in the commit log, the speedup is more than significant > for architectures with lots of GPRs so I'd prefer fixing the patch > over reverting it (if there is anything wrong with the code in the > first place) I suspect the problem is with __attribute__((__optimize__("O3"))). It makes compiler use rbp register, which must not be used.