On Sat, Aug 27, 2022 at 03:59:37PM -0700, Josh Poimboeuf wrote:
> > > While working on another s390 issue, I was getting intermittent boot
> > > failures in __nospec_revert() when it tried to access 'instr[0]'. I
> > > noticed the __nospec_call_start address ended in 'ff'. This patch
> > > seemed to fix it. I have no idea why it was (only sometimes) failing in
> > > the first place.
...
> > > +	. = ALIGN(4);
> > >  	.nospec_call_table : {
> > >  		__nospec_call_start = . ;
> > >  		*(.s390_indirect*)
> > ...
> > Unfortunately I was unable to let any compiler generate code, that
> > would use the larl instruction. Instead the address of
> > nospec_call_table was loaded indirectly via the GOT, which again works
> > always, regardless if the table starts at an even or uneven address.
> >
> > This needs to be fixed anyway, and your patch certainly is correct.
> >
> > Could you maybe share your kernel config + compiler version, if you
> > are still able to reproduce this?
>
> I think the trick is to disable CONFIG_RELOCATABLE. When I compile with
> CONFIG_RELOCATABLE=n and "gcc version 11.3.1 20220421 (Red Hat 11.3.1-2)
> (GCC)", I get the following in nospec_init_branches():
>
>   2a8: c0 20 00 00 00 00 larl %r2,2a8 <nospec_init_branches+0x30> 2aa: R_390_PC32DBL __nospec_call_start+0x2
>
> That said, I still haven't been able to figure out how to recreate the
> program check in __nospec_revert(), even when the nospec_call_table
> starts at an odd offset.

Right, CONFIG_RELOCATABLE=n will do the trick. I don't know why you
cannot recreate it, however on my system it crashes instantly when I
make sure that __nospec_call_start starts at an odd address.
Apparently 'instr = (u8 *) epo + *epo;' in __nospec_revert() may result in
a very large address: without KASLR the kernel is located at a low address,
so it only takes one entry in the incorrectly accessed nospec_call_table
with a large negative value for '*epo' for the addition to wrap around and
yield a very large address for 'instr'. This then results in the program
check / addressing exception you've seen when the kernel tried to access
'instr[0]'.

I'll apply your patch. Thanks a lot for debugging and reporting!
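
For readers following along, the failure mode described above can be sketched in plain user-space C. This is only an illustrative model, not kernel code: larl_resolve(), read_be_s32(), entry_target(), MEM_BASE and the byte values in mem[] are all made up for the example. It assumes that a larl / R_390_PC32DBL displacement is stored in halfwords, so bit 0 of an odd target address is lost, and that table entries are big-endian s32 offsets relative to their own address, as the 'instr = (u8 *) epo + *epo;' expression suggests.

```c
#include <stdint.h>

/* Hypothetical memory image: one padding byte at 0x1000, then a table that
 * starts at the odd address 0x1001 with two entries (-0x20 and +0x40). */
#define MEM_BASE 0x1000ULL
static const uint8_t mem[] = {
	0x00,                   /* 0x1000: padding before the table */
	0xff, 0xff, 0xff, 0xe0, /* 0x1001: entry 0, offset -0x20    */
	0x00, 0x00, 0x00, 0x40, /* 0x1005: entry 1, offset +0x40    */
};

/* Model of a PC-relative load with a halfword-scaled displacement:
 * the byte distance is rounded down to an even value before use. */
static uint64_t larl_resolve(uint64_t pc, uint64_t target)
{
	int64_t delta = (int64_t)(target - pc);
	int64_t halfwords = (delta - (delta & 1)) / 2;

	return pc + (uint64_t)(halfwords * 2);
}

/* Read a big-endian 32-bit value and sign-extend it to 64 bits. */
static int64_t read_be_s32(const uint8_t *p)
{
	uint32_t v = ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
		     ((uint32_t)p[2] << 8) | (uint32_t)p[3];

	return (int64_t)v - ((v & 0x80000000u) ? ((int64_t)1 << 32) : 0);
}

/* Same arithmetic as 'instr = (u8 *) epo + *epo;', done on plain integers
 * so that the unsigned wraparound is visible and well defined. */
static uint64_t entry_target(uint64_t entry_addr)
{
	int64_t rel = read_be_s32(&mem[entry_addr - MEM_BASE]);

	return entry_addr + (uint64_t)rel;
}
```

In this model larl_resolve(0x2a8, 0x1001) yields 0x1000, so the table walk starts one byte early. Reading at the correct addresses 0x1001 and 0x1005 gives the small targets 0xfe1 and 0x1045, while the misaligned read at 0x1004 picks up the bytes e0 00 00 00, a large negative offset, and entry_target(0x1004) wraps to 0xffffffffe0001004. With the ALIGN(4) from the patch the table start is even, and larl_resolve() returns it unchanged.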