On 01/22/2017 20:03, Joshua Kinard wrote:
> On 01/22/2017 18:28, Joshua Kinard wrote:
>> I think I've run into a really odd gcc-6.3.x miscompile bug here on IP27.
>> But I'm not sure.  I've reproduced the issue on 4.9.5, 4.8.17, and now
>> 4.7.10 (which I KNOW should boot).  If I recompile the same 4.7.10 kernel
>> with gcc-5.4.0, though, it boots as expected.  The fault appears to be in
>> the assembly for _raw_spin_lock_irq.
>>
>
> Figured it out.  Not 100% sure WHY, but gcc-6.3.x is causing kbuild to parse
> the arch/mips/sgi-ip32/Platform file for some reason on both IP27 and IP30
> builds, and is thusly appending -mr10k-cache-barrier=load-store to the kernel
> CFLAGS.  It did this on my Octane's kernel as well, but the Octane seems to be
> unaffected by the extraneous cache barriers.  I sent a fix in for this a long
> time ago, but it never got accepted.  So I'll try again...
>

Nope.  I was wrong.  It still happens even after fixing the erroneous
-mr10k-cache-barrier thing.  I'll still send a patch in for that later, but
looking at other sections of the disassembly, I am seeing a pattern of this
"sd zero,..." instruction being placed at the beginning of most functions,
before the "daddiu" instruction that adjusts the stack pointer.
I even test-compiled a vanilla kernel as well, and the same issue is happening
there when looking at disassembly (test boot also Oopses):

Examples:

a80000000001c400 <run_init_process>:
a80000000001c400:	ffa0bff0 	sd	zero,-16400(sp)
a80000000001c404:	67bdfff0 	daddiu	sp,sp,-16

a80000000001d740 <per_cpu_init>:
a80000000001d740:	ffa0bfc0 	sd	zero,-16448(sp)
a80000000001d744:	2405ffc9 	li	a1,-55
a80000000001d748:	67bdffc0 	daddiu	sp,sp,-64

a80000000001cea0 <ip27_be_handler>:
a80000000001cea0:	ffa0bfe0 	sd	zero,-16416(sp)
a80000000001cea4:	67bdffe0 	daddiu	sp,sp,-32

a8000000000256c0 <__compute_return_epc>:
a8000000000256c0:	ffa0bff0 	sd	zero,-16400(sp)
a8000000000256c4:	67bdfff0 	daddiu	sp,sp,-16

a80000000001c5b0 <name_to_dev_t>:
a80000000001c5b0:	ffa0bf90 	sd	zero,-16496(sp)
a80000000001c5b4:	3c05a800 	lui	a1,0xa800
a80000000001c5b8:	3c020074 	lui	v0,0x74
a80000000001c5bc:	64a50000 	daddiu	a1,a1,0
a80000000001c5c0:	64424840 	daddiu	v0,v0,18496
a80000000001c5c4:	0005283c 	dsll32	a1,a1,0x0
a80000000001c5c8:	67bdff90 	daddiu	sp,sp,-112

I am not sure what to call this.  This is definitely not happening with a
gcc-5.4.x-built kernel, so it's a code-generation issue of some kind:

a80000000001c400 <run_init_process>:
a80000000001c400:	67bdfff0 	daddiu	sp,sp,-16
a80000000001c404:	3c02007b 	lui	v0,0x7b

a80000000001cec0 <ip27_be_handler>:
a80000000001cec0:	67bdffe0 	daddiu	sp,sp,-32
a80000000001cec4:	ffb00000 	sd	s0,0(sp)

a80000000001c5a0 <name_to_dev_t>:
a80000000001c5a0:	3c05a800 	lui	a1,0xa800
a80000000001c5a4:	3c020075 	lui	v0,0x75
a80000000001c5a8:	64a50000 	daddiu	a1,a1,0
a80000000001c5ac:	64423f40 	daddiu	v0,v0,16192
a80000000001c5b0:	0005283c 	dsll32	a1,a1,0x0
a80000000001c5b4:	67bdff90 	daddiu	sp,sp,-112

Oddly enough, Octane is definitely not bothered by this extraneous
store-doubleword instruction.  Only IP27 appears to be, which may explain why
it's gone unnoticed thus far.  Maybe NUMA-related?
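
For anyone who wants to cross-check the raw opcode words against the mnemonics
in the gcc-6.3.x listings, here is a minimal Python sketch that decodes just
the two I-type instructions involved ("sd" and "daddiu") and compares the
store offset with the stack adjustment in each function.  The names decode(),
PAIRS and gaps() are made up for this illustration, and the opcode and
register tables cover only what appears in the listings; this is not a general
MIPS disassembler.

```python
# Minimal decoder for the two MIPS64 I-type instructions in the listings.
# I-type field layout: opcode[31:26] rs[25:21] rt[20:16] imm[15:0]

OPCODES = {0x3F: "sd", 0x19: "daddiu"}   # only the two opcodes needed here
REGS = {0: "zero", 29: "sp"}             # only the two registers needed here


def decode(word):
    """Split a 32-bit I-type word into (mnemonic, rs, rt, signed imm)."""
    opcode = (word >> 26) & 0x3F
    rs = (word >> 21) & 0x1F             # base register for sd
    rt = (word >> 16) & 0x1F             # stored register for sd
    imm = word & 0xFFFF
    if imm & 0x8000:                     # sign-extend the 16-bit immediate
        imm -= 0x10000
    return OPCODES[opcode], REGS[rs], REGS[rt], imm


# (sd word, daddiu word) pairs copied from the gcc-6.3.x disassembly.
PAIRS = {
    "run_init_process":     (0xFFA0BFF0, 0x67BDFFF0),
    "per_cpu_init":         (0xFFA0BFC0, 0x67BDFFC0),
    "ip27_be_handler":      (0xFFA0BFE0, 0x67BDFFE0),
    "__compute_return_epc": (0xFFA0BFF0, 0x67BDFFF0),
    "name_to_dev_t":        (0xFFA0BF90, 0x67BDFF90),
}


def gaps():
    """Return store offset minus stack adjustment for each function."""
    result = {}
    for name, (sd_word, daddiu_word) in PAIRS.items():
        store_offset = decode(sd_word)[3]
        frame_adjust = decode(daddiu_word)[3]
        result[name] = store_offset - frame_adjust
    return result


if __name__ == "__main__":
    for name, gap in gaps().items():
        print(name, gap)
```

In all five pairs the difference comes out to -16384, so each store lands
exactly 16 KiB below where the stack pointer ends up after the "daddiu".
That is only an observation about the numbers in the listings, not a
diagnosis of why gcc-6.3.x emits the store.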

-- 
Joshua Kinard
Gentoo/MIPS
kumba@xxxxxxxxxx
6144R/F5C6C943 2015-04-27
177C 1972 1FB8 F254 BAD0 3E72 5C63 F4E3 F5C6 C943

"The past tempts us, the present confuses us, the future frightens us.  And our
lives slip away, moment by moment, lost in that vast, terrible in-between."

--Emperor Turhan, Centauri Republic