On 05/23/2016 11:20, Ralf Baechle wrote:
> On Mon, May 23, 2016 at 06:13:46PM +0300, Aaro Koskinen wrote:
>
>> I'm getting kernel crashes (see below) reliably when building Perl in
>> parallel (make -j16) on OCTEON EBH5600 board (8 cores, 4 GB RAM) with
>> Linux 4.6.
>>
>> It seems that CONFIG_TRANSPARENT_HUGEPAGE has something to do with the
>> issue - disabling it makes build go through fine.
>>
>> Any ideas?
>
> I thought it was working except on SGI Origin 200/2000 aka IP27 where
> Joshua Kinard (added to cc) was hitting issues as well.
>
> Joshua, does that similar to the issues you were hitting?
>
>   Ralf

NAK, this issue looks completely different to IP30/IP27. In this case,
it looks like the hardware is detecting the case where multiple TLB
entries match and it's killing the machine to avoid hardware damage. I
don't want to know how the SGI systems handle this scenario (does the
R10000 do a TLB shutdown??).

On IP30, using THP usually results in instruction bus errors (IBE),
after a set time, depending on the machine's configuration (<2GB RAM,
virtually instant on userland init; >2GB RAM, might survive for a few
minutes, even getting all the way to runlevel 3 randomly).

IP27 was somewhat similar to IP30, in that THP usually results in IBEs
after a few seconds of hitting userland bringup (bash is pretty quick at
triggering an IBE), but I haven't tried experimenting with varying the
amount of RAM in that machine, due to the fragility of pulling the
nodeboards out constantly.

I also haven't tried THP since refactoring/rewriting the IP27 code back
in Feb to see if I magically fixed it...

--
Joshua Kinard
Gentoo/MIPS
kumba@xxxxxxxxxx
6144R/F5C6C943 2015-04-27
177C 1972 1FB8 F254 BAD0 3E72 5C63 F4E3 F5C6 C943

"The past tempts us, the present confuses us, the future frightens us.
And our lives slip away, moment by moment, lost in that vast, terrible
in-between."

--Emperor Turhan, Centauri Republic