Re: THP broken on OCTEON?

On 05/23/2016 11:52 AM, Aaro Koskinen wrote:
> On Mon, May 23, 2016 at 09:21:22AM -0700, David Daney wrote:
>> On 05/23/2016 08:20 AM, Ralf Baechle wrote:
>>> On Mon, May 23, 2016 at 06:13:46PM +0300, Aaro Koskinen wrote:
>>>> I'm getting kernel crashes (see below) reliably when building Perl in
>>>> parallel (make -j16) on OCTEON EBH5600 board (8 cores, 4 GB RAM) with
>>>> Linux 4.6.
>>>>
>>>> It seems that CONFIG_TRANSPARENT_HUGEPAGE has something to do with the
>>>> issue - disabling it makes build go through fine.
>>>>
>>>> Any ideas?
>>>
>>> I thought it was working except on SGI Origin 200/2000 aka IP27 where
>>> Joshua Kinard (added to cc) was hitting issues as well.
>>>
>>> Joshua, does that look similar to the issues you were hitting?
>>
>> There is nothing OCTEON specific in the THP code, or huge pages in general.
>>
>> That said, we have seen other THP related failures, and have never been able
>> to find the cause.
>>
>> If someone can come up with a reproducible test case that triggers quickly,
>> we can run it in our simulator and easily find the problem.
>
> Trying to build Perl is a reliable reproducer. Is that too heavyweight
> for your simulator?
>
> I was able to reproduce this also on EdgeRouter Pro, but there the kernel
> does not fail, only compiler dies with SIGBUS:
>
> [  315.095264] Data bus error, epc == 0000000000a801c4, ra == 0000000000a80624
>
> And without THP the build is fine.
>
> I also tried CN68XX board with 16 GB RAM and also there I get SIGBUS failure
> instead of Machine Check.


Yes. I think the problem is some sort of corruption of the page tables.
This may show up as Machine Check errors, bus errors, or SIGSEGV.

David.




