Re: IP27: CONFIG_TRANSPARENT_HUGEPAGE triggers bus errors

Thomas,

can you test CONFIG_TRANSPARENT_HUGEPAGE on an IP28?

All in all the R10000's TLB is unproblematic; my gut feeling is that
something else specific to IP27 is spoiling the broth.

  Ralf

On Mon, Nov 10, 2014 at 02:04:10AM -0500, Joshua Kinard wrote:
> Date:   Mon, 10 Nov 2014 02:04:10 -0500
> From: Joshua Kinard <kumba@xxxxxxxxxx>
> To: David Daney <ddaney.cavm@xxxxxxxxx>
> CC: Ralf Baechle <ralf@xxxxxxxxxxxxxx>, Linux MIPS List
>  <linux-mips@xxxxxxxxxxxxxx>
> Subject: Re: IP27: CONFIG_TRANSPARENT_HUGEPAGE triggers bus errors
> Content-Type: text/plain; charset=windows-1252
> 
> On 11/08/2014 19:09, Joshua Kinard wrote:
> > On 11/07/2014 13:30, David Daney wrote:
> >> On 11/07/2014 02:22 AM, Joshua Kinard wrote:
> >> [...]
> >>>
> >>> So my guess is unless hugepages can happen in powers of 4,
> >>
> >> Huge pages are currently only supported on MIPS64 for this reason.
> >>
> >> huge_page_mask_size = (normal_page_size/8 * normal_page_size) / 2;
> >>
> >> If you take log2 of everything you get
> >>
> >> huge_page_mask_bits = normal_page_bits - 3 + normal_page_bits - 1
> >>   = 2 * normal_page_bits - 4 (always even)
> >>
> >> So all page sizes result in huge pages that meet the power of 4 criterion.
> > 
> > Well, looks like I'll have to bisect to hunt the problem down.  Obviously there
> > is something with transparent hugepages that the R10K-family dislikes.  Just a
> > question of "what?".  Seems like I'm the only one left with this kind of
> > equipment and interest to play with it :)
> 
> I gave up on bisecting this.  3.7 and 3.9 kernels are not bootable on my Onyx2
> w/o additional patches to fix the PCI probing code to deal with the card cage I
> have in my system (basically, it stops probing after it discovers the first PCI
> bus).  Even with that fixed, normal init refused to load on those kernels, and
> dash as init just outright crashed.  Must be some other IP27 bug that was fixed
> at some point, and I didn't feel like applying multiple patches to every bisect
> checkout, which might've altered results and led me to blaming the wrong commit.
> 
> It does look like the PageMask register is getting set to the correct values on
> PAGE_SIZE_4K and PAGE_SIZE_16K when a hugepage is needed (PM_1M and PM_16M).
> The PAGE_SIZE_64K case wouldn't be valid on R10k, as that uses PM_256M for a
> hugepage, which is bits 28:13 in PageMask and that would lead to "undefined
> behavior".  I'm assuming another register is getting set to an incorrect value
> in the hugepage case (EntryLo0 or EntryLo1?  EntryHi?), but I don't have the
> required knowledge to fiddle w/ the TLB code to figure it out.
> 
> So, I sent in the patch that marks CPU_SUPPORTS_HUGEPAGES as BROKEN until
> someone feels like tackling it (if ever).
> 
> Sidenote: Is it possible to add additional CP0 registers to a register dump on
> a panic or oops?  I looked around ptrace.c and ptrace.h and see where these
> registers are set up and printed out, but I can't find out where the actual
> values are fetched from the CPU and put into struct pt_regs.  I am assuming
> it's a snippet of asm somewhere.  Adding R10K's PageMask, Config, ErrorEpc, and
> Context/XContext registers seems like useful debugging info.
> 
> -- 
> Joshua Kinard
> Gentoo/MIPS
> kumba@xxxxxxxxxx
> 4096R/D25D95E3 2011-03-28
> 
> "The past tempts us, the present confuses us, the future frightens us.  And our
> lives slip away, moment by moment, lost in that vast, terrible in-between."
> 
> --Emperor Turhan, Centauri Republic
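
For reference, the arithmetic in the quoted thread can be checked with a
short script.  This is only a sketch: the function names are made up for
illustration and are not kernel identifiers, the PageMask encoding assumes
the usual MIPS layout in which the mask field starts at bit 13, and the
R10000 limit of bits 24:13 (16M maximum) is an assumption taken from the
R10000 documentation, not from the kernel source.

```python
# Sketch only: names are illustrative, not kernel identifiers.

R10K_MAX_MASK_BIT = 24  # assumption: R10000 PageMask implements bits 24:13


def huge_page_shift(page_shift):
    """log2 of the huge page mask size, (page_size/8 * page_size) / 2."""
    return 2 * page_shift - 4


def pagemask_value(size_shift):
    """PageMask value for a 2**size_shift page; mask field assumed to start at bit 13."""
    return ((1 << (size_shift + 1)) - 1) & ~0x1FFF


def fits_r10k(mask):
    """True if no mask bit above R10K_MAX_MASK_BIT is set."""
    return mask >> (R10K_MAX_MASK_BIT + 1) == 0


if __name__ == "__main__":
    for page_shift in (12, 14, 16):  # 4K, 16K and 64K base pages
        hshift = huge_page_shift(page_shift)
        mask = pagemask_value(hshift)
        print(f"base 2^{page_shift}: huge mask 2^{hshift}, "
              f"PageMask {mask:#010x}, fits R10000: {fits_r10k(mask)}")
```

Under these assumptions the 4K and 16K cases (1M and 16M) stay within bits
24:13, while the 64K case (256M) sets bits up to 28, which is the case the
quoted message calls invalid on the R10K.  All three shifts come out even,
matching the power-of-4 argument.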
