Re: MIPS/CI20: BUG: Bad page state

Hi,

On Wed, May 29, 2019 at 02:37:15AM +0300, Aaro Koskinen wrote:
> On Wed, Apr 24, 2019 at 08:50:31PM +0000, Paul Burton wrote:
> > On Wed, Apr 24, 2019 at 11:40:55PM +0300, Aaro Koskinen wrote:
> > > On Wed, Apr 24, 2019 at 07:29:29PM +0000, Paul Burton wrote:
> > > > On Wed, Apr 24, 2019 at 09:20:12PM +0300, Aaro Koskinen wrote:
> > > > > I have been trying to get GCC bootstrap to pass on CI20 board, but it
> > > > > seems to always crash. Today, I finally got around connecting the serial
> > > > > console to see why, and it logged the below BUG.
> > > > > 
> > > > > I wonder if this is an actual bug, or is the hardware faulty?
> > > > > 
> > > > > FWIW, this is 32-bit board with 1 GB RAM. The rootfs is on MMC, as well
> > > > > as 2 GB + 2 GB swap files.
> > > > > 
> > > > > Kernel config is at the end of the mail.
> > > > 
> > > > I'd bet on memory corruption, though not necessarily faulty hardware.
> > > > 
> > > > Unfortunately memory corruption on Ci20 boards isn't uncommon... Someone
> > > > did make some tweaks to memory timings configured in the DDR controller
> > > > which improved things for them a while ago:
> > > > 
> > > >   https://github.com/MIPS/CI20_u-boot/pull/18
> > > > 
> > > > Would you be up for testing with those tweaks? I'd be happy to help with
> > > > updating U-Boot if needed.
> 
> I did some testing with CI20_u-boot ef995a1611f0, plus the timing fix
> cherry picked. Didn't help, I still get random crashes (every time
> different).

I have now run memtester with a 900M allocation for 10 hours (around 10
loops), then with two processes using a 450M allocation each for 24 hours
(some 20 loops or so), and no errors or other issues were encountered.
I would guess that if the timings were wrong, memtester would have failed
by now?

When trying GCC bootstrap, the system fails reliably... Usually within
a few hours, but sometimes even within 30 minutes.

Maybe the issue is not memory/hardware. Since I build on MMC/SD card, and
also have swap there, perhaps we have some buggy code in the MMC or DMA
driver that results in memory corruption?
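
A minimal sketch of how the MMC/DMA suspicion could be probed from
userspace, assuming Python 3 is available on the board. It writes
deterministic data to a file on the MMC, asks the kernel to drop the cached
pages, reads the file back and compares checksums. The file name
"io-verify.bin" and all function names are hypothetical. Note that
posix_fadvise() is only advisory, so the readback may still be served from
the page cache, and a clean run would not rule out corruption elsewhere
(e.g. in swap I/O).

```python
import hashlib
import os


def make_block(seed: int, index: int, length: int) -> bytes:
    """Return `length` deterministic pseudo-random bytes for (seed, index)."""
    return hashlib.shake_256(f"{seed}:{index}".encode()).digest(length)


def write_file(path: str, size: int, seed: int, chunk: int = 1 << 20) -> str:
    """Write `size` bytes of pattern data and return the SHA-256 of what was written."""
    digest = hashlib.sha256()
    index = 0
    remaining = size
    with open(path, "wb") as f:
        while remaining > 0:
            block = make_block(seed, index, min(chunk, remaining))
            f.write(block)
            digest.update(block)
            remaining -= len(block)
            index += 1
        f.flush()
        os.fsync(f.fileno())  # push the data out to the storage device
        if hasattr(os, "posix_fadvise"):
            # Advisory only: ask the kernel to drop cached pages for this file
            # so that the readback is more likely to hit the device.
            os.posix_fadvise(f.fileno(), 0, 0, os.POSIX_FADV_DONTNEED)
    return digest.hexdigest()


def hash_file(path: str, chunk: int = 1 << 20) -> str:
    """Read the file back and return its SHA-256."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(chunk), b""):
            digest.update(block)
    return digest.hexdigest()


def verify_loop(directory: str, size: int, loops: int) -> int:
    """Run write/readback cycles in `directory`; return the number of mismatches."""
    path = os.path.join(directory, "io-verify.bin")
    mismatches = 0
    try:
        for seed in range(loops):
            expected = write_file(path, size, seed)
            if hash_file(path) != expected:
                mismatches += 1
    finally:
        if os.path.exists(path):
            os.remove(path)
    return mismatches
```

For example, verify_loop("/path/on/mmc", 512 * 1024 * 1024, 20) would run
20 cycles with a 512M file; running it alongside a memory-hungry job would
put the storage path under load similar to the build.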

A.



