Re: Please help! AM35xx mm/slab.c BUG

On Wed, Jun 6, 2012 at 11:44 AM, CF Adad <cfadad@xxxxxxxxxxxxxx> wrote:
> All,
>
>
> We've learned a few more things:
>
> 1.) We have found a way to get it to happen pretty consistently.  We simply run iperf in a loop using the EMAC port to some other device.
>
>
> 2.) The crash ONLY happens on our custom board, not on the Twister dev kit.  This is true despite the fact that I ported our latest linux-omap 3.4-rc6 over there.  We're still running Technexion's default x-loader and u-boot to handle proper configs on that board. So, that's a substantial bit of code that is different between our boxes.  The kernel is altered only in that the few pinmux changes I left in Linux have been removed to avoid configuration differences between the two boards.
>
>
> This suggests that either:
> A) We have a hardware problem on our board.  Seems unlikely.  Can anyone think of anything hardware related that would manifest itself with these sorts of errors?
>
>
> B) We have an issue in our bootloader code somewhere.  I hesitated to overwrite the bootloaders for this test on the Twister baseboard just because I did not want to have to mess with getting the pinmuxes and the like put back and such.
>
> Presuming something in those bootloaders is our problem, I wonder what EMAC-related stuff there really is.  For a long time we ran with our bootloaders NOT initializing either of the Eths.  This was Technexion's default.  They left that work to Linux.  We've recently done work to enable them in u-boot, but we were crashing like this long before that.  Once in Linux, we're just using the standard drivers and calls from within the board file to SMSC911x and the Davinci EMAC drivers.  I am using the patches that allow the e-fused MAC to be pulled from the AM35xx for the EMAC, but I can't see how that would cause this.
>
> Assuming the EMAC is perhaps an innocent bystander that happens just to cause this, the place I would have to suspect the most in our bootloaders would be the GPMC settings.  We've done a good bit of tweaking in there since we switched chips.  *Could a GPMC timing issue account for these types of errors???*  The reason I bring it up is that the GPMC has been one of those things that we've really struggled to understand.  What should the timings *really* be?  We've done the best we can to try to guess our way through it.  BUT, we could certainly be very wrong.  If a GPMC setting could cause these types of bugs, please let me know.  I'll be happy to post more info on how we're setting that up now.  In case not, I'll save the electrons and not spam it here.
>
>
I don't know the AMXX architecture that well, but looking at the
crash log, I am not sure GPMC should play a role here.
I think it is most likely memory corruption, which can have many
causes, as Tony outlined.

To make sure your memory is in a good state, can you run memtester
for a long duration and check that you are not getting any memory
failures? Try to give it the maximum memory size as an input.

You can download it from [1].
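
As a rough sketch of how the size argument could be chosen, the Python
below reads the MemFree field from /proc/meminfo text and builds a
memtester command line of the form "memtester <size>M <loops>". The
helper names, the 16 MB reserve, and the loop count of 10 are
assumptions for illustration, not values taken from this thread. The
reserve is there because memtester tries to lock the memory it tests,
so asking for every free byte may fail or starve the rest of the system.

```python
import re


def free_megabytes(meminfo_text, field="MemFree"):
    """Return the value of a /proc/meminfo field in whole megabytes."""
    pattern = rf"^{re.escape(field)}:\s+(\d+)\s+kB"
    match = re.search(pattern, meminfo_text, re.MULTILINE)
    if match is None:
        raise ValueError(f"{field} not found in meminfo text")
    return int(match.group(1)) // 1024


def memtester_argv(meminfo_text, reserve_mb=16, loops=10):
    """Build a memtester command line (hypothetical helper).

    reserve_mb and loops are illustrative defaults; tune them per board.
    """
    size_mb = free_megabytes(meminfo_text) - reserve_mb
    if size_mb <= 0:
        raise ValueError("not enough free memory to test")
    return ["memtester", f"{size_mb}M", str(loops)]


if __name__ == "__main__":
    # Print the command rather than running it, so it can be reviewed first.
    with open("/proc/meminfo") as f:
        print(" ".join(memtester_argv(f.read())))
```

For a long soak test, raise the loop count, and run it as root so that
memtester can lock the pages it is testing.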

Regards
Santosh
[1] http://pyropus.ca/software/memtester/