Re: IP30: SMP Help

On 11/19/2014 10:22, Ralf Baechle wrote:
> On Wed, Nov 19, 2014 at 12:06:59PM +0000, Maciej W. Rozycki wrote:
>>  I highly doubt spinlocks have any significance here, they're used and 
>> work just fine across many systems.  If anything this will probably be 
>> either a bug in platform code somewhere or a critical part of hardware 
>> having not been correctly initialised.
> 
> For testing purposes one could disable the secondary CPU - I believe that's
> possible on IP30, too?  Then build an SMP kernel with NR_CPUS set to 1.
> That's basically a glorified uniprocessor kernel on glorified uniprocessor
> hardware then.

That's what I did, actually: "disable 1" in the PROM, then a reset, to turn off
the extra CPU.  The register dump in my earlier mail in this thread is from an
SMP kernel running with CPU1 disabled in PROM.  That build still had NR_CPUS=2,
but Octane exposes data via the MP_CONF registers that tells you whether a CPU
is online or offline, and the SMP setup code uses that to enumerate the
possible CPUs.

I can still trigger the corrupted memory addresses in that situation.  Which is
why I'm thinking there is something else, possibly not in the core IP30 code,
that's causing the problem.  I've stripped the test kernel of everything else
(no block drivers, no SCSI, no networking, etc).  I guess I can remove PCI &
IOC3 support next and see if loading an initramfs triggers the memory
corruption.  Might have to look closer at the memory probing code, too.


>>  Does the system have any standard bus like PCI?  If so then you could get 
>> an NVRAM card then and log some activity there like CPU status on entry to 
>> exception handlers.  Once a crash has happened you could boot with that 
>> logging disabled and retrieve your data.  Accessing hardware is easy on 
>> MIPS, you can do it via XKPHYS without a need to have the MMU working, IOW 
>> you'll be able to poke at hardware even if your TLB/page tables got 
>> botched for some reason.  And you can bypass the cache too, which is 
>> another possible place for breakage.
> 
> IP27 reserves a part of its FLASH memory for logging.  However Linux doesn't
> support that.
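
For reference, a rough sketch of the XKPHYS address arithmetic mentioned above:
on 64-bit MIPS the top two address bits select XKPHYS, bits 61:59 carry the
cache coherency attribute (CCA), and the low bits are the physical address.
The helper names are made up for illustration and are not the kernel's own
macros.  CCA 2 is the uncached attribute on R4000/R10000-class CPUs.

```c
#include <assert.h>
#include <stdint.h>

#define XKPHYS_BASE   UINT64_C(0x8000000000000000)  /* address bits 63:62 = 0b10 */
#define CCA_SHIFT     59                            /* CCA lives in bits 61:59 */
#define CCA_UNCACHED  UINT64_C(2)                   /* uncached attribute */

/* Build an unmapped XKPHYS virtual address for a physical address.
 * No TLB entry is involved, so this works even with broken page tables. */
uint64_t xkphys_addr(uint64_t cca, uint64_t paddr)
{
    assert(cca < 8);                                /* CCA is a 3-bit field */
    assert(paddr < (UINT64_C(1) << CCA_SHIFT));     /* must not spill into the CCA */
    return XKPHYS_BASE | (cca << CCA_SHIFT) | paddr;
}

/* Uncached variant: bypasses the cache as well as the MMU. */
uint64_t xkphys_uncached(uint64_t paddr)
{
    return xkphys_addr(CCA_UNCACHED, paddr);
}
```

In a kernel the result would be cast to a volatile pointer and dereferenced;
here it is only computed, so the sketch can be checked on any host.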

I did add a different RTC to my Octane, a DS17887, which has 8KB of NVRAM
available.  The driver I wrote for it can access that NVRAM, too.  It uses PIO:
write an address to a port register, then read or write a data register to move
data in or out of the RTC (unlike O2, which can ioremap the RTC registers
directly).  Can't store much in 8KB, though, so I'll look for an NVRAM card.
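
A minimal sketch of that address-port/data-port access pattern, assuming a
single address latch and a single data register.  The "fake_rtc" struct and the
port_* helpers are hypothetical stand-ins that simulate the chip in memory;
they are not the real DS17887 register layout, nor the actual driver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NVRAM_SIZE 8192u        /* 8KB of NVRAM */

/* Simulated chip: one address latch plus the cells behind it.
 * On real hardware the port_* helpers would be PIO accesses. */
struct fake_rtc {
    uint16_t addr_latch;
    uint8_t  cells[NVRAM_SIZE];
};

static void port_write_addr(struct fake_rtc *rtc, uint16_t addr)
{
    rtc->addr_latch = addr % NVRAM_SIZE;
}

static uint8_t port_read_data(const struct fake_rtc *rtc)
{
    return rtc->cells[rtc->addr_latch];
}

static void port_write_data(struct fake_rtc *rtc, uint8_t val)
{
    rtc->cells[rtc->addr_latch] = val;
}

/* Every access is two steps: latch the address, then touch the data port. */
uint8_t nvram_read_byte(struct fake_rtc *rtc, uint16_t off)
{
    assert(off < NVRAM_SIZE);
    port_write_addr(rtc, off);
    return port_read_data(rtc);
}

void nvram_write_byte(struct fake_rtc *rtc, uint16_t off, uint8_t val)
{
    assert(off < NVRAM_SIZE);
    port_write_addr(rtc, off);
    port_write_data(rtc, val);
}

/* Append a record at 'off'.  Returns the next free offset, or 'off'
 * unchanged if the record would not fit in the remaining space. */
size_t nvram_log(struct fake_rtc *rtc, size_t off,
                 const uint8_t *buf, size_t len)
{
    if (off > NVRAM_SIZE || len > NVRAM_SIZE - off)
        return off;
    for (size_t i = 0; i < len; i++)
        nvram_write_byte(rtc, (uint16_t)(off + i), buf[i]);
    return off + len;
}
```

Since every byte costs two port accesses, a log kept this way would want small
fixed-size records rather than free-form text.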

Is it possible for Linux, upon a kernel crash, to actually create a full dump
of all available memory and write that out somewhere?  NetWare had this
capability to designate a spare volume as a crash volume, which a core dump
could be written to using low-level access, and then you could do offline
analysis of the dump via the NW kernel debugger on a separate workstation
(after rebooting and copying the crash dump out).


>>  Of course if you have PCI then you can add an ordinary serial port card 
>> there as well if the onboard port is difficult to access for some reason, 
>> but serial port logging has its limitations, mainly the complexity to 
>> access it and throughput.
> 
> While the system has 16550 UARTs a PCI card might indeed make things
> slightly simpler - the setup of the IOC3 + SuperIO combo is complex,
> even for a simple PIO driver.  However I think he shouldn't have to go
> to such an extreme!

Already got a PCI serial card installed in the PCI 'shoebox' module.
Unfortunately, it's a Moschip, so its driver is part of the parallel port
code, and that doesn't appear to be big-endian safe.  Last time I tried the
driver, it crashed when probing for the card.  Seems most of the cheap PCI
serial cards these days are Moschips.

--J


-- 
Joshua Kinard
Gentoo/MIPS
kumba@xxxxxxxxxx
4096R/D25D95E3 2011-03-28

"The past tempts us, the present confuses us, the future frightens us.  And our
lives slip away, moment by moment, lost in that vast, terrible in-between."

--Emperor Turhan, Centauri Republic




