Re: RED state exception (trap type 0x64) on U5 reboot

> On 10/21/2013 04:58 AM, Meelis Roos wrote:
> > > Somewhere between 3.11.0 and 3.12-rc2, my U5-360 has consistently been
> > > hanging on reboot. Today I connected a serial cable and learned about a
> > > RED state exception. 3.10.0 and 3.11.0 are OK, 3.12-rc2 and later hang
> > > reliably. I have not yet started bisecting since this will need remote
> > > power cycle setup.
> > Another data point: the same problem happens on Sun Blade 100 with ALI
> > IDE. Does not happen on Fire V100 and Netra X1 that are also ALI IDE
> > based. The configs may be different too of course.
> > 
> > I did a bisect for full tree. It landed into tty commits, some of them
> > being untestable without a compile fix
> 
> Hi Meelis,
> 
> What tty commits required a compile fix?

It appears I did not save the bisect log. But the errors were about
unknown vmalloc and some other memory-related symbol (always the same 2
symbols), maybe some missing include problem. Since I could skip them 
and still get a bisect, I did not try to fix them.

> > but it came out clearly finally
> > (each bad commit was clearly bad, each good commit was tested for 3
> > reboots without a problem). Bisect resulted in this commit being at
> > fault:
> > 
> > 8cb06c983822103da1cfe57b9901e60a00e61f67 is the first bad commit
> > commit 8cb06c983822103da1cfe57b9901e60a00e61f67
> > Author: Peter Hurley<peter@xxxxxxxxxxxxxxxxxx>
> > Date:   Sat Jun 15 10:21:18 2013 -0400
> > 
> >      n_tty: Remove alias ptrs in __receive_buf()
> > 
> >      The char and flag buffer local alias pointers, p and f, are
> >      unnecessary; remove them.
> > 
> >      Signed-off-by: Peter Hurley<peter@xxxxxxxxxxxxxxxxxx>
> >      Signed-off-by: Greg Kroah-Hartman<gregkh@xxxxxxxxxxxxxxxxxxx>
> > 
> > :040000 040000 ddc901fe810f43bc06a64397735b469b11e403e8
> > 96d92e4e242c4b2ff11b25c005bccd093865b350 M  drivers
> > 
> > Reading the commit suggests that commit is not at fault - it seems so
> > unrelated. It just modifies on-stack function parameters instead of
> > local copies.
> 
> As you note, this is an unlikely culprit. Does a repeat bisect from
> different good/bad starts give the same result?

Will try.

> > > TL=0000.0000.0000.0001 TT=0000.0000.0000.0064
> > >     TPC=0000.0000.f004.55c0 TnPC=0000.0000.f004.55c4
> > > TSTATE=0000.0099.1100.1602
> > >
> > > Trap Type 0x64 seems to be fast_instruction_access_MMU_miss. It keeps
> > > trapping until 5 levels deep. The first one is from f00455c0 that may
> > > be the System.map entry
> 
> Is any of the above exception information useful in diagnosing this?

I have not found any help from it yet.
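
For reference, the TSTATE value quoted above can be split into its
fields. This is only a minimal sketch in Python, assuming the usual
SPARC V9 / UltraSPARC layout (CCR in bits 39:32, ASI in bits 31:24,
PSTATE in bits 19:8, CWP in bits 4:0) and the PSTATE bit names used in
the kernel headers. The helper names parse_reg() and decode_tstate() are
made up for illustration.

```python
# Sketch: decode a SPARC V9 TSTATE value as printed in a RED state dump.
# The field layout and PSTATE bit names are assumptions based on the
# SPARC V9 / UltraSPARC definitions; verify against the CPU manual.

PSTATE_BITS = {
    0x001: "AG",
    0x002: "IE",
    0x004: "PRIV",
    0x008: "AM",
    0x010: "PEF",
    0x020: "RED",
    0x100: "TLE",
    0x200: "CLE",
    0x400: "MG",
    0x800: "IG",
}


def parse_reg(text):
    """Convert a dotted dump value like '0000.0099.1100.1602' to an int."""
    return int(text.replace(".", ""), 16)


def decode_tstate(tstate):
    """Split TSTATE into CCR, ASI, PSTATE (with flag names), MM and CWP."""
    pstate = (tstate >> 8) & 0xFFF
    flags = [name for bit, name in sorted(PSTATE_BITS.items()) if pstate & bit]
    return {
        "ccr": (tstate >> 32) & 0xFF,
        "asi": (tstate >> 24) & 0xFF,
        "pstate": pstate,
        "pstate_flags": flags,
        "mm": (pstate >> 6) & 0x3,  # memory model field, PSTATE bits 7:6
        "cwp": tstate & 0x1F,
    }


if __name__ == "__main__":
    fields = decode_tstate(parse_reg("0000.0099.1100.1602"))
    for key, value in fields.items():
        shown = hex(value) if isinstance(value, int) else value
        print(key, shown)
```

If that layout is right, the saved PSTATE is 0x16 (IE, PRIV, PEF) with
the RED bit clear, so the first trap was taken from ordinary privileged
code rather than from code already running in RED state.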

I looked at the reboot codepath and commented out
kmsg_dump(KMSG_DUMP_RESTART);
from kernel_restart() but that changed nothing. It goes into
machine_restart(), which just calls prom_reboot(), and the faults happen
somewhere in there. Normally the PROM prints out "Resetting ..." but not
now.

-- 
Meelis Roos (mroos@xxxxxxxx)