> On 10/21/2013 04:58 AM, Meelis Roos wrote:
> > > Somewhere between 3.11.0 and 3.12-rc2, my U5-360 has consistently been
> > > hanging on reboot. Today I connected a serial cable and learned about a
> > > RED state exception. 3.10.0 and 3.11.0 are OK, 3.12-rc2 and later hang
> > > reliably. I have not yet started bisecting since this will need remote
> > > power cycle setup.
> >
> > Another data point: the same problem happens on Sun Blade 100 with ALI
> > IDE. Does not happen on Fire V100 and Netra X1 that are also ALI IDE
> > based. The configs may be different too of course.
> >
> > I did a bisect for full tree. It landed into tty commits, some of them
> > being untestable without a compile fix
>
> Hi Meelis,
>
> What tty commits required a compile fix?

It appears I did not save the bisect log. But the errors were about
unknown vmalloc and some other memory related symbol (always the same 2
symbols), maybe some missing include problem. Since I could skip them
and still get a bisect, I did not try to fix them.

> > but it came out clearly finally
> > (each bad commit was clearly bad, each good commit was tested for 3
> > reboots without a problem). Bisect resulted in this commit being at
> > fault:
> >
> > 8cb06c983822103da1cfe57b9901e60a00e61f67 is the first bad commit
> > commit 8cb06c983822103da1cfe57b9901e60a00e61f67
> > Author: Peter Hurley <peter@xxxxxxxxxxxxxxxxxx>
> > Date:   Sat Jun 15 10:21:18 2013 -0400
> >
> >     n_tty: Remove alias ptrs in __receive_buf()
> >
> >     The char and flag buffer local alias pointers, p and f, are
> >     unnecessary; remove them.
> >
> >     Signed-off-by: Peter Hurley <peter@xxxxxxxxxxxxxxxxxx>
> >     Signed-off-by: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx>
> >
> > :040000 040000 ddc901fe810f43bc06a64397735b469b11e403e8 96d92e4e242c4b2ff11b25c005bccd093865b350 M drivers
> >
> > Reading the commit suggests that commit is not at fault - it seems so
> > unrelated.
> > It just modifies on-stack function parameters instead of
> > local copies.
>
> As you note, this is an unlikely culprit. Does a repeat bisect from
> different good/bad starts give the same result?

Will try.

> > > TL=0000.0000.0000.0001 TT=0000.0000.0000.0064
> > > TPC=0000.0000.f004.55c0 TnPC=0000.0000.f004.55c4
> > > TSTATE=0000.0099.1100.1602
> > >
> > > Trap Type 0x64 seems to be fast_instruction_access_MMU_miss. It keeps
> > > trapping until 5 levels deep. The first one is from f00455c0 that may
> > > be the System.map entry
>
> Is any of the above exception information useful in diagnosing this?

I have not found any help from it yet.

I looked at the reboot codepath and commented out
kmsg_dump(KMSG_DUMP_RESTART); from kernel_restart() but that changed
nothing. It goes into machine_restart(), which just calls prom_reboot(),
and the faults happen somewhere in there. Normally the PROM prints out
"Resetting ..." but not now.

--
Meelis Roos (mroos@xxxxxxxx)
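
For anyone who wants to pick the quoted trap registers apart, here is a
minimal Python sketch that splits the PROM's dotted hex values into
fields. It assumes the SPARC V9 TSTATE layout (CCR in bits 39:32, ASI in
31:24, PSTATE in 19:8, CWP in 4:0) and the UltraSPARC (sun4u) PSTATE bit
names and trap numbers; the helper names parse_prom_hex and decode_tstate
are made up for illustration and do not come from the kernel or the PROM.

```python
# Sketch only: field positions follow the SPARC V9 TSTATE layout as used
# on sun4u. All function and table names here are hypothetical.

# UltraSPARC PSTATE single-bit flags (the MM field, bits 7:6, is handled
# separately because it is two bits wide).
PSTATE_FLAGS = {
    0x001: "AG", 0x002: "IE", 0x004: "PRIV", 0x008: "AM",
    0x010: "PEF", 0x020: "RED", 0x100: "TLE", 0x200: "CLE",
    0x400: "MG", 0x800: "IG",
}

# A few UltraSPARC fast MMU trap types; 0x64 is the one seen in the log.
TRAP_TYPES = {
    0x64: "fast_instruction_access_MMU_miss",
    0x68: "fast_data_access_MMU_miss",
    0x6C: "fast_data_access_protection",
}


def parse_prom_hex(text):
    """Convert PROM dotted hex such as '0000.0099.1100.1602' to an int."""
    return int(text.replace(".", ""), 16)


def decode_tstate(tstate):
    """Split a TSTATE value into its CCR, ASI, PSTATE and CWP fields."""
    pstate = (tstate >> 8) & 0xFFF
    return {
        "ccr": (tstate >> 32) & 0xFF,
        "asi": (tstate >> 24) & 0xFF,
        "pstate": pstate,
        "pstate_flags": [name for bit, name in sorted(PSTATE_FLAGS.items())
                         if pstate & bit],
        "mm": (pstate >> 6) & 0x3,
        "cwp": tstate & 0x1F,
    }


if __name__ == "__main__":
    tt = parse_prom_hex("0000.0000.0000.0064")
    print(hex(tt), TRAP_TYPES.get(tt, "unknown"))
    print(decode_tstate(parse_prom_hex("0000.0099.1100.1602")))
```

Note that this only decodes the TSTATE saved at TL=1, i.e. the processor
state before the first trap was taken, so it says nothing by itself about
why the instruction fetch at TPC missed in the MMU.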