On 04/07/2016 06:24 PM, Nishanth Menon wrote: > On 04/07/2016 10:17 AM, Franklin S Cooper Jr. wrote: >> Hi All, >> >> While testing patches to fix NFS not working on linux-next I ran into to >> a bunch of kernel dumps when booting my 437-gp evm via NFS. >> >> Bootlog: http://pastebin.ubuntu.com/15670741/ >> Another Bootlog: http://pastebin.ubuntu.com/15671302/ >> >> I'm pretty sure it isn't related to the NFS issue. I dropped the local >> NFS fixes and reverted the patches that originally caused the NFS >> problem and I'm still able to see these kernel dumps. Now this doesn't >> occur all the time. Originally I was able to see this issue 1 out of 5 >> times but recently it takes over 10 times for this problem to occur. >> >> Also I am unable to reproduce this issue on my AM335x GP evm. I've >> rebooted the board 10 times and haven't seen any kernel dumps. >> > https://github.com/nmenon/kernel-test-logs/blob/next-20160406/multi_v7_defconfig/am43xx-gpevm.txt > looked good to me at least.. but then it was reported as intermittent.. > > Could be > http://lists.infradead.org/pipermail/linux-arm-kernel/2015-October/381174.html > > The guilty patch is 55ee701 "arm: omap2: board-generic: use omap4_local_timer_init for AM437x" which changes clocksource from gptimer to 32k_counter. You see this "INFO: rcu_sched self-detected stall on CPU", because sometime during boot wrong value is read from 32k_counter, which cause invalid jiffies calculation. Personally, I think that it could be errata: am437x: "Advisory 22 Watchdog Timers: Delay Needed to Read Some WDTimer Registers After Wakeup" ompa4460: i103 "1.1 Delay needed to read some GP timer, WD timer and sync timer registers after wakeup" also, you can check it with below diff: diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c index bfbf284..0858077 100644 --- a/kernel/printk/printk.c +++ b/kernel/printk/printk.c @@ -1053,10 +1053,9 @@ static size_t print_time(u64 ts, char *buf) rem_nsec = do_div(ts, 1000000000); if (!buf) - return snprintf(NULL, 0, "[%5lu.000000] ", (unsigned long)ts); + return snprintf(NULL, 0, "[%5lu] ", (unsigned long)jiffies); - return sprintf(buf, "[%5lu.%06lu] ", - (unsigned long)ts, rem_nsec / 1000); + return sprintf(buf, "[%lu] ", (unsigned long)jiffies); } with above change you could see smth. like this in log: [4294938600] ============= cycle_now19816771 delta4294959581 tkr->cycle_last19824486 jiffies4294938600 12E6143 [4294938600] ============= cycle_now19817092 delta4294959902 tkr->cycle_last19824486 jiffies4294938600 12E6284 [4294938600] ============= cycle_now19817413 delta4294960223 tkr->cycle_last19824486 jiffies4294938600 12E63C5 ^^^ good jiffies [13078482] BUG: ================================== now19817734 cycle_last4314784734 real_tk19824486 ^^^ bad jiffies <-- 7-8 second of kernel boot [13078482] INFO: rcu_preempt self-detected stall on CPU -- regards, -grygorii -- To unsubscribe from this list: send the line "unsubscribe linux-omap" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html