Hi, On Tue, May 17, 2016 at 4:07 AM, Dann Frazier <dann.frazier@xxxxxxxxxxxxx> wrote: > Hi, > I'm observing a soft lockup issue w/ the ASPEED controller on an > arm64 server platform. This was originally seen on Ubuntu's 4.4 > kernel, but it is reproducible w/ vanilla 4.6-rc7 as well. > > [ 32.792656] NMI watchdog: BUG: soft lockup - CPU#38 stuck for 22s! > [swapper/38:0] > > I observe this just once each time I boot into debian-installer (I'm > using a serial console, but the ast module gets loaded during > startup). I have figured out that it is caused by 'mod_timer(timer, jiffies)' and 'ops->cur_blink_jiffies' is observed as zero in cursor_timer_handler() when the issue happened. Looks it is a real fbcon/vt issue, see following: fbcon_init() <-.con_init <-visual_init() reset_terminal() <-vc_init() vc->vc_cur_blink_ms is just set in reset_terminal() from vc_init() path, and ops->cur_blink_jiffies is figured out from vc->vc_cur_blink_ms in fbcon_init(). And visual_init() is always run before vc_init(), so ops->cur_blink_jiffies is initialized as zero and cause the soft lockup issue finally. Thanks, Ming > > perf shows that the CPU caught by the NMI is typically in code > updating the cursor timer: > > - 16.92% swapper [kernel.kallsyms] [k] _raw_spin_unlock_irqrestore > - _raw_spin_unlock_irqrestore > + 16.87% mod_timer > + 0.05% cursor_timer_handler > - 12.15% swapper [kernel.kallsyms] [k] queue_work_on > - queue_work_on > + 12.00% cursor_timer_handler > + 0.15% call_timer_fn > + 10.98% swapper [kernel.kallsyms] [k] run_timer_softirq > - 2.23% swapper [kernel.kallsyms] [k] mod_timer > - mod_timer > + 1.97% cursor_timer_handler > + 0.26% call_timer_fn > > During the same period, I can see that another CPU is actively > executing the timer function: > > - 42.18% kworker/u96:2 [kernel.kallsyms] [k] ww_mutex_unlock > - ww_mutex_unlock > - 40.70% ast_dirty_update > ast_imageblit > soft_cursor > bit_cursor > fb_flashcursor > process_one_work > worker_thread > kthread > ret_from_fork > + 1.48% ast_imageblit > - 40.15% kworker/u96:2 [kernel.kallsyms] [k] __memcpy_toio > - __memcpy_toio > + 31.54% ast_dirty_update > + 8.61% ast_imageblit > > Using the graph function tracer on fb_flashcursor(), I see that > ast_dirty_update usually takes around 60 us, in which it makes 16 > calls to __memcpy_toio(). However, there is always one instance on > every boot of the installer where ast_dirty_update() takes ~98 *ms* to > complete, during which it makes 743 calls to __memcpy_toio(). While > that doesn't directly account for the full 22s, I do wonder if that > maybe a smoking gun. > > fyi, this is being tracked at: https://bugs.launchpad.net/bugs/1574814 > > -dann _______________________________________________ dri-devel mailing list dri-devel@xxxxxxxxxxxxxxxxxxxxx https://lists.freedesktop.org/mailman/listinfo/dri-devel