On (11/22/18 14:57), Waiman Long wrote:
> > [..]
> >> As a side note, one of the test systems that I used generated a
> >> debugobjects splat in the bootup process and the system hanged
> >> afterward. Applying this patch alone fix the hanging problem and the
> >> system booted up successfully. So it is not really a good idea to call
> >> printk() while holding a raw spinlock.
> >
> > Right, I like this patch.
> > And I think that we, maybe, can go even further.
> >
> > Some serial consoles call mod_timer(). So what we could have with the
> > debug objects enabled was
> >
> >	mod_timer()
> >	 lock_timer_base()
> >	  debug_activate()
> >	   printk()
> >	    call_console_drivers()
> >	     foo_console()
> >	      mod_timer()
> >	       lock_timer_base()	<< deadlock
> >
> > That's one possible scenario. The other one can involve console's
> > IRQ handler, uart port spinlock, mod_timer, debug objects, printk,
> > and an eventual deadlock on the uart port spinlock. This one can
> > be mitigated with printk_safe. But mod_timer() deadlock will require
> > a different fix.
> >
> > So maybe we need to switch debug objects print-outs to _always_
> > printk_deferred(). Debug objects can be used in code which cannot
> > do direct printk() - timekeeping is just one example.
>
> Actually, I don't think that was the cause of the hang.

Oh, I didn't suggest that this was the case. I was just talking about
more problems with printk() in debug objects. Serial consoles call
mod_timer(), mod_timer() calls debug objects, debug objects call
printk() and end up in the serial console again. Serial consoles are
not re-entrant at this point.

> The debugobjects splat was caused by debug_object_is_on_stack(), below
> was the output:
>
> [ 6.890048] ODEBUG: object (____ptrval____) is NOT on stack
> (____ptrval____), but annotated.
> [ 6.891000] WARNING: CPU: 28 PID: 1 at lib/debugobjects.c:369
> __debug_object_init.cold.11+0x51/0x2d6
[..]
> 11.270654] systemd[1]: systemd 239 running in system mode. (+PAM
> +AUDIT +SELINUX +IMA -APPARMOR +SMACK +SYSVINIT +UTMP +LIBCRYPTSETUP
> +GCRYPT +GNUTLS +ACL +XZ +LZ4 +SECCOMP +BLKID +ELFUTILS +KMOD +IDN2 -IDN
> +PCRE2 default-hierarchy=legacy)
> [ 11.311307] systemd[1]: Detected architecture x86-64.
> [ 11.316420] systemd[1]: Running in initial RAM disk.
>
> Welcome to
>
> The system is not responsive at this point.
>
> I am not totally sure what caused this.

Hmm, me neither.

	-ss