On Sat, 5 Oct 2019 20:10:47 -0400 Qian Cai <cai@xxxxxx> wrote: > > > >> the existing dependency chain (in reverse order) is: > >> > >> -> #2 (&(&zone->lock)->rlock){-.-.}: > >> lock_acquire+0x21a/0x468 > >> _raw_spin_lock+0x54/0x68 > >> get_page_from_freelist+0x8b6/0x2d28 > >> __alloc_pages_nodemask+0x246/0x658 > >> __get_free_pages+0x34/0x78 > >> sclp_init+0x106/0x690 > >> sclp_register+0x2e/0x248 > >> sclp_rw_init+0x4a/0x70 > >> sclp_console_init+0x4a/0x1b8 > >> console_init+0x2c8/0x410 > >> start_kernel+0x530/0x6a0 > >> startup_continue+0x70/0xd0 > > > > This appears to be the core of our problem? > > No, that is just one of those many places could form the lock chain. > > console_lock -> other locks -> zone_lock > > Another example is, > > https://lore.kernel.org/lkml/1568823006.5576.178.camel@xxxxxx/ There is no "console_lock". Please be much more specific. > It is easier to avoid, > > zone_lock -> console_lock > > rather than fixing the opposite. "ease" isn't the main objective. A more important question is "what makes sense". We should be able to call printk() from anywhere, any time under any conditions. That can't be done 100% but it is the objective. printk() should be robust and not being able to call printk() while holding zone->lock isn't robust! btw, this: : It is unsafe to call printk() while zone->lock was held, i.e., : : zone->lock --> console_sem doesn't make a lot of sense. console_sem is a sleeping lock so attempting to acquire it (with down()!) under spinlock is a huge bug. Again, please be careful with the descriptions.