On Mon, 2019-10-07 at 17:12 +0200, Michal Hocko wrote:
> On Mon 07-10-19 10:59:10, Qian Cai wrote:
> [...]
> > It is almost impossible to eliminate all the indirect call chains from
> > console_sem/console_owner_lock to zone->lock because it is too normal that
> > something later needs to allocate some memory dynamically, so as long as it
> > directly call printk() with zone->lock held, it will be in trouble.
>
> Do you have any example where the console driver really _has_ to
> allocate. Because I have hard time to believe this is going to work at
> all as the atomic context doesn't allow to do any memory reclaim and
> such an allocation would be too easy to fail so the allocation cannot
> really rely on it.

I don't know how to explain this more clearly, but let me repeat it one last
time. The console driver does not need to allocate directly. Consider this
example,

CPU0:                 CPU1:         CPU2:         CPU3:
console_sem->lock                                 zone->lock
                      pi->lock
pi->lock                            rq_lock
                      rq->lock
                                    zone->lock
                                                  console_sem->lock

Here it only needs someone to hold the rq_lock and allocate some memory. The
same is true for port_lock. Since the deadlock could involve a lot of CPUs and
a longer lock chain, it is impossible to predict which allocation made while
holding a lock could end up in the same problematic lock chain.

>
> So again, crippling the MM code just because of lockdep false positives
> or a broken console driver sounds like a wrong way to approach the
> problem.
>
> > [ 297.425964] -> #1 (&port_lock_key){-.-.}:
> > [ 297.425967]        __lock_acquire+0x5b3/0xb40
> > [ 297.425967]        lock_acquire+0x126/0x280
> > [ 297.425968]        _raw_spin_lock_irqsave+0x3a/0x50
> > [ 297.425969]        serial8250_console_write+0x3e4/0x450
> > [ 297.425970]        univ8250_console_write+0x4b/0x60
> > [ 297.425970]        console_unlock+0x501/0x750
> > [ 297.425971]        vprintk_emit+0x10d/0x340
> > [ 297.425972]        vprintk_default+0x1f/0x30
> > [ 297.425972]        vprintk_func+0x44/0xd4
> > [ 297.425973]        printk+0x9f/0xc5
> > [ 297.425974]        register_console+0x39c/0x520
> > [ 297.425975]        univ8250_console_init+0x23/0x2d
> > [ 297.425975]        console_init+0x338/0x4cd
> > [ 297.425976]        start_kernel+0x534/0x724
> > [ 297.425977]        x86_64_start_reservations+0x24/0x26
> > [ 297.425977]        x86_64_start_kernel+0xf4/0xfb
> > [ 297.425978]        secondary_startup_64+0xb6/0xc0
>
> This is an early init code again so the lockdep sounds like a false
> positive to me.

This is just the tip of the iceberg, shown to illustrate the lock dependency,

console_owner --> port_lock_key

which could easily happen anywhere with a simple printk().
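
To make the four-CPU example easier to check, here is a minimal sketch (not
lockdep itself) that records "held -> wanted" lock pairs and looks for a cycle
with a depth-first search. The function name find_cycle, the EDGES list, and
the treatment of rq_lock and rq->lock as one lock are assumptions made for
illustration; the edges are one reading of the example above.

```python
# Each edge (A, B) means: some CPU tries to take lock B while holding lock A.
# A cycle in this graph is a potential deadlock, even if no single CPU
# ever takes all of the locks involved.

def find_cycle(edges):
    """Return one lock cycle as a list of names, or None if there is none."""
    graph = {}
    for held, wanted in edges:
        graph.setdefault(held, []).append(wanted)
        graph.setdefault(wanted, [])

    UNSEEN, ON_PATH, DONE = 0, 1, 2
    state = {lock: UNSEEN for lock in graph}
    path = []

    def visit(lock):
        state[lock] = ON_PATH
        path.append(lock)
        for nxt in graph[lock]:
            if state[nxt] == ON_PATH:
                # Back edge: the cycle is the path from nxt back to nxt.
                return path[path.index(nxt):] + [nxt]
            if state[nxt] == UNSEEN:
                found = visit(nxt)
                if found:
                    return found
        path.pop()
        state[lock] = DONE
        return None

    for lock in graph:
        if state[lock] == UNSEEN:
            found = visit(lock)
            if found:
                return found
    return None


# Hypothetical encoding of the example: one edge per CPU.
# Assumption: rq_lock and rq->lock name the same lock.
EDGES = [
    ("console_sem->lock", "pi->lock"),
    ("pi->lock", "rq->lock"),
    ("rq->lock", "zone->lock"),           # allocation under the rq lock
    ("zone->lock", "console_sem->lock"),  # printk() with zone->lock held
]

if __name__ == "__main__":
    cycle = find_cycle(EDGES)
    print(" -> ".join(cycle) if cycle else "no cycle")
```

Dropping the last edge (printk() with zone->lock held) leaves the graph
acyclic in this model, and so does dropping any other single edge. The
console driver itself never appears as an allocator in any of these edges.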