Re: [PATCH v2] mm/page_isolation: fix a deadlock with printk()

On Mon, 2019-10-07 at 17:12 +0200, Michal Hocko wrote:
> On Mon 07-10-19 10:59:10, Qian Cai wrote:
> [...]
> > It is almost impossible to eliminate all the indirect call chains from
> > console_sem/console_owner_lock to zone->lock, because it is entirely normal
> > for something later to allocate memory dynamically. So any code that calls
> > printk() directly with zone->lock held will be in trouble.
> 
> Do you have any example where the console driver really _has_ to
> allocate? I have a hard time believing this is going to work at
> all, as the atomic context doesn't allow any memory reclaim and
> such an allocation would fail too easily, so the caller cannot
> really rely on it.

I don't know how to explain this more clearly, but let me repeat it one last
time. The console driver does not need to allocate directly. Consider this
example,

CPU0:              CPU1:    CPU2:       CPU3:
console_sem->lock                       zone->lock
                   pi->lock
pi->lock                    rq->lock
                   rq->lock
                            zone->lock
                                        console_sem->lock

Here it only needs someone to hold the rq_lock and allocate some memory. The
same is true for port_lock. Since the deadlock could involve many CPUs and a
longer lock chain, it is impossible to predict which allocation made while
holding a lock could end up in the same problematic lock chain.
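
To make the chain concrete, here is a rough userspace sketch (Python, not kernel
code) that treats each CPU in the table as one "holds A, then wants B" edge and
searches for a cycle, which is roughly the kind of check lockdep does on lock
classes. The function names build_graph() and find_cycle() are made up for
illustration, and I am assuming the rq_lock and rq->lock entries in the table
refer to the same lock.

```python
from collections import defaultdict


def build_graph(holds_then_wants):
    """Map each held lock to the set of locks acquired while holding it."""
    graph = defaultdict(set)
    for held, wanted in holds_then_wants:
        graph[held].add(wanted)
    return graph


def find_cycle(graph):
    """Return one lock cycle (first lock repeated at the end), or None."""
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on current path, finished
    color = defaultdict(int)
    path = []

    def visit(lock):
        color[lock] = GREY
        path.append(lock)
        for nxt in sorted(graph.get(lock, ())):
            if color[nxt] == GREY:
                # nxt is already on the current path: the chain closes here.
                return path[path.index(nxt):] + [nxt]
            if color[nxt] == WHITE:
                found = visit(nxt)
                if found:
                    return found
        path.pop()
        color[lock] = BLACK
        return None

    for lock in sorted(graph):
        if color[lock] == WHITE:
            cycle = visit(lock)
            if cycle:
                return cycle
    return None


# One (held, wanted) pair per CPU, taken from the table above.
# Assumption: "rq_lock" and "rq->lock" in the table are the same lock.
chain = [
    ("console_sem->lock", "pi->lock"),    # CPU0
    ("pi->lock", "rq->lock"),             # CPU1
    ("rq->lock", "zone->lock"),           # CPU2: allocates under rq->lock
    ("zone->lock", "console_sem->lock"),  # CPU3: printk() under zone->lock
]

cycle = find_cycle(build_graph(chain))
# Drop the CPU3 edge (no printk() with zone->lock held) and search again.
without_printk = find_cycle(build_graph(chain[:3]))

print(" --> ".join(cycle))
print(without_printk)
```

In this sketch the cycle only closes because of the CPU3 edge. None of the
other three edges involves the console driver allocating anything by itself.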

> 
> So again, crippling the MM code just because of lockdep false positives
> or a broken console driver sounds like the wrong way to approach the
> problem.
> 
> > [  297.425964] -> #1 (&port_lock_key){-.-.}:
> > [  297.425967]        __lock_acquire+0x5b3/0xb40
> > [  297.425967]        lock_acquire+0x126/0x280
> > [  297.425968]        _raw_spin_lock_irqsave+0x3a/0x50
> > [  297.425969]        serial8250_console_write+0x3e4/0x450
> > [  297.425970]        univ8250_console_write+0x4b/0x60
> > [  297.425970]        console_unlock+0x501/0x750
> > [  297.425971]        vprintk_emit+0x10d/0x340
> > [  297.425972]        vprintk_default+0x1f/0x30
> > [  297.425972]        vprintk_func+0x44/0xd4
> > [  297.425973]        printk+0x9f/0xc5
> > [  297.425974]        register_console+0x39c/0x520
> > [  297.425975]        univ8250_console_init+0x23/0x2d
> > [  297.425975]        console_init+0x338/0x4cd
> > [  297.425976]        start_kernel+0x534/0x724
> > [  297.425977]        x86_64_start_reservations+0x24/0x26
> > [  297.425977]        x86_64_start_kernel+0xf4/0xfb
> > [  297.425978]        secondary_startup_64+0xb6/0xc0
> 
> This is early init code again, so the lockdep report sounds like a false
> positive to me.

This is just the tip of the iceberg, showing the lock dependency,

console_owner --> port_lock_key

which could easily happen anywhere with a simple printk().
