Re: Recursive/circular locking in serial8250_console_write/serial8250_do_startup

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, Aug 12, 2020 at 08:48:13AM -0700, Guenter Roeck wrote:
> Hi,
> 
> crbug.com/1114800 reports a hard lockup due to circular locking in the
> 8250 console driver. This is seen if CONFIG_PROVE_LOCKING is enabled.
> 
> Problem is as follows:
> - serial8250_do_startup() locks the serial (console) port.
> - serial8250_do_startup() then disables interrupts if interrupts are
>   shared, by calling disable_irq_nosync().
> - disable_irq_nosync() calls __irq_get_desc_lock() to lock the interrupt
>   descriptor.
> - __irq_get_desc_lock() calls lock_acquire()
> - If CONFIG_PROVE_LOCKING is enabled, validate_chain() and check_noncircular()
>   are called and identify a potential locking error.
> - This locking error is reported via printk, which ultimately calls
>   serial8250_console_write().
> - serial8250_console_write() tries to lock the serial console port.
>   Since it is already locked, the system hangs and ultimately reports
>   a hard lockup.
> 
> I understand we'll need to figure out and fix what lockdep complains about,
> and I am working on that. However, even if that is fixed, we'll need a
> solution for the recursive lock: Fixing the lockdep problem doesn't
> guarantee that a similar problem (or some other log message) won't be
> detected and reported sometime in the future while serial8250_do_startup()
> holds the console port lock.
> 
> Ideas, anyone ? Everything I came up with so far seems clumsy and hackish.
> 

Turns out the situation is a bit worse than I thought. disable_irq_nosync(),
when called from serial8250_do_startup(), locks the interrupt descriptor.
The order of locking is
	serial port lock
	  interrupt descriptor lock

At the same time, __setup_irq() locks the interrupt descriptor as well.
With the descriptor locked, it may report an error message using pr_err().
This in turn may call serial8250_console_write(), which will try to lock
the console serial port. The lock sequence is
	interrupt descriptor lock
	  serial port lock

I added the lockdep splat to the bug log at crbug.com/1114800.

Effectively, I think, this means we can't call disable_irq_nosync()
while holding a serial port lock, or at least not while holding a
serial port lock that is associated with a console.

The problem was introduced (or, rather, exposed) with upstream commit
7febbcbc48fc ("serial: 8250: Check UPF_IRQ_SHARED in advance").

Guenter



[Index of Archives]     [Kernel Newbies]     [Security]     [Netfilter]     [Bugtraq]     [Linux PPP]     [Linux FS]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Samba]     [Video 4 Linux]     [Linmodem]     [Device Mapper]     [Linux Kernel for ARM]

  Powered by Linux