Sebastian Andrzej Siewior <bigeasy@xxxxxxxxxxxxx> writes: > On 2018-07-09 11:55:08 [+0200], Esben Haabendal wrote: >> I am using the following patch. >> Not sure if it is worth proposing it for mainline inclusion, though. >> RFC: > > | ====================================================== > | WARNING: possible circular locking dependency detected > | 4.16.18-rt9+ #187 Not tainted > | ------------------------------------------------------ > | sshd/3205 is trying to acquire lock: > | (&buf->lock){+.+.}, at: [< (ptrval)>] flush_to_ldisc+0x1e/0xa0 > | > | but task is already holding lock: > | (&ldata->output_lock){+.+.}, at: [< (ptrval)>] n_tty_write+0x12a/0x480 > | > | which lock already depends on the new lock. > | > | > | the existing dependency chain (in reverse order) is: > | > | -> #2 (&ldata->output_lock){+.+.}: > | _mutex_lock+0x26/0x40 > | n_tty_write+0x12a/0x480 > | tty_write+0x1b3/0x320 > | redirected_tty_write+0x9a/0xb0 > | do_iter_write+0x159/0x1a0 > | vfs_writev+0x93/0x110 > | do_writev+0x5f/0xf0 > | SyS_writev+0xb/0x10 > | do_syscall_64+0x73/0x220 > | entry_SYSCALL_64_after_hwframe+0x42/0xb7 > | > | -> #1 (&tty->termios_rwsem){++++}: > | down_write+0x39/0x50 > | n_tty_flush_buffer+0x19/0xf0 > | tty_buffer_flush+0x71/0x90 > | tty_ldisc_flush+0x1d/0x40 > | tty_port_close_start.part.5+0xa0/0x1b0 > | tty_port_close+0x29/0x60 > | uart_close+0x26/0x70 > | tty_release+0xfc/0x4f0 > | __fput+0xf1/0x200 > | ____fput+0x9/0x10 > | task_work_run+0x8b/0xc0 > | exit_to_usermode_loop+0xbc/0xc0 > | do_syscall_64+0x21b/0x220 > | entry_SYSCALL_64_after_hwframe+0x42/0xb7 > | > | -> #0 (&buf->lock){+.+.}: > | lock_acquire+0x95/0x240 > | _mutex_lock+0x26/0x40 > | flush_to_ldisc+0x1e/0xa0 > | tty_flip_buffer_push+0x28/0x40 > | pty_write+0x4e/0x60 > | n_tty_write+0x1ae/0x480 > | tty_write+0x1b3/0x320 > | __vfs_write+0x35/0x160 > | vfs_write+0xc1/0x1c0 > | SyS_write+0x53/0xc0 > | do_syscall_64+0x73/0x220 > | entry_SYSCALL_64_after_hwframe+0x42/0xb7 > | > | other info that might help us debug this: > | > | Chain exists of: > | &buf->lock --> &tty->termios_rwsem --> &ldata->output_lock > | > | Possible unsafe locking scenario: > | > | CPU0 CPU1 > | ---- ---- > | lock(&ldata->output_lock); > | lock(&tty->termios_rwsem); > | lock(&ldata->output_lock); > | lock(&buf->lock); > | > | *** DEADLOCK *** > | > | 4 locks held by sshd/3205: > | #0: (&tty->ldisc_sem){++++}, at: [< (ptrval)>] ldsem_down_read+0x2d/0x40 > | #1: (&tty->atomic_write_lock){+.+.}, at: [< (ptrval)>] tty_write_lock+0x19/0x50 > | #2: (&o_tty->termios_rwsem/1){++++}, at: [< (ptrval)>] n_tty_write+0x9a/0x480 > | #3: (&ldata->output_lock){+.+.}, at: [< (ptrval)>] n_tty_write+0x12a/0x480 Yes, it seems like there is: 1. tty_buffer_flush(), which results in mutex_lock(&buf->lock); down_write(&tty->termios_rwsem); # in n_tty_flush_buffer() and with my patch: 2. n_tty_write(), which results in down_read(&tty->termios_rwsem); mutex_lock(&ldata->output_lock); mutex_lock(&buf->lock); # in flush_to_ldisc() (from tty_do_flip()) So both &buf->lock --> &tty->termios_rwsem and &tty->termios_rwsem --> &ldata->output_lock --> &buf->lock chains, which I guess is what lockdep refers to, and which indeed looks like a deadlock waiting to happen. But is the lock ordering in tty_buffer_flush() really needed? Could we move the if (ld && ld->ops->flush_buffer) ld->ops->flush_buffer(tty); lines out of the &buf->lock section? I don't think that any code outside of tty_buffer.c should touch the port->buf code, and thus should not need to hold the &buf->lock. /Esben -- To unsubscribe from this list: send the line "unsubscribe linux-rt-users" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html