> -----Original Message-----
> From: linux-kernel-owner@xxxxxxxxxxxxxxx
> [mailto:linux-kernel-owner@xxxxxxxxxxxxxxx] On Behalf Of Jiri Slaby
> Sent: April 17, 2019 16:59
> To: gregkh@xxxxxxxxxxxxxxxxxxx
> Cc: linux-serial@xxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx; Jiri Slaby
> <jslaby@xxxxxxx>; Li,Rongqing <lirongqing@xxxxxxxxx>; Wang,Li(ACG Cloud)
> <wangli39@xxxxxxxxx>; Zhang,Yu(ACG Cloud) <zhangyu31@xxxxxxxxx>;
> stable <stable@xxxxxxxxxxxxxxx>
> Subject: [PATCH] TTY: serial_core, add ->install
>
> We need to compute the uart state only on the first open. This is usually what is
> done in the ->install hook. serial_core used to do this in ->open on every open.
> So move it to ->install.
>
> As a side effect, it ensures the state is set properly in the window after
> tty_init_dev is called, but before uart_open. This fixes a bunch of races
> between tty_open and flush_to_ldisc we were dealing with recently.
>
> One of such bugs was attempted to fix in commit fedb5760648a (serial:
> fix race between flush_to_ldisc and tty_open), but it only took care of a couple
> of functions (uart_start and uart_unthrottle). I was able to reproduce the
> crash on a SLE system, but in uart_write_room which is also called from
> flush_to_ldisc via process_echoes. I was *unable* to reproduce the bug locally.
> It is due to having this patch in my queue since 2012!
>
> general protection fault: 0000 [#1] SMP KASAN PTI
> CPU: 1 PID: 5 Comm: kworker/u4:0 Tainted: G L
> 4.12.14-396-default #1 SLE15-SP1 (unreleased)
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
> rel-1.12.0-0-ga698c89-prebuilt.qemu.org 04/01/2014
> Workqueue: events_unbound flush_to_ldisc
> task: ffff8800427d8040 task.stack: ffff8800427f0000
> RIP: 0010:uart_write_room+0xc4/0x590
> RSP: 0018:ffff8800427f7088 EFLAGS: 00010202
> RAX: dffffc0000000000 RBX: 0000000000000000 RCX: 0000000000000000
> RDX: 000000000000002f RSI: 00000000000000ee RDI: ffff88003888bd90
> RBP: ffffffffb9545850 R08: 0000000000000001 R09: 0000000000000400
> R10: ffff8800427d825c R11: 000000000000006e R12: 1ffff100084fee12
> R13: ffffc900004c5000 R14: ffff88003888bb28 R15: 0000000000000178
> FS: 0000000000000000(0000) GS:ffff880043300000(0000)
> knlGS:0000000000000000
> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 0000561da0794148 CR3: 000000000ebf4000 CR4: 00000000000006e0
> Call Trace:
> tty_write_room+0x6d/0xc0
> __process_echoes+0x55/0x870
> n_tty_receive_buf_common+0x105e/0x26d0
> tty_ldisc_receive_buf+0xb7/0x1c0
> tty_port_default_receive_buf+0x107/0x180
> flush_to_ldisc+0x35d/0x5c0
> ...
>
> 0 in rbx means tty->driver_data is NULL in uart_write_room. 0x178 is tried to
> be dereferenced (0x178 >> 3 is 0x2f in rdx) at uart_write_room+0xc4. 0x178 is
> exactly (struct uart_state *)NULL->refcount used in uart_port_lock from
> uart_write_room.
>
> So revert the upstream commit here as my local patch should fix the whole
> family.
>
> Signed-off-by: Jiri Slaby <jslaby@xxxxxxx>
> Cc: Li RongQing <lirongqing@xxxxxxxxx>
> Cc: Wang Li <wangli39@xxxxxxxxx>
> Cc: Zhang Yu <zhangyu31@xxxxxxxxx>
> Cc: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx>
> Cc: stable <stable@xxxxxxxxxxxxxxx>
> ---
>
> ============================= NOTE =============================
>
> Could you test your use-case at Baidu, guys, please?
>

Sorry, we do not have an environment to test it. The problem happens when we upgrade the BMC.

-RongQing
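
For anyone following the register analysis in the quoted oops, the arithmetic can be checked with a small user-space sketch. It assumes the usual x86_64 KASAN mapping (shadow = (addr >> 3) + offset, with the offset being the value seen in RAX) and 48-bit canonical addresses. The helper names below are made up for illustration and are not kernel functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: x86_64 KASAN layout; the offset matches RAX in the oops. */
#define SHADOW_SCALE_SHIFT 3
#define SHADOW_OFFSET      0xdffffc0000000000ULL

/* Hypothetical helper: shadow byte address that KASAN checks for addr. */
static uint64_t kasan_shadow_addr(uint64_t addr)
{
	return (addr >> SHADOW_SCALE_SHIFT) + SHADOW_OFFSET;
}

/*
 * Hypothetical helper: true if addr is canonical with 48-bit virtual
 * addresses, i.e. bits 63..47 are all zero or all one.
 */
static bool is_canonical_48(uint64_t addr)
{
	uint64_t top = addr >> 47;

	return top == 0 || top == 0x1ffff;
}

/* Re-derives the values from the oops: R15 = 0x178, RDX = 0x2f. */
static void check_oops_values(void)
{
	uint64_t fault_addr = 0x178; /* refcount offset from a NULL uart_state */

	assert((fault_addr >> SHADOW_SCALE_SHIFT) == 0x2f);
	assert(kasan_shadow_addr(fault_addr) == 0xdffffc000000002fULL);
	/* The shadow address is non-canonical, so the access raises #GP. */
	assert(!is_canonical_48(kasan_shadow_addr(fault_addr)));
}
```

This would explain why the trace reports a general protection fault rather than a page fault at 0x178: the instrumented code faults on the shadow lookup (RAX + RDX) before the NULL-based access itself is attempted.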