> -----Original Message-----
> From: Greg KH [mailto:gregkh@xxxxxxxxxxxxxxxxxxx]
> Sent: January 30, 2019 21:17
> To: Li,Rongqing <lirongqing@xxxxxxxxx>
> Cc: jslaby@xxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx; gkohli@xxxxxxxxxxxxxx;
> linux-serial@xxxxxxxxxxxxxxx
> Subject: Re: RE: [PATCH][v4] tty: fix race between flush_to_ldisc and tty_open
>
> On Wed, Jan 30, 2019 at 12:48:42PM +0000, Li,Rongqing wrote:
> >
> >
> > > -----Original Message-----
> > > From: linux-kernel-owner@xxxxxxxxxxxxxxx
> > > [mailto:linux-kernel-owner@xxxxxxxxxxxxxxx] On Behalf Of Greg KH
> > > Sent: January 30, 2019 18:19
> > > To: Li,Rongqing <lirongqing@xxxxxxxxx>
> > > Cc: jslaby@xxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx;
> > > gkohli@xxxxxxxxxxxxxx
> > > Subject: Re: [PATCH][v4] tty: fix race between flush_to_ldisc and
> > > tty_open
> > >
> > > On Fri, Jan 18, 2019 at 05:27:17PM +0800, Li RongQing wrote:
> > > > There still is a race window after the commit b027e2298bd588
> > > > ("tty: fix data race between tty_init_dev and flush of buf"), and
> > > > we encountered this crash issue if receive_buf call comes before
> > > > tty initialization completes in n_tty_open and
> > > > tty->driver_data may be NULL.
> > > >
> > > > CPU0 CPU1
> > > > ---- ----
> > > > n_tty_open
> > > > tty_init_dev
> > > > tty_ldisc_unlock
> > > > schedule flush_to_ldisc
> > > > receive_buf
> > > > tty_port_default_receive_buf
> > > > tty_ldisc_receive_buf
> > > > n_tty_receive_buf_common
> > > > __receive_buf
> > > > uart_flush_chars
> > > > uart_start
> > > > /*tty->driver_data is NULL*/
> > > > tty->ops->open
> > > > /*init tty->driver_data*/
> > > >
> > > > it can be fixed by extending ldisc semaphore lock in tty_init_dev
> > > > to driver_data initialized completely after tty->ops->open(), but
> > > > this will lead to put lock on one function and unlock in some
> > > > other function, and hard to maintain, so fix this race only by
> > > > checking tty->driver_data when receiving, and return if
> > > > tty->driver_data is NULL
> > > >
> > > > Signed-off-by: Wang Li <wangli39@xxxxxxxxx>
> > > > Signed-off-by: Zhang Yu <zhangyu31@xxxxxxxxx>
> > > > Signed-off-by: Li RongQing <lirongqing@xxxxxxxxx>
> > > > ---
> > > > V4: add version information
> > > > V3: not used ldisc semaphore lock, only checking tty->driver_data
> > > > with NULL
> > > > V2: fix building error by EXPORT_SYMBOL tty_ldisc_unlock
> > > > V1: extend ldisc lock to protect that tty->driver_data is inited
> > > >
> > > > drivers/tty/tty_port.c | 3 +++
> > > > 1 file changed, 3 insertions(+)
> > > >
> > > > diff --git a/drivers/tty/tty_port.c b/drivers/tty/tty_port.c
> > > > index 044c3cbdcfa4..86d0bec38322 100644
> > > > --- a/drivers/tty/tty_port.c
> > > > +++ b/drivers/tty/tty_port.c
> > > > @@ -31,6 +31,9 @@ static int tty_port_default_receive_buf(struct tty_port *port,
> > > >          if (!tty)
> > > >                  return 0;
> > > >
> > > > +        if (!tty->driver_data)
> > > > +                return 0;
> > > > +
> > >
> > > How is this working? What is setting driver_data to NULL to "stop" this race?
> > >
> >
> >
> > if tty->driver_data is NULL and return, tty_port_default_receive_buf
> > will not step to uart_start which access tty->driver_data and trigger
> > panic before tty_open, so it can fix the system panic
> >
> > > There's no requirement that a tty driver set this field to NULL when it is "done"
> > > with the tty device, so I think you are just getting lucky in that
> > > your specific driver happens to be doing this.
> > >
> >
> > when tty_open is running, tty is allocated by kzalloc in tty_init_dev
> > which called by tty_open_by_driver, tty is inited to 0
> >
> > > What driver are you testing this against?
> > >
> >
> > 8250
>
> Ok, as this is specific to the uart core, how about this patch instead:
>
> diff --git a/drivers/tty/serial/serial_core.c b/drivers/tty/serial/serial_core.c
> index 5c01bb6d1c24..b56a6250df3f 100644
> --- a/drivers/tty/serial/serial_core.c
> +++ b/drivers/tty/serial/serial_core.c
> @@ -130,6 +130,9 @@ static void uart_start(struct tty_struct *tty)
>          struct uart_port *port;
>          unsigned long flags;
>
> +        if (!state)
> +                return;
> +
>          port = uart_port_lock(state, flags);
>          __uart_start(tty);
>          uart_port_unlock(port, flags);

If we move the check into uart_start, I am afraid it may not fully fix
this issue, since n_tty_receive_buf_common may call
n_tty_check_throttle/tty_unthrottle_safe, which may use tty->driver_data.
If the tty is not fully opened, I see no gain in stepping into more
functions.

Thanks,
-RongQing
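
The concern above can be illustrated with a minimal user-space sketch. This
is not kernel code: every toy_* name is a hypothetical stand-in, and the
sketch assumes (as suggested above, not verified here) that a throttle path
also reads tty->driver_data. It only compares where the NULL check sits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for struct tty_struct: only the field under discussion. */
struct toy_tty {
	void *driver_data;	/* stays NULL until the driver's open() has run */
};

/* Counts uses of driver_data while it is still NULL (a crash in the kernel). */
static int toy_null_uses;

/* Stand-in for uart_start() with the check proposed for serial_core.c. */
static void toy_uart_start(struct toy_tty *tty)
{
	if (!tty->driver_data)
		return;		/* guarded: nothing is touched */
}

/* Stand-in for a throttle path that is ASSUMED to read driver_data unchecked. */
static void toy_unthrottle(struct toy_tty *tty)
{
	if (!tty->driver_data)
		toy_null_uses++;	/* would be a NULL dereference */
}

/* Stand-in for the line discipline receive path, which reaches both users. */
static void toy_ldisc_receive(struct toy_tty *tty)
{
	toy_uart_start(tty);
	toy_unthrottle(tty);
}

/* Stand-in for the receive entry point; early_check models the tty_port.c patch. */
static int toy_port_receive(struct toy_tty *tty, int early_check)
{
	if (!tty)
		return 0;
	if (early_check && !tty->driver_data)
		return 0;	/* tty not fully opened: drop out before the ldisc */
	toy_ldisc_receive(tty);
	return 1;
}

/* Number of unguarded NULL uses seen for a tty that has not been opened yet. */
int toy_unguarded_uses(int early_check)
{
	struct toy_tty tty = { .driver_data = NULL };

	toy_null_uses = 0;
	toy_port_receive(&tty, early_check);
	return toy_null_uses;
}

/* Small demonstration: compare the two placements of the check. */
void toy_demo(void)
{
	assert(toy_unguarded_uses(1) == 0);
	printf("check only in uart_start: %d unguarded use(s)\n",
	       toy_unguarded_uses(0));
	printf("check at receive entry:   %d unguarded use(s)\n",
	       toy_unguarded_uses(1));
}
```

Under these assumptions, checking only in the uart_start stand-in leaves the
second user exposed, while checking at the receive entry point covers every
path below it.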