On Wed, Jan 30, 2019 at 12:48:42PM +0000, Li,Rongqing wrote:
>
>
> > -----Original Message-----
> > From: linux-kernel-owner@xxxxxxxxxxxxxxx
> > [mailto:linux-kernel-owner@xxxxxxxxxxxxxxx] On Behalf Of Greg KH
> > Sent: January 30, 2019 18:19
> > To: Li,Rongqing <lirongqing@xxxxxxxxx>
> > Cc: jslaby@xxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx; gkohli@xxxxxxxxxxxxxx
> > Subject: Re: [PATCH][v4] tty: fix race between flush_to_ldisc and tty_open
> >
> > On Fri, Jan 18, 2019 at 05:27:17PM +0800, Li RongQing wrote:
> > > There still is a race window after the commit b027e2298bd588
> > > ("tty: fix data race between tty_init_dev and flush of buf"), and we
> > > encountered this crash issue if receive_buf call comes before tty
> > > initialization completes in n_tty_open and
> > > tty->driver_data may be NULL.
> > >
> > > CPU0                                    CPU1
> > > ----                                    ----
> > >                                   n_tty_open
> > >                                    tty_init_dev
> > >                                      tty_ldisc_unlock
> > >                                        schedule flush_to_ldisc
> > >  receive_buf
> > >   tty_port_default_receive_buf
> > >    tty_ldisc_receive_buf
> > >     n_tty_receive_buf_common
> > >       __receive_buf
> > >        uart_flush_chars
> > >         uart_start
> > >         /*tty->driver_data is NULL*/
> > >                                    tty->ops->open
> > >                                    /*init tty->driver_data*/
> > >
> > > it can be fixed by extending ldisc semaphore lock in tty_init_dev to
> > > driver_data initialized completely after tty->ops->open(), but this
> > > will lead to put lock on one function and unlock in some other
> > > function, and hard to maintain, so fix this race only by checking
> > > tty->driver_data when receiving, and return if tty->driver_data
> > > is NULL
> > >
> > > Signed-off-by: Wang Li <wangli39@xxxxxxxxx>
> > > Signed-off-by: Zhang Yu <zhangyu31@xxxxxxxxx>
> > > Signed-off-by: Li RongQing <lirongqing@xxxxxxxxx>
> > > ---
> > > V4: add version information
> > > V3: not used ldisc semaphore lock, only checking tty->driver_data with
> > > NULL
> > > V2: fix building error by EXPORT_SYMBOL tty_ldisc_unlock
> > > V1: extend ldisc lock to protect that tty->driver_data is inited
> > >
> > >  drivers/tty/tty_port.c | 3 +++
> > >  1 file changed, 3 insertions(+)
> > >
> > > diff --git a/drivers/tty/tty_port.c b/drivers/tty/tty_port.c
> > > index 044c3cbdcfa4..86d0bec38322 100644
> > > --- a/drivers/tty/tty_port.c
> > > +++ b/drivers/tty/tty_port.c
> > > @@ -31,6 +31,9 @@ static int tty_port_default_receive_buf(struct tty_port *port,
> > >  	if (!tty)
> > >  		return 0;
> > >
> > > +	if (!tty->driver_data)
> > > +		return 0;
> > > +
> >
> > How is this working?  What is setting driver_data to NULL to "stop" this race?
> >
>
>
> if tty->driver_data is NULL and return, tty_port_default_receive_buf will not step to
> uart_start which access tty->driver_data and trigger panic before tty_open, so it can
> fix the system panic
>
> > There's no requirement that a tty driver set this field to NULL when it is "done"
> > with the tty device, so I think you are just getting lucky in that your specific
> > driver happens to be doing this.
> >
>
> when tty_open is running, tty is allocated by kzalloc in tty_init_dev which called
> by tty_open_by_driver, tty is inited to 0
>
> > What driver are you testing this against?
> >
>
> 8250

Ok, as this is specific to the uart core, how about this patch instead:


diff --git a/drivers/tty/serial/serial_core.c b/drivers/tty/serial/serial_core.c
index 5c01bb6d1c24..b56a6250df3f 100644
--- a/drivers/tty/serial/serial_core.c
+++ b/drivers/tty/serial/serial_core.c
@@ -130,6 +130,9 @@ static void uart_start(struct tty_struct *tty)
 	struct uart_port *port;
 	unsigned long flags;
 
+	if (!state)
+		return;
+
 	port = uart_port_lock(state, flags);
 	__uart_start(tty);
 	uart_port_unlock(port, flags);
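
To make the effect of such a guard concrete, here is a minimal userspace sketch of the ordering discussed in this thread. It is not kernel code: every `model_*` name is hypothetical, the structures are simplified stand-ins for `tty_struct` and `uart_state`, and the sequence runs single-threaded, so it only illustrates the "flush before open" ordering and says nothing about locking or memory ordering in the real driver.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures; not the real definitions. */
struct model_uart_state {
	int tx_started;		/* set once "transmission" has been kicked off */
};

struct model_tty {
	void *driver_data;	/* NULL until the driver's open() has run */
};

/* Models the state of a tty right after a zeroing allocation (kzalloc). */
static struct model_tty model_tty_alloc(void)
{
	struct model_tty tty = { .driver_data = NULL };
	return tty;
}

/* Models tty->ops->open(): the point at which driver_data becomes valid. */
static void model_open(struct model_tty *tty, struct model_uart_state *state)
{
	tty->driver_data = state;
}

/*
 * Models uart_start() with a NULL guard on the state pointer.
 * Returns 1 if transmission was started, 0 if the call was skipped.
 */
static int model_uart_start(struct model_tty *tty)
{
	struct model_uart_state *state = tty->driver_data;

	if (!state)	/* open() has not completed yet: nothing to start */
		return 0;

	state->tx_started = 1;
	return 1;
}

/* Walks through the ordering from the race diagram, single-threaded. */
static void model_demo(void)
{
	struct model_uart_state state = { 0 };
	struct model_tty tty = model_tty_alloc();

	assert(model_uart_start(&tty) == 0);	/* flush arrives before open() */
	model_open(&tty, &state);
	assert(model_uart_start(&tty) == 1);	/* flush arrives after open() */
}
```

Calling `model_demo()` exercises both cases. In this model the guard relies on the tty being zero-initialised at allocation, which matches the kzalloc point made earlier in the thread.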