On Wed, Nov 20, 2019 at 03:17:09PM +0000, Sudip Mukherjee wrote:
> There seems to be a race condition in tty drivers and I could see on
> many boot cycles a NULL pointer dereference as tty_init_dev() tries to
> do 'tty->port->itty = tty' even though tty->port is NULL.
> 'tty->port' will be set by the driver and if the driver has not yet done
> it before we open the tty device we can get to this situation. By adding
> some extra debug prints, I noticed that tty_port_link_device() is
> initialising 'driver->ports[index]' just few microseconds after I
> get the warning.
> So, add one retry so that tty_init_dev() will return -EAGAIN on its first
> try if 'tty->port' is not set yet, and then tty_open() will try to open
> it again.
>
> Signed-off-by: Sudip Mukherjee <sudipm.mukherjee@xxxxxxxxx>
> ---
>  drivers/tty/pty.c                   |  2 +-
>  drivers/tty/serdev/serdev-ttyport.c |  2 +-
>  drivers/tty/tty_io.c                | 20 ++++++++++++++------
>  include/linux/tty.h                 |  3 ++-
>  4 files changed, 18 insertions(+), 9 deletions(-)
>
> diff --git a/drivers/tty/pty.c b/drivers/tty/pty.c
> index 00099a8439d2..22e8c40d9f9c 100644
> --- a/drivers/tty/pty.c
> +++ b/drivers/tty/pty.c
> @@ -842,7 +842,7 @@ static int ptmx_open(struct inode *inode, struct file *filp)
>
>
>  	mutex_lock(&tty_mutex);
> -	tty = tty_init_dev(ptm_driver, index);
> +	tty = tty_init_dev(ptm_driver, index, 0);

Horrible naming scheme for this new "flag".  Look at that call here, can
you instantly tell what this call is doing with "0"?  I sure can not :(

If you really want to do this, you make a different function,
tty_init_dev_retry() and then have that pass in a retry flag in the tty
core, so that any users always know what they are doing here.

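A rough sketch of the split suggested above, written as a userspace mock-up so it can be compiled on its own. Everything here is an assumption for illustration: the cut-down struct layouts, the helper name tty_init_dev_common(), the extra "err" out-parameter (the kernel function returns an ERR_PTR instead), and the -ENODEV result for the non-retry case. It is not the code in drivers/tty/tty_io.c.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define MOCK_NR_TTYS 4	/* hypothetical table size, for the mock only */

struct tty_struct;

/* Cut-down stand-ins for the kernel structures; NOT the real layouts. */
struct tty_port {
	struct tty_struct *itty;
};

struct tty_struct {
	struct tty_port *port;
};

struct tty_driver {
	struct tty_port *ports[MOCK_NR_TTYS];	/* filled in later by the driver */
	struct tty_struct ttys[MOCK_NR_TTYS];
};

/*
 * Hypothetical core helper: the only place that ever sees the boolean.
 * Callers outside the tty core never pass a bare "0" or "1".
 */
static struct tty_struct *tty_init_dev_common(struct tty_driver *driver,
					       int idx, bool retry, int *err)
{
	struct tty_struct *tty;

	assert(driver != NULL && err != NULL);

	if (idx < 0 || idx >= MOCK_NR_TTYS) {
		*err = -ENODEV;
		return NULL;
	}

	tty = &driver->ttys[idx];
	tty->port = driver->ports[idx];
	if (tty->port == NULL) {
		/* Port not linked yet: ask for a retry, or fail (assumed code). */
		*err = retry ? -EAGAIN : -ENODEV;
		return NULL;
	}

	tty->port->itty = tty;
	*err = 0;
	return tty;
}

/* Existing entry point keeps its meaning: no retry. */
struct tty_struct *tty_init_dev(struct tty_driver *driver, int idx, int *err)
{
	return tty_init_dev_common(driver, idx, false, err);
}

/* The name at the call site says what happens when the port is missing. */
struct tty_struct *tty_init_dev_retry(struct tty_driver *driver, int idx,
				      int *err)
{
	return tty_init_dev_common(driver, idx, true, err);
}
```

With this shape, a call such as tty_init_dev_retry(ptm_driver, index, &err) is readable without looking up what a trailing "0" means. It does nothing about why the port is missing in the first place.
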
But, this really feels like a race in the code somewhere:

> --- a/drivers/tty/tty_io.c
> +++ b/drivers/tty/tty_io.c
> @@ -1295,6 +1295,7 @@ static int tty_reopen(struct tty_struct *tty)
>   * tty_init_dev - initialise a tty device
>   * @driver: tty driver we are opening a device on
>   * @idx: device index
> + * @retry: retry count if driver has not set tty->port yet

Why would tty->port not be set up already?  The caller has control over
this, what is not happening correctly to cause this?

thanks,

greg k-h