On 08/29/2018, 04:40 PM, Jiri Slaby wrote: > On 08/29/2018, 04:23 AM, Dmitry Safonov wrote: >> tty_ldisc_reinit() doesn't race with neither tty_ldisc_hangup() >> nor set_ldisc() nor tty_ldisc_release() as they use tty lock. >> But it races with anyone who expects line discipline to be the same >> after hoding read semaphore in tty_ldisc_ref(). >> >> We've seen the following crash on v4.9.108 stable: >> >> BUG: unable to handle kernel paging request at 0000000000002260 >> IP: [..] n_tty_receive_buf_common+0x5f/0x86d >> Workqueue: events_unbound flush_to_ldisc >> Call Trace: >> [..] n_tty_receive_buf2 >> [..] tty_ldisc_receive_buf >> [..] flush_to_ldisc >> [..] process_one_work >> [..] worker_thread >> [..] kthread >> [..] ret_from_fork >> >> I think, tty_ldisc_reinit() should be called with ldisc_sem hold for >> writing, which will protect any reader against line discipline changes. >> >> Note: I failed to reproduce the described crash, so obiviously can't >> guarantee that this is the place where line discipline was switched. ... > So what about: > tty_ldisc_lock(tty, MAX_SCHEDULE_TIMEOUT); > if (!tty->ldisc) > ret = tty_ldisc_reinit(tty, tty->termios.c_line); > tty_ldisc_unlock(tty); > > if (!ret) > tty->count++; > > return ret; I forgot to add that I debugged a different NULL ptr deref to this very same root cause today (set_termios called with NULL tty->disc_data). So really, tty_reinit's ldisc change must be protected by the ldisc_sem, otherwise other threads will see tty->ldisc, but not tty->disc_data. thanks, -- js suse labs