On 2022/05/04 13:58, Tetsuo Handa wrote: > On 2022/05/04 12:09, Eric Dumazet wrote: >> This exit() handler _has_ to remove all known listeners, and >> definitely cancel work queues (synchronous operation) >> before the actual "struct net" free can happen later. > > But in your report, rds_tcp_tune() is called from rds_tcp_conn_path_connect() from > rds_connect_worker() via "struct rds_connection"->cp_conn_w work. I can see that > rds_tcp_kill_sock() calls rds_tcp_listen_stop(lsock, &rtn->rds_tcp_accept_w), and > rds_tcp_listen_stop() calls flush_workqueue(rds_wq) and flush_work(&rtn->rds_tcp_accept_w). > > But I can't see how rds_tcp_exit_net() synchronously cancels all works associated > with "struct rds_conn_path". > > struct rds_conn_path { > struct delayed_work cp_send_w; > struct delayed_work cp_recv_w; > struct delayed_work cp_conn_w; > struct work_struct cp_down_w; > } > > These works are queued to rds_wq, but flush_workqueue() waits for completion only > if already queued. What if timer for queue_delayed_work() has not expired, or was > about to call queue_delayed_work() ? Is flush_workqueue(rds_wq) sufficient? rds_tcp_tune+0x5a0/0x5f0 net/rds/tcp.c:503 rds_tcp_conn_path_connect+0x489/0x880 net/rds/tcp_connect.c:127 rds_connect_worker+0x1a5/0x2c0 net/rds/threads.c:176 process_one_work+0x996/0x1610 kernel/workqueue.c:2289 rds_tcp_conn_path_connect is referenced by "struct rds_transport rds_tcp_transport"->conn_path_connect. It is invoked by ret = conn->c_trans->conn_path_connect(cp) in rds_connect_worker(). rds_connect_worker is referenced by "struct rds_conn_path"->cp_conn_w via INIT_DELAYED_WORK(). queue_delayed_work(rds_wq, &cp->cp_conn_w, *) is called by rds_queue_reconnect() or rds_conn_path_connect_if_down(). If rds_conn_path_connect_if_down() were called from rds_tcp_accept_one_path() from rds_tcp_accept_one(), rds_tcp_tune() from rds_tcp_accept_one() was already called before rds_tcp_tune() from rds_tcp_conn_path_connect() is called. Since the addition on 0 was not reported at rds_tcp_tune() from rds_tcp_accept_one(), what Eric is reporting cannot be from rds_tcp_accept_one() from rds_tcp_accept_worker(). Despite rds_tcp_kill_sock() sets rtn->rds_tcp_listen_sock = NULL and waits for rds_tcp_accept_one() from rds_tcp_accept_worker() to complete using flush_workqueue(rds_wq), what Eric is reporting is different from what syzbot+694120e1002c117747ed was reporting. > > Anyway, if rds_tcp_kill_sock() can somehow guarantee that all works are completed > or cancelled, the fix would look like something below? I think it is OK to apply below diff in order to avoid addition on 0 problem, but it is not proven that kmem_cache_free() is not yet called. What should we do? > > net/rds/tcp.c | 11 ++++++++--- > net/rds/tcp.h | 2 +- > net/rds/tcp_connect.c | 5 ++++- > net/rds/tcp_listen.c | 5 ++++- > 4 files changed, 17 insertions(+), 6 deletions(-) >