On 2018/07/18 22:04, Dmitry Vyukov wrote:
> On Wed, Jul 18, 2018 at 2:53 PM, Tetsuo Handa
> <penguin-kernel@xxxxxxxxxxxxxxxxxxx> wrote:
>> On 2018/07/18 20:41, Dmitry Vyukov wrote:
>>> This seems to be related to 9p. After rerunning the log I got:
>>>
>>> root@syzkaller:~# ps afxu | grep syz
>>> root 18253 0.0 0.0 0 0 ttyS0 Zl 10:16 0:00 \_ [syz-executor] <defunct>
>>> root@syzkaller:~# cat /proc/18253/task/*/stack
>>> [<0>] p9_client_rpc+0x3a2/0x1400
>>> [<0>] p9_client_flush+0x134/0x2a0
>>> [<0>] p9_client_rpc+0x122c/0x1400
>>> [<0>] p9_client_create+0xc56/0x16af
>>> [<0>] v9fs_session_init+0x21a/0x1a80
>>> [<0>] v9fs_mount+0x7c/0x900
>>> [<0>] mount_fs+0xae/0x328
>>> [<0>] vfs_kern_mount.part.34+0xdc/0x4e0
>>> [<0>] do_mount+0x581/0x30e0
>>> [<0>] ksys_mount+0x12d/0x140
>>> [<0>] __x64_sys_mount+0xbe/0x150
>>> [<0>] do_syscall_64+0x1b9/0x820
>>> [<0>] entry_SYSCALL_64_after_hwframe+0x49/0xbe
>>> [<0>] 0xffffffffffffffff
>>>
>>> There is a bunch of hangs in 9p, so let's do:
>>>
>>> #syz dup: INFO: task hung in flush_work
>>>
>> Then, is dumping all threads when khungtaskd fires a candidate
>> for CONFIG_DEBUG_AID_FOR_SYZBOT=y path?
>
> Perhaps would be useful. But maybe only tasks that are blocked for
> more than timeout/2? and/or unkillable tasks? killable tasks are not a
> problem.

TASK_KILLABLE waiters are not reported by khungtaskd, are they?

	/* use "==" to skip the TASK_KILLABLE tasks waiting on NFS */
	if (t->state == TASK_UNINTERRUPTIBLE)
		check_hung_task(t, timeout);

And TASK_KILLABLE waiters can become a problem because

>
> Btw, I see that p9_client_rpc uses wait_event_killable, why wasn't it
> killed along with the whole process?
>

wait_event_killable() would return -ERESTARTSYS if it got SIGKILL.
But if (c->status == Connected) && (type == P9_TFLUSH) is also true,
it ignores SIGKILL by retrying the loop...

again:
	err = wait_event_killable(*req->wq, req->status >= REQ_STATUS_RCVD);

	if ((err == -ERESTARTSYS) && (c->status == Connected)
	    && (type == P9_TFLUSH)) {
		sigpending = 1;
		clear_thread_flag(TIF_SIGPENDING);
		goto again;
	}

I wish they didn't ignore SIGKILL (e.g. by offloading operations to a
kernel thread).
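
To make the control flow easier to follow, here is a rough user-space
model of that retry. It is only a sketch, not the kernel code: every
model_* name and MODEL_ERESTARTSYS are hypothetical stand-ins, and it
assumes that SIGKILL is already pending when the wait starts and that no
further signal arrives after TIF_SIGPENDING is cleared.

```c
#include <assert.h>   /* only needed for the checks in the usage note */
#include <stdbool.h>

/* ERESTARTSYS is kernel-internal and not exported to user space, so the
 * model carries its own copy of the value. */
#define MODEL_ERESTARTSYS 512

enum model_outcome {
	MODEL_REPLIED,      /* the wait ended because the reply arrived */
	MODEL_INTERRUPTED,  /* -ERESTARTSYS was handed back to the caller */
	MODEL_STUCK         /* nothing will ever wake the waiter */
};

/*
 * Models the "again:" loop quoted above.
 *   connected       stands in for (c->status == Connected)
 *   is_tflush       stands in for (type == P9_TFLUSH)
 *   sigkill_pending stands in for TIF_SIGPENDING being set by SIGKILL
 *   reply_arrives   whether the server ever answers the request
 */
enum model_outcome model_rpc_wait(bool connected, bool is_tflush,
				  bool sigkill_pending, bool reply_arrives)
{
	for (;;) {
		int err;

		/* Stand-in for wait_event_killable(). */
		if (sigkill_pending)
			err = -MODEL_ERESTARTSYS;
		else if (reply_arrives)
			err = 0;
		else
			return MODEL_STUCK;	/* would sleep forever */

		if (err == -MODEL_ERESTARTSYS && connected && is_tflush) {
			/* Models clear_thread_flag(TIF_SIGPENDING); goto again; */
			sigkill_pending = false;
			continue;
		}

		return err == 0 ? MODEL_REPLIED : MODEL_INTERRUPTED;
	}
}
```

In this model, model_rpc_wait(true, true, true, false) returns
MODEL_STUCK: the flush request swallows the SIGKILL and then waits for a
reply that never comes, which is the situation the stack trace above
suggests. With is_tflush false the same call returns MODEL_INTERRUPTED.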