When testing the kernel with syzkaller, a task hangs in exit_aio:

INFO: task syz-executor.2:22372 blocked for more than 140 seconds.
      Not tainted 4.19.25 #5
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
syz-executor.2 D27568 22372 2689 0x90000002
Call Trace:
 schedule+0x7c/0x1a0 kernel/sched/core.c:3516
 schedule_timeout+0x4cf/0x1140 kernel/time/timer.c:1780
 do_wait_for_common kernel/sched/completion.c:83 [inline]
 __wait_for_common kernel/sched/completion.c:104 [inline]
 wait_for_common kernel/sched/completion.c:115 [inline]
 wait_for_completion+0x27a/0x3d0 kernel/sched/completion.c:136
 exit_aio+0x2ef/0x3c0 fs/aio.c:881
 __mmput kernel/fork.c:1047 [inline]
 mmput+0xb4/0x460 kernel/fork.c:1071
 exit_mm kernel/exit.c:545 [inline]
 do_exit+0x79c/0x2cb0 kernel/exit.c:862
 do_group_exit+0x106/0x2f0 kernel/exit.c:978
 get_signal+0x325/0x1c80 kernel/signal.c:2572
 do_signal+0x94/0x16a0 arch/x86/kernel/signal.c:816
 exit_to_usermode_loop+0x108/0x1d0 arch/x86/entry/common.c:162
 prepare_exit_to_usermode arch/x86/entry/common.c:197 [inline]
 syscall_return_slowpath arch/x86/entry/common.c:268 [inline]
 do_syscall_64+0x461/0x580 arch/x86/entry/common.c:293

The reason is as follows:
io_submit_one-->aio_get_req-->percpu_ref_get(&ctx->reqs)
                           -->req->ki_refcnt=0
             -->aio_poll-->req->ki_refcnt=2
                        -->aio_poll_complete-->aio_complete-->iocb_put
             -->iocb_put

iocb_put decrements req->ki_refcnt, so the number of calls to
aio_poll_complete must match the number of calls to iocb_put.
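As a rough illustration of the counting above, here is a minimal
userspace sketch (not kernel code). The model_* names are hypothetical
and only the counter is modelled. It assumes, based on the chain above,
that the final iocb_put is what lets the request go away.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct aio_kiocb; only the counter is modelled. */
struct model_req {
	int ki_refcnt;	/* mirrors req->ki_refcnt */
	bool freed;	/* set once the last reference is dropped */
};

/* Models iocb_put: drop one reference, "free" the request on the last one. */
static void model_iocb_put(struct model_req *req)
{
	assert(req->ki_refcnt > 0);
	if (--req->ki_refcnt == 0)
		req->freed = true;
}

/* Models aio_poll_complete-->aio_complete-->iocb_put. */
static void model_poll_complete(struct model_req *req)
{
	model_iocb_put(req);
}

/*
 * Models one poll submission. "completions" is how many times
 * aio_poll_complete ends up being called for the request.
 * Returns true if the request was freed.
 */
static bool model_submit_poll(int completions)
{
	struct model_req req = { .ki_refcnt = 0, .freed = false }; /* aio_get_req */

	req.ki_refcnt = 2;			/* aio_poll */
	for (int i = 0; i < completions; i++)
		model_poll_complete(&req);
	model_iocb_put(&req);			/* final put in io_submit_one */

	return req.freed;
}
```

In this model, model_submit_poll(1) frees the request, while
model_submit_poll(0) leaves one reference behind, which is the kind of
leftover reference that would keep exit_aio waiting.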
Unfortunately, in some cases the counts do not match, as follows:

CPU 0                              CPU 1
aio_poll-->vfs_poll
                                   eventfd_write-->spin_lock_irq(lock)
                                                -->..-->aio_poll_wake
                                                -->spin_unlock_irq(lock)
        -->spin_lock(lock)
        -->if (req->woken) mask = 0;
        -->aio_poll_complete is not called
        -->iocb_put

aio_poll_wake
   req->woken = true;
   if (mask) {
        if (!(mask & req->events))
             return 0;  -->aio_poll_complete is not called here either

The relevant call chains are:
vfs_poll-->eventfd_poll-->poll_wait-->aio_poll_queue_proc(add aio_poll_wake to req->head)
eventfd_write-->wake_up_locked_poll-->__wake_up_common-->curr->func
                                                      -->aio_poll_wake

This patch fixes that by setting req->woken only when aio_poll_wake
actually completes the request or schedules the work. While at it, fix
the error handling path in aio_poll, which returned apt.error without
calling iocb_put.

Signed-off-by: zhengbin <zhengbin13@xxxxxxxxxx>
---
 fs/aio.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/fs/aio.c b/fs/aio.c
index 38b741a..3bf8cdc 100644
--- a/fs/aio.c
+++ b/fs/aio.c
@@ -1668,8 +1668,6 @@ static int aio_poll_wake(struct wait_queue_entry *wait, unsigned mode, int sync,
 	__poll_t mask = key_to_poll(key);
 	unsigned long flags;
 
-	req->woken = true;
-
 	/* for instances that support it check for an event match first: */
 	if (mask) {
 		if (!(mask & req->events))
@@ -1687,12 +1685,14 @@ static int aio_poll_wake(struct wait_queue_entry *wait, unsigned mode, int sync,
 
 			list_del_init(&req->wait.entry);
 			aio_poll_complete(iocb, mask);
+			req->woken = true;
 			return 1;
 		}
 	}
 
 	list_del_init(&req->wait.entry);
 	schedule_work(&req->work);
+	req->woken = true;
 	return 1;
 }
 
@@ -1777,8 +1777,10 @@ static ssize_t aio_poll(struct aio_kiocb *aiocb, const struct iocb *iocb)
 	spin_unlock_irq(&ctx->ctx_lock);
 
 out:
-	if (unlikely(apt.error))
+	if (unlikely(apt.error)) {
+		iocb_put(aiocb);
 		return apt.error;
+	}
 
 	if (mask)
 		aio_poll_complete(aiocb, mask);
-- 
2.7.4