On Mon, Jan 4, 2021 at 12:22 PM Hillf Danton <hdanton@xxxxxxxx> wrote:
> It is now updated.

Hello Hillf,

Thanks for the new diff. I tested it by applying the diff on top of 5.10.4
and running the original reproducer, and the issue still persists.

root@syzkaller:~# [ 242.925799] INFO: task repro:416 blocked for more than 120 seconds.
[ 242.928095] Not tainted 5.10.4+ #12
[ 242.929034] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 242.930825] task:repro state:D stack: 0 pid: 416 ppid: 415 flags:0x00000004
[ 242.933404] Call Trace:
[ 242.934365] __schedule+0x28d/0x7e0
[ 242.935199] ? __percpu_counter_sum+0x75/0x90
[ 242.936265] schedule+0x4f/0xc0
[ 242.937159] __io_uring_task_cancel+0xc0/0xf0
[ 242.938340] ? wait_woken+0x80/0x80
[ 242.939380] bprm_execve+0x67/0x8a0
[ 242.940163] do_execveat_common+0x1d2/0x220
[ 242.941090] __x64_sys_execveat+0x5d/0x70
[ 242.942056] do_syscall_64+0x38/0x90
[ 242.943088] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 242.944511] RIP: 0033:0x7fd0b781e469
[ 242.945422] RSP: 002b:00007fffda20e9c8 EFLAGS: 00000246 ORIG_RAX: 0000000000000142
[ 242.947289] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fd0b781e469
[ 242.949031] RDX: 0000000000000000 RSI: 0000000020000180 RDI: 00000000ffffffff
[ 242.950683] RBP: 00007fffda20e9e0 R08: 0000000000000000 R09: 00007fffda20e9e0
[ 242.952450] R10: 0000000000000000 R11: 0000000000000246 R12: 0000556068200bf0
[ 242.954045] R13: 00007fffda20eb00 R14: 0000000000000000 R15: 0000000000000000

linux git:(b1313fe517ca) git diff
diff --git a/fs/io_uring.c b/fs/io_uring.c
index 0fcd065baa76..e0c5424e28b1 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -1867,8 +1867,7 @@ static void __io_free_req(struct io_kiocb *req)
 	io_dismantle_req(req);
 
 	percpu_counter_dec(&tctx->inflight);
-	if (atomic_read(&tctx->in_idle))
-		wake_up(&tctx->wait);
+	wake_up(&tctx->wait);
 	put_task_struct(req->task);
 
 	if (likely(!io_is_fallback_req(req)))
@@ -8853,12 +8852,11 @@ void __io_uring_task_cancel(void)
 		 * If we've seen completions, retry. This avoids a race where
 		 * a completion comes in before we did prepare_to_wait().
 		 */
-		if (inflight != tctx_inflight(tctx))
-			continue;
-		schedule();
+		if (inflight == tctx_inflight(tctx))
+			schedule();
+		finish_wait(&tctx->wait, &wait);
 	} while (1);
 
-	finish_wait(&tctx->wait, &wait);
 	atomic_dec(&tctx->in_idle);
 }