On 17/12/2020 03:17, Hillf Danton wrote:
> Wed, 16 Dec 2020 13:14:11 -0800
>> syzbot found the following issue on:
>>
>> HEAD commit:    7b1b868e Merge tag 'for-linus' of git://git.kernel.org/pub..
>> git tree:       upstream
>> console output: https://syzkaller.appspot.com/x/log.txt?x=1156046b500000
>> kernel config:  https://syzkaller.appspot.com/x/.config?x=3416bb960d5c705d
>> dashboard link: https://syzkaller.appspot.com/bug?extid=c9937dfb2303a5f18640
>> compiler:       gcc (GCC) 10.1.0-syz 20200507
>> syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=1407c287500000
>> C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=10ed5f07500000
>>
>> The issue was bisected to:
>>
>> commit 4d004099a668c41522242aa146a38cc4eb59cb1e
>> Author: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
>> Date:   Fri Oct 2 09:04:21 2020 +0000
>>
>>     lockdep: Fix lockdep recursion
>>
>> bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=14e9d433500000
>> final oops:     https://syzkaller.appspot.com/x/report.txt?x=16e9d433500000
>> console output: https://syzkaller.appspot.com/x/log.txt?x=12e9d433500000
>>
>> IMPORTANT: if you fix the issue, please add the following tag to the commit:
>> Reported-by: syzbot+c9937dfb2303a5f18640@xxxxxxxxxxxxxxxxxxxxxxxxx
>> Fixes: 4d004099a668 ("lockdep: Fix lockdep recursion")
>>
>> RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 0000000000441309
>> RDX: 0000000000000002 RSI: 00000000200000c0 RDI: 0000000000003ad1
>> RBP: 000000000000f2ae R08: 0000000000000002 R09: 00000000004002c8
>> R10: 0000000000000000 R11: 0000000000000246 R12: 00000000004021d0
>> R13: 0000000000402260 R14: 0000000000000000 R15: 0000000000000000
>> ------------[ cut here ]------------
>> percpu_ref_kill_and_confirm called more than once on io_ring_ctx_ref_free!
>> WARNING: CPU: 0 PID: 8476 at lib/percpu-refcount.c:382 percpu_ref_kill_and_confirm+0x126/0x180 lib/percpu-refcount.c:382
>> Modules linked in:
>> CPU: 0 PID: 8476 Comm: syz-executor389 Not tainted 5.10.0-rc7-syzkaller #0
>> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
>> RIP: 0010:percpu_ref_kill_and_confirm+0x126/0x180 lib/percpu-refcount.c:382
>> Code: 5d 08 48 8d 7b 08 48 89 fa 48 c1 ea 03 80 3c 02 00 75 5d 48 8b 53 08 48 c7 c6 00 4b 9d 89 48 c7 c7 60 4a 9d 89 e8 c6 97 f6 04 <0f> 0b 48 b8 00 00 00 00 00 fc ff df 48 89 ea 48 c1 ea 03 80 3c 02
>> RSP: 0018:ffffc9000b94fe10 EFLAGS: 00010086
>> RAX: 0000000000000000 RBX: ffff888011da4580 RCX: 0000000000000000
>> RDX: ffff88801fe84ec0 RSI: ffffffff8158c835 RDI: fffff52001729fb4
>> RBP: ffff88801539f000 R08: 0000000000000001 R09: ffff8880b9e2011b
>> R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000293
>> R13: 0000000000000000 R14: 0000000000000000 R15: ffff88802de28758
>> FS:  00000000014ab880(0000) GS:ffff8880b9e00000(0000) knlGS:0000000000000000
>> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> CR2: 00007f2a7046b000 CR3: 0000000023368000 CR4: 0000000000350ef0
>> Call Trace:
>>  percpu_ref_kill include/linux/percpu-refcount.h:149 [inline]
>>  io_ring_ctx_wait_and_kill+0x2b/0x450 fs/io_uring.c:8382
>>  io_uring_release+0x3e/0x50 fs/io_uring.c:8420
>>  __fput+0x285/0x920 fs/file_table.c:281
>>  task_work_run+0xdd/0x190 kernel/task_work.c:151
>>  tracehook_notify_resume include/linux/tracehook.h:188 [inline]
>>  exit_to_user_mode_loop kernel/entry/common.c:164 [inline]
>>  exit_to_user_mode_prepare+0x17e/0x1a0 kernel/entry/common.c:191
>>  syscall_exit_to_user_mode+0x38/0x260 kernel/entry/common.c:266
>>  entry_SYSCALL_64_after_hwframe+0x44/0xa9
>> RIP: 0033:0x441309
>> Code: e8 5c ae 02 00 48 83 c4 18 c3 0f 1f 80 00 00 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 3b 0a fc ff c3 66 2e 0f 1f 84 00 00 00 00
>> RSP: 002b:00007ffed6545d38 EFLAGS: 00000246 ORIG_RAX: 00000000000001a9
>> RAX: fffffffffffffff4 RBX: 0000000000000000 RCX: 0000000000441309
>> RDX: 0000000000000002 RSI: 00000000200000c0 RDI: 0000000000003ad1
>> RBP: 000000000000f2ae R08: 0000000000000002 R09: 00000000004002c8
>> R10: 0000000000000000 R11: 0000000000000246 R12: 00000000004021d0
>> R13: 0000000000402260 R14: 0000000000000000 R15: 0000000000000000
>
> Avoid double kill by checking ctx health.

Let's focus on _how_ it can happen. Refs may be killed by
__io_uring_register(), but it holds a ref to the file, so
io_uring_release() -> io_ring_ctx_wait_and_kill() shouldn't even happen.
And once io_ring_ctx_wait_and_kill() is called, fdget() for that ring
wouldn't be possible. That's if no other bugs are present.

We want to solve the problem rather than mask it. So, can it really happen,
or is the problem somewhere else?

>
> --- a/fs/io_uring.c
> +++ b/fs/io_uring.c
> @@ -8379,7 +8379,13 @@ static void io_ring_exit_work(struct wor
>  static void io_ring_ctx_wait_and_kill(struct io_ring_ctx *ctx)
>  {
>  	mutex_lock(&ctx->uring_lock);
> -	percpu_ref_kill(&ctx->refs);
> +	/*
> +	 * try to avoid killing dead ctx, see the comments for dropping
> +	 * ring mutex in __io_uring_register()
> +	 */
> +	if (!percpu_ref_is_dying(&ctx->refs))
> +		percpu_ref_kill(&ctx->refs);
> +
>  	mutex_unlock(&ctx->uring_lock);
>
>  	io_kill_timeouts(ctx, NULL);
>

-- 
Pavel Begunkov