On Thu, Feb 17, 2022 at 03:39:48PM +0800, Jason Wang wrote: > On Thu, Feb 17, 2022 at 3:36 PM Michael S. Tsirkin <mst@xxxxxxxxxx> wrote: > > > > On Thu, Feb 17, 2022 at 03:34:13PM +0800, Jason Wang wrote: > > > On Thu, Feb 17, 2022 at 10:01 AM syzbot > > > <syzbot+1e3ea63db39f2b4440e0@xxxxxxxxxxxxxxxxxxxxxxxxx> wrote: > > > > > > > > Hello, > > > > > > > > syzbot found the following issue on: > > > > > > > > HEAD commit: c5d9ae265b10 Merge tag 'for-linus' of git://git.kernel.org.. > > > > git tree: upstream > > > > console output: https://syzkaller.appspot.com/x/log.txt?x=132e687c700000 > > > > kernel config: https://syzkaller.appspot.com/x/.config?x=a78b064590b9f912 > > > > dashboard link: https://syzkaller.appspot.com/bug?extid=1e3ea63db39f2b4440e0 > > > > compiler: gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2 > > > > > > > > Unfortunately, I don't have any reproducer for this issue yet. > > > > > > > > IMPORTANT: if you fix the issue, please add the following tag to the commit: > > > > Reported-by: syzbot+1e3ea63db39f2b4440e0@xxxxxxxxxxxxxxxxxxxxxxxxx > > > > > > > > WARNING: CPU: 1 PID: 10828 at drivers/vhost/vhost.c:715 vhost_dev_cleanup+0x8b8/0xbc0 drivers/vhost/vhost.c:715 > > > > Modules linked in: > > > > CPU: 0 PID: 10828 Comm: syz-executor.0 Not tainted 5.17.0-rc4-syzkaller-00051-gc5d9ae265b10 #0 > > > > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 > > > > RIP: 0010:vhost_dev_cleanup+0x8b8/0xbc0 drivers/vhost/vhost.c:715 > > > > > > Probably a hint that we are missing a flush. > > > > > > Looking at vhost_vsock_stop() that is called by vhost_vsock_dev_release(): > > > > > > static int vhost_vsock_stop(struct vhost_vsock *vsock) > > > { > > > size_t i; > > > int ret; > > > > > > mutex_lock(&vsock->dev.mutex); > > > > > > ret = vhost_dev_check_owner(&vsock->dev); > > > if (ret) > > > goto err; > > > > > > Where it could fail so the device is not actually stopped. > > > > > > I wonder if this is something related. > > > > > > Thanks > > > > > > But then if that is not the owner then no work should be running, right? > > Could it be a buggy user space that passes the fd to another process > and changes the owner just before the mutex_lock() above? > > Thanks Maybe, but can you be a bit more explicit? what is the set of conditions you see that can lead to this? > > > > > > > > > > > Code: c7 85 90 01 00 00 00 00 00 00 e8 53 6e a2 fa 48 89 ef 48 83 c4 20 5b 5d 41 5c 41 5d 41 5e 41 5f e9 7d d6 ff ff e8 38 6e a2 fa <0f> 0b e9 46 ff ff ff 48 8b 7c 24 10 e8 87 00 ea fa e9 75 f7 ff ff > > > > RSP: 0018:ffffc9000fe6fa18 EFLAGS: 00010293 > > > > RAX: 0000000000000000 RBX: dffffc0000000000 RCX: 0000000000000000 > > > > RDX: ffff888021b63a00 RSI: ffffffff86d66fe8 RDI: ffff88801cc200b0 > > > > RBP: ffff88801cc20000 R08: 0000000000000001 R09: 0000000000000001 > > > > R10: ffffffff817f1e08 R11: 0000000000000000 R12: ffff88801cc200d0 > > > > R13: ffff88801cc20120 R14: ffff88801cc200d0 R15: 0000000000000002 > > > > FS: 0000000000000000(0000) GS:ffff8880b9c00000(0000) knlGS:0000000000000000 > > > > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > > > > CR2: 0000001b2de25000 CR3: 000000004c9cd000 CR4: 00000000003506f0 > > > > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 > > > > DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 > > > > Call Trace: > > > > <TASK> > > > > vhost_vsock_dev_release+0x36e/0x4b0 drivers/vhost/vsock.c:771 > > > > __fput+0x286/0x9f0 fs/file_table.c:313 > > > > task_work_run+0xdd/0x1a0 kernel/task_work.c:164 > > > > exit_task_work include/linux/task_work.h:32 [inline] > > > > do_exit+0xb29/0x2a30 kernel/exit.c:806 > > > > do_group_exit+0xd2/0x2f0 kernel/exit.c:935 > > > > get_signal+0x45a/0x2490 kernel/signal.c:2863 > > > > arch_do_signal_or_restart+0x2a9/0x1c40 arch/x86/kernel/signal.c:868 > > > > handle_signal_work kernel/entry/common.c:148 [inline] > > > > exit_to_user_mode_loop kernel/entry/common.c:172 [inline] > > > > exit_to_user_mode_prepare+0x17d/0x290 kernel/entry/common.c:207 > > > > __syscall_exit_to_user_mode_work kernel/entry/common.c:289 [inline] > > > > syscall_exit_to_user_mode+0x19/0x60 kernel/entry/common.c:300 > > > > do_syscall_64+0x42/0xb0 arch/x86/entry/common.c:86 > > > > entry_SYSCALL_64_after_hwframe+0x44/0xae > > > > RIP: 0033:0x7f4027a46481 > > > > Code: Unable to access opcode bytes at RIP 0x7f4027a46457. > > > > RSP: 002b:00007f402808ba68 EFLAGS: 00000206 ORIG_RAX: 0000000000000038 > > > > RAX: fffffffffffffffc RBX: 00007f402622e700 RCX: 00007f4027a46481 > > > > RDX: 00007f402622e9d0 RSI: 00007f402622e2f0 RDI: 00000000003d0f00 > > > > RBP: 00007f402808bcb0 R08: 00007f402622e700 R09: 00007f402622e700 > > > > R10: 00007f402622e9d0 R11: 0000000000000206 R12: 00007f402808bb1e > > > > R13: 00007f402808bb1f R14: 00007f402622e300 R15: 0000000000022000 > > > > </TASK> > > > > > > > > > > > > --- > > > > This report is generated by a bot. It may contain errors. > > > > See https://goo.gl/tpsmEJ for more information about syzbot. > > > > syzbot engineers can be reached at syzkaller@xxxxxxxxxxxxxxxx. > > > > > > > > syzbot will keep track of this issue. See: > > > > https://goo.gl/tpsmEJ#status for how to communicate with syzbot. > > > > > >